Optimal Near-End Speech Intelligibility Improvement Using CLPSO-Based Voice Transformation in Realistic Noisy Environments

Biswas R.; Nathwani K.

Full metadata record

DC Field	Value	Language
dc.contributor.author	Biswas R.	en_US
dc.contributor.author	Nathwani K.	en_US
dc.date.accessioned	2023-11-30T08:33:16Z	-
dc.date.available	2023-11-30T08:33:16Z	-
dc.date.issued	2022	-
dc.identifier.issn	0278081X	-
dc.identifier.other	EID(2-s2.0-85134676071)	-
dc.identifier.uri	https://dx.doi.org/10.1007/s00034-022-02106-3	-
dc.identifier.uri	http://localhost:8080/xmlui/handle/123456789/433	-
dc.description.abstract	The proposed work aims to improve the near-end intelligibility of speech at very low signal-to-noise ratios (SNRs). Unlike existing intelligibility improvement methods, the proposed approach does not require prior noise statistics. To this end, the shaping parameters of the voice transformation function (VTF) are optimized in a comprehensive learning particle swarm optimization (CLPSO) framework. These parameters govern a combined modification comprising formant shifting, nonuniform time scaling, smoothing, and energy redistribution. The optimal parameters of the combined modification are obtained by jointly maximizing the short-time objective intelligibility (STOI), perceptual evaluation of speech quality (PESQ), and signal-to-distortion ratio (SDR) metrics, which together serve as the cost function in CLPSO. The result is an improvement in intelligibility significantly higher than that obtained by applying these modifications individually, while preserving quality. In addition, Gaussian process regression is employed to estimate the shaping parameters of the VTF at arbitrary SNRs other than those used during CLPSO training. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.	en_US
dc.language.iso	en	en_US
dc.publisher	Birkhauser	en_US
dc.source	Circuits, Systems, and Signal Processing	en_US
dc.subject	CLPSO	en_US
dc.subject	PESQ	en_US
dc.subject	SDR	en_US
dc.subject	Speech intelligibility	en_US
dc.subject	STOI	en_US
dc.title	Optimal Near-End Speech Intelligibility Improvement Using CLPSO-Based Voice Transformation in Realistic Noisy Environments	en_US
dc.type	Journal Article	en_US
Appears in Collections:	Journal Article