Please use this identifier to cite or link to this item: http://10.10.120.238:8080/xmlui/handle/123456789/433
Title: Optimal Near-End Speech Intelligibility Improvement Using CLPSO-Based Voice Transformation in Realistic Noisy Environments
Authors: Biswas R.
Nathwani K.
Keywords: CLPSO
PESQ
SDR
Speech intelligibility
STOI
Issue Date: 2022
Publisher: Birkhäuser
Abstract: This work aims to improve the near-end intelligibility of speech at very low signal-to-noise ratios (SNRs). Unlike existing intelligibility improvement methods, the proposed approach does not require prior knowledge of the noise statistics. To this end, the shaping parameters of a voice transformation function (VTF) are optimized within a comprehensive learning particle swarm optimization (CLPSO) framework. These parameters govern a combined modification comprising formant shifting, nonuniform time scaling, smoothing, and energy redistribution. The optimal parameters are obtained by jointly maximizing three metrics that together form the CLPSO cost function: short-time objective intelligibility (STOI), perceptual evaluation of speech quality (PESQ), and signal-to-distortion ratio (SDR). The resulting intelligibility improvement is significantly higher than that obtained by applying the individual modifications separately, while speech quality is preserved. As a side result, Gaussian process regression is employed to estimate the VTF shaping parameters at arbitrary SNRs other than those used during CLPSO training. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
URI: https://dx.doi.org/10.1007/s00034-022-02106-3
ISSN: 0278-081X
Appears in Collections: Journal Article