Please use this identifier to cite or link to this item: http://10.10.120.238:8080/xmlui/handle/123456789/132
Title: Transfer learning for speech intelligibility improvement in noisy environments
Authors: Biswas R.
Nathwani K.
Abrol V.
Keywords: Formant ratio
Formant shifting
Pitch ratio
Speech intelligibility
Transfer learning
Issue Date: 2021
Publisher: International Speech Communication Association
Abstract: A recent work [1] proposed a novel Delta Function-based Formant Shifting approach for speech intelligibility improvement. The underlying principle is to dynamically relocate the formants, based on where they occur in the spectrum, away from the region occupied by noise. How the formants are shifted is determined by the parameters of the Delta Function, whose optimal values are found using Comprehensive Learning Particle Swarm Optimization (CLPSO). Although effective, CLPSO is computationally expensive enough to overshadow its merits in intelligibility improvement. To address this, the current work aims to improve the Short-Time Objective Intelligibility (STOI) of (target) speech using a Delta Function generated from a different (source) language. This transfer learning is based on the relative positioning of the formant frequencies and pitch values of the source and target language datasets. The proposed approach is demonstrated and validated through experiments with three different languages under varying noise conditions. Copyright © 2021 ISCA.
URI: https://dx.doi.org/10.21437/Interspeech.2021-150
ISBN: 978-1713836902
ISSN: 2308-457X
Appears in Collections: Conference Paper
