Presentation | 2005/12/14 Evaluating Rapid Unsupervised Speaker Adaptation Using Linear Interpolation of HMM-Sufficient Statistics Randy GOMEZ, Tomoki TODA, Hiroshi SARUWATARI, Kiyohiro SHIKANO |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Speaker adaptation techniques minimize the effect of speaker variability. In real-time applications, speaker adaptation must be carried out rapidly using a minimum amount of adaptation data. We propose to improve unsupervised speaker adaptation based on HMM-Sufficient Statistics using linear interpolation. This adaptation technique uses a single arbitrary utterance to select the Sufficient Statistics of the N-best speakers, which provide the data for adaptation. Reducing the number of selected N-best speakers reduces adaptation time, but it also degrades recognition performance because too little data remains to adapt the model robustly. We introduce linear interpolation of the global HMM-Sufficient Statistics to offset the negative effect of reducing N-best. In our experiment, this reduced the adaptation time by 50%, from 10 sec to 5 sec, without degrading recognition performance. Furthermore, we compared our method with Vocal Tract Length Normalization (VTLN), Maximum A Posteriori (MAP) and Maximum Likelihood Linear Regression (MLLR). Moreover, we tested the performance of our approach in office, car, crowd and booth noise environments at SNRs of 10 dB, 15 dB, 20 dB and 25 dB. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Rapid Unsupervised Speaker Adaptation / Noise Robustness / HMM Sufficient Statistics |
Paper # | NLC2005-59,SP2005-92 |
Date of Issue |
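
The interpolation step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `SuffStats` layout (occupancy, first-order and second-order sums for one diagonal-covariance Gaussian), the weight `lam`, and all function names are assumptions made for illustration, and the abstract does not specify how the global statistics are scaled or how the weight is chosen.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SuffStats:
    """Hypothetical sufficient statistics of one diagonal-covariance Gaussian."""
    occ: float           # accumulated occupancy count
    first: List[float]   # accumulated sum of gamma * x, per dimension
    second: List[float]  # accumulated sum of gamma * x^2, per dimension


def add_stats(items: List[SuffStats]) -> SuffStats:
    """Sum the statistics of the selected N-best speakers."""
    dim = len(items[0].first)
    return SuffStats(
        occ=sum(s.occ for s in items),
        first=[sum(s.first[d] for s in items) for d in range(dim)],
        second=[sum(s.second[d] for s in items) for d in range(dim)],
    )


def interpolate(selected: SuffStats, global_stats: SuffStats, lam: float) -> SuffStats:
    """Linear interpolation: lam * selected + (1 - lam) * global (lam is assumed)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")

    def mix(a: float, b: float) -> float:
        return lam * a + (1.0 - lam) * b

    return SuffStats(
        occ=mix(selected.occ, global_stats.occ),
        first=[mix(a, b) for a, b in zip(selected.first, global_stats.first)],
        second=[mix(a, b) for a, b in zip(selected.second, global_stats.second)],
    )


def estimate_gaussian(stats: SuffStats) -> Tuple[List[float], List[float]]:
    """Re-estimate mean and variance from the (interpolated) statistics."""
    mean = [f / stats.occ for f in stats.first]
    var = [s / stats.occ - m * m for s, m in zip(stats.second, mean)]
    return mean, var


# Toy one-dimensional example with two selected speakers and made-up numbers.
n_best = [SuffStats(2.0, [4.0], [10.0]), SuffStats(2.0, [8.0], [34.0])]
global_stats = SuffStats(4.0, [4.0], [12.0])
adapted = interpolate(add_stats(n_best), global_stats, lam=0.5)
mean, var = estimate_gaussian(adapted)
```

With fewer selected speakers the summed statistics rest on less data, so in this sketch a smaller `lam` leans more on the global statistics; how the trade-off is actually set in the paper is not stated in the abstract.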
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2005/12/14 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Evaluating Rapid Unsupervised Speaker Adaptation Using Linear Interpolation of HMM-Sufficient Statistics |
Sub Title (in English) | |
Keyword(1) | Rapid Unsupervised Speaker Adaptation |
Keyword(2) | Noise Robustness |
Keyword(3) | HMM Sufficient Statistics |
1st Author's Name | Randy GOMEZ |
1st Author's Affiliation | |
2nd Author's Name | Tomoki TODA |
2nd Author's Affiliation | |
3rd Author's Name | Hiroshi SARUWATARI |
3rd Author's Affiliation | |
4th Author's Name | Kiyohiro SHIKANO |
4th Author's Affiliation | |
Date | 2005/12/14 |
Paper # | NLC2005-59,SP2005-92 |
Volume (vol) | vol.105 |
Number (no) | 493 |
Page | pp.- |
#Pages | 6 |
Date of Issue |