Presentation 2005/12/14
Evaluating Rapid Unsupervised Speaker Adaptation Using Linear Interpolation of HMM-Sufficient Statistics
Randy GOMEZ, Tomoki TODA, Hiroshi SARUWATARI, Kiyohiro SHIKANO
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Speaker adaptation techniques minimize the effect of speaker variability. In real-time applications, speaker adaptation must be carried out rapidly using a minimal amount of adaptation data. We propose to improve unsupervised speaker adaptation based on HMM-Sufficient Statistics using linear interpolation. This adaptation technique uses a single arbitrary utterance to select the N-best speakers' Sufficient Statistics, which provide the data for adaptation. Reducing the number of selected N-best speakers reduces the adaptation time. However, recognition performance degrades because the remaining data is insufficient to adapt the model robustly. We introduce linear interpolation of the global HMM-Sufficient Statistics to offset the negative effect of reducing N-best. In our experiment, we reduced the adaptation time by 50%, from 10 sec to 5 sec, without degrading recognition performance. Furthermore, we compared our method with Vocal Tract Length Normalization (VTLN), Maximum A Posteriori (MAP) and Maximum Likelihood Linear Regression (MLLR). Moreover, we tested the performance of our approach in office, car, crowd and booth noise environments at SNRs of 10 dB, 15 dB, 20 dB and 25 dB.
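The interpolation idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual formula, so everything here is an assumption made for illustration: the statistics are reduced to an occupancy count and a first-order sum for a single Gaussian, the interpolation weight `weight` is a hypothetical parameter, and the function names are invented.

```python
# Illustrative sketch only: not the authors' implementation.
# Assumption: each set of sufficient statistics for one Gaussian is a dict
# with "occ" (occupancy count) and "sum" (first-order sum of observations).


def accumulate(stats_list):
    """Sum the sufficient statistics of the selected N-best speakers."""
    occ = sum(s["occ"] for s in stats_list)
    first = [sum(vals) for vals in zip(*(s["sum"] for s in stats_list))]
    return {"occ": occ, "sum": first}


def interpolate_stats(nbest_stats, global_stats, weight):
    """Linearly interpolate N-best statistics with global statistics.

    weight (hypothetical parameter) is the share given to the N-best
    statistics; 1.0 - weight goes to the global statistics.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    occ = weight * nbest_stats["occ"] + (1.0 - weight) * global_stats["occ"]
    first = [
        weight * n + (1.0 - weight) * g
        for n, g in zip(nbest_stats["sum"], global_stats["sum"])
    ]
    return {"occ": occ, "sum": first}


def mean_from_stats(stats):
    """Re-estimate the Gaussian mean from the (interpolated) statistics."""
    return [x / stats["occ"] for x in stats["sum"]]


if __name__ == "__main__":
    # Two selected speakers (toy numbers) and a global statistics set.
    nbest = accumulate([
        {"occ": 2.0, "sum": [2.0, 4.0]},
        {"occ": 2.0, "sum": [6.0, 8.0]},
    ])
    global_stats = {"occ": 4.0, "sum": [0.0, 4.0]}
    mixed = interpolate_stats(nbest, global_stats, weight=0.5)
    print(mean_from_stats(mixed))
```

With fewer selected speakers the N-best statistics become sparser, and in this sketch the global term keeps the re-estimated mean from being driven by too little data. How the weight is chosen, and whether second-order statistics are treated the same way, is not stated in the abstract.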
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Rapid Unsupervised Speaker Adaptation / Noise Robustness / HMM Sufficient Statistics
Paper # NLC2005-59,SP2005-92
Date of Issue

Conference Information
Committee NLC
Conference Date 2005/12/14 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Evaluating Rapid Unsupervised Speaker Adaptation Using Linear Interpolation of HMM-Sufficient Statistics
Sub Title (in English)
Keyword(1) Rapid Unsupervised Speaker Adaptation
Keyword(2) Noise Robustness
Keyword(3) HMM Sufficient Statistics
1st Author's Name Randy GOMEZ
1st Author's Affiliation
2nd Author's Name Tomoki TODA
2nd Author's Affiliation
3rd Author's Name Hiroshi SARUWATARI
3rd Author's Affiliation
4th Author's Name Kiyohiro SHIKANO
4th Author's Affiliation
Date 2005/12/14
Paper # NLC2005-59,SP2005-92
Volume (vol) vol.105
Number (no) 493
Page pp.-
#Pages 6
Date of Issue