Presentation | 2004/12/14 Unsupervised Speaker Adaptation Based on HMM Sufficient Statistics Using Multiple Acoustic Models Under Noisy Environment Randy GOMEZ, Akinobu LEE, Hiroshi SARUWATARI, Kiyohiro SHIKANO |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Speaker adaptation in speech recognition is necessary to achieve high accuracy across a wide variety of speakers. At the same time, a class-dependent (CD) acoustic model for a specific gender/age class can yield better accuracy than a single speaker-independent (SI) model. In this research, we extend unsupervised speaker adaptation based on HMM Sufficient Statistics (HMM-SS) to multiple databases and multiple initial models, given a wide variety of speech databases. In contrast to the conventional approach, which uses only a single SI model as the base model, the proposed method uses multiple CD models to improve the performance of the initial model before adaptation. A speaker's class is estimated from the N-best neighbor speakers by Gaussian Mixture Models (GMM) during speaker selection, and the corresponding CD model is adopted as the base model. Unsupervised speaker adaptation is then performed by constructing an HMM from the HMM-SS of the selected speakers. Experiments were carried out on two databases, namely adults and senior people from JNAS, and testing was performed under noisy conditions (office, crowd, booth, and car noise) at 20 dB SNR. Recognition results show that the proposed method based on multiple models outperforms the conventional approach. Moreover, a comparison with Maximum Likelihood Linear Regression (MLLR) adaptation using 10 supervised utterances confirms that our method performs better with only a single input utterance. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Unsupervised Adaptation / Noise Robustness / HMM Sufficient Statistics |
Paper # | NLC2004-75,SP2004-115 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2004/12/14 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Unsupervised Speaker Adaptation Based on HMM Sufficient Statistics Using Multiple Acoustic Models Under Noisy Environment |
Sub Title (in English) | |
Keyword(1) | Unsupervised Adaptation |
Keyword(2) | Noise Robustness |
Keyword(3) | HMM Sufficient Statistics |
1st Author's Name | Randy GOMEZ |
1st Author's Affiliation | |
2nd Author's Name | Akinobu LEE |
2nd Author's Affiliation | |
3rd Author's Name | Hiroshi SARUWATARI |
3rd Author's Affiliation | |
4th Author's Name | Kiyohiro SHIKANO |
4th Author's Affiliation | |
Date | 2004/12/14 |
Paper # | NLC2004-75,SP2004-115 |
Volume (vol) | vol.104 |
Number (no) | 542 |
Page | pp.- |
#Pages | 6 |
Date of Issue |