Unsupervised Speaker Adaptation Based on Sufficient Statistics Using Multiple Models in Noisy Environments (Poster Session) (6th Spoken Language Symposium)

Akinobu LEE; Hiroshi SARUWATARI; Kiyohiro SHIKANO

Presentation	2004/12/14 Unsupervised Speaker Adaptation Based on HMM Sufficient Statistics Using Multiple Acoustic Models Under Noisy Environment Randy GOMEZ, Akinobu LEE, Hiroshi SARUWATARI, Kiyohiro SHIKANO
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	Speaker adaptation is necessary in speech recognition to achieve high accuracy across a wide variety of speakers. In addition, using a class-dependent (CD) acoustic model for a specific gender/age class can yield better accuracy than a single speaker-independent (SI) model. In this research, we extend unsupervised speaker adaptation based on HMM Sufficient Statistics (HMM-SS) to multiple databases and multiple initial models, given a wide variety of speech databases. In contrast to the conventional approach, which uses only a single SI model as the base model, the proposed method uses multiple CD models to raise the performance of the initial model before adaptation. During speaker selection, the speaker's class is estimated from the N-best neighbor speakers chosen by Gaussian Mixture Models (GMMs), and the corresponding CD model is adopted as the base model. Unsupervised speaker adaptation is then performed by constructing an HMM from the HMM-SS of the selected speakers. Experiments were carried out on two databases, namely adults and senior people from JNAS, and testing was performed under noisy conditions (office, crowd, booth, and car noise) at 20 dB SNR. Recognition results show that the proposed method based on multiple models outperforms the conventional approach. Moreover, a comparison with Maximum Likelihood Linear Regression (MLLR) adaptation using 10 supervised utterances confirms that our method performs better with only a single input utterance.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Unsupervised Adaptation / Noise Robustness / HMM Sufficient Statistics
Paper #	NLC2004-75,SP2004-115
Date of Issue
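
The selection and pooling steps outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that per-speaker GMM log-likelihoods for the input utterance are already available, that the class is decided by a simple majority vote among the N-best speakers (the abstract does not state the decision rule), and that each speaker's sufficient statistics for one Gaussian are stored as an occupancy count plus a first-order sum. All function and variable names are hypothetical.

```python
from collections import Counter


def select_n_best(gmm_scores, n):
    """Return the n speaker IDs with the highest GMM log-likelihood."""
    ranked = sorted(gmm_scores, key=gmm_scores.get, reverse=True)
    return ranked[:n]


def estimate_class(selected, speaker_class):
    """Pick the class by majority vote among the selected speakers.

    The voting rule is an assumption made for illustration.
    On a tie, the class seen first among the selected speakers wins.
    """
    votes = Counter(speaker_class[s] for s in selected)
    return votes.most_common(1)[0][0]


def pooled_mean(selected, stats):
    """Re-estimate one Gaussian mean from pooled sufficient statistics.

    stats[s] is (occupancy, first_order_sum), where first_order_sum is a
    list with one entry per feature dimension.
    """
    total_occ = sum(stats[s][0] for s in selected)
    dim = len(stats[selected[0]][1])
    total_sum = [sum(stats[s][1][d] for s in selected) for d in range(dim)]
    return [value / total_occ for value in total_sum]
```

In use, `select_n_best` would be called once per input utterance, `estimate_class` would choose which CD model serves as the base model, and `pooled_mean` would be applied to every Gaussian of that model using only the statistics of the selected speakers. Variances and mixture weights would need the corresponding second-order and occupancy statistics, which this sketch omits.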

Conference Information
Committee	SP
Conference Date	2004/12/14 (1 day)
Place (in Japanese)	(See Japanese page)
Place (in English)
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To	Speech (SP)
Language	ENG
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Unsupervised Speaker Adaptation Based on HMM Sufficient Statistics Using Multiple Acoustic Models Under Noisy Environment
Sub Title (in English)
Keyword(1)	Unsupervised Adaptation
Keyword(2)	Noise Robustness
Keyword(3)	HMM Sufficient Statistics
1st Author's Name	Randy GOMEZ
1st Author's Affiliation
2nd Author's Name	Akinobu LEE
2nd Author's Affiliation
3rd Author's Name	Hiroshi SARUWATARI
3rd Author's Affiliation
4th Author's Name	Kiyohiro SHIKANO
4th Author's Affiliation
Date	2004/12/14
Paper #	NLC2004-75,SP2004-115
Volume (vol)	vol.104
Number (no)	542
Page	pp.-
#Pages	6
Date of Issue