Presentation 1997/1/16
N-best based unsupervised speaker adaptation
Tomoko Matsui, Naoki Hashimoto, Tatsuo Matsuoka, Sadaoki Furui
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper proposes an N-best-based instantaneous speaker adaptation method. The method is effective even for difficult speakers, whose decodings with speaker-independent (SI) models are error-prone and for whom speaker adaptation is most needed. It finds the combination of HMM parameters and word sequence that maximizes the likelihood of the input speech through adaptation. Because attempting speaker adaptation for every possible word sequence is too costly, the N-best paradigm of multiple-pass search strategies is used to generate likely sequences, so as to reduce the search space without losing the correct sequence. In addition, this paper introduces smoothed estimation and utterance verification into the N-best-based method. Smoothed estimation improves performance for difficult speakers, and utterance verification reduces the amount of calculation required. Moreover, to find an effective model transformation for speaker adaptation, we compare the performance of two methods: one in which a mixture-mean bias is estimated by maximum a posteriori (MAP) estimation, and one in which a mixture-mean linear-regression matrix is estimated by maximum likelihood linear regression (MLLR).
Keyword(in Japanese) (See Japanese page)
Keyword(in English) speech recognition / speaker adaptation / N-best / unsupervised adaptation / instantaneous adaptation
Paper # SP96-92
Date of Issue
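
The joint search described in the abstract (adapt the model to each N-best hypothesis, then keep the hypothesis and parameters with the highest likelihood) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes one scalar frame per word, unit-variance Gaussians, and a single shared mean bias, and every name in it (`estimate_bias`, `adapt_and_rescore`, `word_means`, `tau`) is hypothetical. The `tau` term is only a guess at what a smoothed estimate might look like: it shrinks the bias toward zero.

```python
import math


def estimate_bias(frames, means, tau=0.0):
    """Shared mean bias for one hypothesis.

    tau = 0 gives the maximum-likelihood bias; tau > 0 shrinks the
    bias toward zero (an assumed, MAP-style form of smoothing).
    """
    residual = sum(x - m for x, m in zip(frames, means))
    return residual / (len(frames) + tau)


def log_likelihood(frames, means, bias, var=1.0):
    """Log-likelihood of the frames under bias-shifted Gaussian means."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - m - bias) ** 2 / var)
        for x, m in zip(frames, means)
    )


def adapt_and_rescore(frames, nbest, word_means, tau=0.0):
    """Adapt to each N-best hypothesis and return the best triple.

    Returns (score, words, bias) for the hypothesis whose adapted
    model gives the highest likelihood of the input frames.
    """
    best = None
    for words in nbest:
        means = [word_means[w] for w in words]
        bias = estimate_bias(frames, means, tau)
        score = log_likelihood(frames, means, bias)
        if best is None or score > best[0]:
            best = (score, words, bias)
    return best


# Toy data: the speaker-independent means, and an utterance of
# "a b c" whose frames are all shifted by +1 by the speaker.
word_means = {"a": 0.0, "b": 2.0, "c": 5.0}
frames = [1.0, 3.0, 6.0]
nbest = [("a", "b", "b"), ("a", "b", "c"), ("b", "b", "c")]

score, words, bias = adapt_and_rescore(frames, nbest, word_means)
print(words, bias)
```

In this toy case only the hypothesis `("a", "b", "c")` can be explained by a single bias with no leftover error, so it is selected together with its bias. Restricting the loop to the N-best list is what keeps the cost bounded: adaptation is run N times rather than once per possible word sequence.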

Conference Information
Committee SP
Conference Date 1997/1/16 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) N-best based unsupervised speaker adaptation
Sub Title (in English)
Keyword(1) speech recognition
Keyword(2) speaker adaptation
Keyword(3) N-best
Keyword(4) unsupervised adaptation
Keyword(5) instantaneous adaptation
1st Author's Name Tomoko Matsui
1st Author's Affiliation NTT Human Interface Laboratories
2nd Author's Name Naoki Hashimoto
2nd Author's Affiliation Tokyo Institute of Technology
3rd Author's Name Tatsuo Matsuoka
3rd Author's Affiliation NTT Human Interface Laboratories
4th Author's Name Sadaoki Furui
4th Author's Affiliation NTT Human Interface Laboratories / Tokyo Institute of Technology
Date 1997/1/16
Paper # SP96-92
Volume (vol) vol.96
Number (no) 448
Page pp.-
#Pages 8
Date of Issue