Presentation | 1997/1/16 N-best based unsupervised speaker adaptation Tomoko Matsui, Naoki Hashimoto, Tatsuo Matsuoka, Sadaoki Furui |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper proposes an N-best-based instantaneous speaker adaptation method. The method is effective even for difficult speakers, whose decodings with speaker-independent (SI) models are error-prone and for whom speaker adaptation is most needed. It finds the combination of HMM parameters and word sequence that maximizes the likelihood of the input speech through adaptation. Because attempting speaker adaptation for all possible word sequences is too costly, the N-best paradigm of multiple-pass search strategies is used to calculate likely sequences, reducing the search space without losing the correct sequence. In addition, this paper introduces smoothed estimation and utterance verification into the N-best-based method. Smoothed estimation improves performance for difficult speakers, and utterance verification reduces the required amount of calculation. Moreover, to find an effective model transformation for speaker adaptation, we compare the performance of two methods: one estimates a mixture-mean bias by maximum a posteriori (MAP) estimation, and the other estimates a mixture-mean linear-regression matrix by maximum likelihood linear regression (MLLR). |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | speech recognition / speaker adaptation / N-best / unsupervised adaptation / instantaneous adaptation |
Paper # | SP96-92 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 1997/1/16 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | N-best based unsupervised speaker adaptation |
Sub Title (in English) | |
Keyword(1) | speech recognition |
Keyword(2) | speaker adaptation |
Keyword(3) | N-best |
Keyword(4) | unsupervised adaptation |
Keyword(5) | instantaneous adaptation |
1st Author's Name | Tomoko Matsui |
1st Author's Affiliation | NTT Human Interface Laboratories |
2nd Author's Name | Naoki Hashimoto |
2nd Author's Affiliation | Tokyo Institute of Technology |
3rd Author's Name | Tatsuo Matsuoka |
3rd Author's Affiliation | NTT Human Interface Laboratories |
4th Author's Name | Sadaoki Furui |
4th Author's Affiliation | NTT Human Interface Laboratories / Tokyo Institute of Technology |
Date | 1997/1/16 |
Paper # | SP96-92 |
Volume (vol) | vol.96 |
Number (no) | 448 |
Page | pp.- |
#Pages | 8 |
Date of Issue |