Presentation | 2002/12/13 Speaker Indexing based on Speaker Model Selection and Automatic Speech Recognition in Discussions Masafumi NISHIDA, Yuya AKITA, Tatsuya KAWAHARA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper addresses unsupervised speaker indexing for discussion audio archives. In discussions, the speaker changes frequently, so utterances are very short and vary widely in duration, which causes significant problems for conventional methods such as model adaptation and Variance-BIC (Bayesian information criterion) methods. We propose a flexible framework that selects an optimal speaker model (GMM or VQ) based on the BIC according to the duration of utterances. When the speech segment is short, the simple and robust VQ-based method is expected to be chosen, while a GMM can be reliably trained for long segments. For a discussion archive, it is demonstrated that the proposed method achieves higher indexing performance than conventional methods. The speaker index is useful for speaker adaptation of the acoustic model, which improves the performance of automatic speech recognition. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Speech recognition / Speaker recognition / Discussions / Unsupervised speaker indexing / Model selection / Bayesian information criterion |
Paper # | NLC2002-80 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2002/12/13 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Speaker Indexing based on Speaker Model Selection and Automatic Speech Recognition in Discussions |
Sub Title (in English) | |
Keyword(1) | Speech recognition |
Keyword(2) | Speaker recognition |
Keyword(3) | Discussions |
Keyword(4) | Unsupervised speaker indexing |
Keyword(5) | Model selection |
Keyword(6) | Bayesian information criterion |
1st Author's Name | Masafumi NISHIDA |
1st Author's Affiliation | PRESTO, Japan Science and Technology corporation (JST) |
2nd Author's Name | Yuya AKITA |
2nd Author's Affiliation | PRESTO, Japan Science and Technology corporation (JST):School of Informatics, Kyoto University |
3rd Author's Name | Tatsuya KAWAHARA |
3rd Author's Affiliation | PRESTO, Japan Science and Technology corporation (JST):School of Informatics, Kyoto University |
Date | 2002/12/13 |
Paper # | NLC2002-80 |
Volume (vol) | vol.102 |
Number (no) | 528 |
Page | pp.- |
#Pages | 6 |
Date of Issue |