Presentation 2002/12/13
Speaker Indexing based on Speaker Model Selection and Automatic Speech Recognition in Discussions
Masafumi NISHIDA, Yuya AKITA, Tatsuya KAWAHARA,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper addresses unsupervised speaker indexing for discussion audio archives. In discussions, speakers change frequently, so utterances are very short and vary widely in duration, which causes significant problems for conventional methods such as model adaptation and Variance-BIC (Bayesian Information Criterion). We propose a flexible framework that selects an optimal speaker model (GMM or VQ) based on the BIC according to the duration of utterances. When a speech segment is short, the simple and robust VQ-based method is expected to be chosen, while a GMM can be trained reliably for long segments. On a discussion archive, the proposed method is shown to achieve higher indexing performance than conventional methods. The resulting speaker index is also useful for speaker adaptation of the acoustic model, which improves automatic speech recognition performance.
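Illustration (not part of the original record): the BIC-based choice between a small and a large speaker model described in the abstract can be sketched as below. This is only a generic sketch of BIC model selection, not the authors' formulation. The function names, the parameter counts, and the log-likelihood values are all hypothetical and chosen purely to show how the complexity penalty behaves for short and long segments.

```python
import math


def bic_score(log_likelihood, n_params, n_frames, penalty_weight=1.0):
    """BIC in 'higher is better' form: fit minus a complexity penalty."""
    return log_likelihood - 0.5 * penalty_weight * n_params * math.log(n_frames)


def select_model(candidates, n_frames):
    """Return the name of the candidate with the highest BIC score.

    candidates maps a model name to a (log_likelihood, n_params) pair.
    """
    return max(candidates, key=lambda name: bic_score(*candidates[name], n_frames))


# Hypothetical numbers: the "GMM" entry fits better but has more parameters.
short_segment = {"VQ": (-1300.0, 100), "GMM": (-1250.0, 400)}
long_segment = {"VQ": (-130000.0, 100), "GMM": (-120000.0, 400)}

print(select_model(short_segment, n_frames=50))    # VQ
print(select_model(long_segment, n_frames=5000))   # GMM
```

With few frames, the penalty term outweighs the small gain in likelihood, so the simpler model is chosen. With many frames, the likelihood gain dominates and the larger model is chosen.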
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Speech recognition / Speaker recognition / Discussions / Unsupervised speaker indexing / Model selection / Bayesian information criterion
Paper # NLC2002-80
Date of Issue

Conference Information
Committee NLC
Conference Date 2002/12/13 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Speaker Indexing based on Speaker Model Selection and Automatic Speech Recognition in Discussions
Sub Title (in English)
Keyword(1) Speech recognition
Keyword(2) Speaker recognition
Keyword(3) Discussions
Keyword(4) Unsupervised speaker indexing
Keyword(5) Model selection
Keyword(6) Bayesian information criterion
1st Author's Name Masafumi NISHIDA
1st Author's Affiliation PRESTO, Japan Science and Technology Corporation (JST)
2nd Author's Name Yuya AKITA
2nd Author's Affiliation PRESTO, Japan Science and Technology Corporation (JST) / School of Informatics, Kyoto University
3rd Author's Name Tatsuya KAWAHARA
3rd Author's Affiliation PRESTO, Japan Science and Technology Corporation (JST) / School of Informatics, Kyoto University
Date 2002/12/13
Paper # NLC2002-80
Volume (vol) vol.102
Number (no) 528
Page pp.-
#Pages 6
Date of Issue