Presentation 2005/12/15
A Study on Endpoint Detection for Speech Recognition Based on Discriminative Feature Extraction
Koichi Yamamoto, Jabloun Firas, Klaus Reinhard, Akinori Kawamura
Abstract(in English) Accurate endpoint detection is important for improving speech recognition performance. This paper proposes a novel endpoint detection method that combines energy-based and likelihood-ratio-based voice activity detection (VAD) criteria, where the likelihood ratio is calculated with speech/non-speech Gaussian mixture models (GMMs). Moreover, the proposed method introduces discriminative feature extraction (DFE) to improve speech/non-speech classification. DFE is used to train the parameters required for calculating the likelihood ratio. Our experimental evaluation showed that the proposed method reduces the recognition error rate compared to a conventional energy-based technique.
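The combination of criteria described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the two criteria are combined, so the AND rule, the threshold values, the diagonal-covariance GMM layout, and every function name below are assumptions made for illustration. DFE training of the parameters is not shown; the GMM parameters are simply taken as given.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as a list of (weight, mean, var)."""
    logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(l - peak) for l in logs))

def log_energy(frame):
    """Log of the mean squared amplitude of one frame of samples."""
    return math.log(sum(s * s for s in frame) / len(frame) + 1e-12)

def is_speech(frame, feature, speech_gmm, noise_gmm,
              energy_thresh=-6.0, llr_thresh=0.0):
    """Assumed combination rule: a frame is speech only if both the
    energy criterion and the likelihood-ratio criterion say so."""
    llr = (gmm_log_likelihood(feature, speech_gmm)
           - gmm_log_likelihood(feature, noise_gmm))
    return log_energy(frame) > energy_thresh and llr > llr_thresh

def endpoints(decisions):
    """Return (first, last) indices of frames marked as speech, or None."""
    speech = [i for i, d in enumerate(decisions) if d]
    return (speech[0], speech[-1]) if speech else None
```

With per-frame decisions collected into a list, `endpoints` returns the first and last speech frame as a simple endpoint estimate. A practical detector would normally add smoothing or hangover logic, which is omitted here.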
Keyword(in English) Endpoint detection / VAD / DFE / GMM
Paper # NLC2005-93,SP2005-126

Conference Information
Committee NLC
Conference Date 2005/12/15 (1 day)

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language ENG
Title (in English) A Study on Endpoint Detection for Speech Recognition Based on Discriminative Feature Extraction
Keyword(1) Endpoint detection
Keyword(2) VAD
Keyword(3) DFE
Keyword(4) GMM
1st Author's Name Koichi Yamamoto
1st Author's Affiliation Multimedia Laboratory, Corporate R&D Center, Toshiba Corp.
2nd Author's Name Jabloun Firas
2nd Author's Affiliation Speech Technology Group, Cambridge Research Laboratory, Toshiba Research Europe Ltd.
3rd Author's Name Klaus Reinhard
3rd Author's Affiliation Speech Technology Group, Cambridge Research Laboratory, Toshiba Research Europe Ltd.
4th Author's Name Akinori Kawamura
4th Author's Affiliation Multimedia Laboratory, Corporate R&D Center, Toshiba Corp.
Date 2005/12/15
Paper # NLC2005-93,SP2005-126
Volume (vol) vol.105
Number (no) 494
Page pp.-
#Pages 6