Presentation 2008-03-20
Robust Distant Speech Recognition by Combining Variable-term Spectrum Based Position-dependent CMN with Conventional CMN
Longbiao WANG, Seiichi NAKAGAWA, Norihide KITAOKA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In a distant-talking environment, the channel impulse response is longer than the short-term spectral analysis window. Therefore, conventional short-term spectrum based Cepstral Mean Normalization (CMN) is not effective under these conditions. In this paper, we propose a robust distant speech recognition method that combines a short-term spectrum based CMN with a long-term one. We assume that a static speech segment (such as a vowel) affected by reverberation can be modeled by a long-term cepstral analysis. Thus, the effect of long reverberation on a static speech segment may be compensated by the long-term spectrum based CMN. The concept of combining short-term and long-term spectrum based CMN is then extended to an environmentally robust speech recognition method based on Position-Dependent CMN (PDCMN). We call this Variable Term spectrum based PDCMN (VT-PDCMN). Since PDCMN/VT-PDCMN cannot normalize speaker variations, we also combine PDCMN/VT-PDCMN with conventional CMN. We evaluated the proposed method on a limited-vocabulary (100 words) distant-talking isolated word recognition task in a real environment. The proposed method achieved a relative error reduction rate of 60.9% over the conventional short-term spectrum based CMN and 30.6% over the short-term spectrum based PDCMN.
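As a rough illustration of the terms used in the abstract, the Python sketch below shows conventional CMN, a position-dependent variant, and how a relative error reduction rate is usually computed. It is a minimal sketch and not the authors' implementation. The function names, the array layout (frames by cepstral coefficients), and the idea that the position-dependent mean is a vector estimated in advance for a given speaker position are all assumptions made for illustration. The rule the paper uses to combine PDCMN/VT-PDCMN with conventional CMN is not given in this record, so it is not sketched here.

```python
import numpy as np

def conventional_cmn(cepstra):
    """Subtract the utterance's own cepstral mean from every frame.

    cepstra: array of shape (num_frames, num_coeffs); layout is an assumption.
    """
    return cepstra - cepstra.mean(axis=0, keepdims=True)

def position_dependent_cmn(cepstra, position_mean):
    """Subtract a cepstral mean associated with a speaker position.

    position_mean: vector of shape (num_coeffs,), assumed here to have been
    estimated in advance for that position (hypothetical input).
    """
    return cepstra - np.asarray(position_mean)

def relative_error_reduction(baseline_error, proposed_error):
    """Relative error reduction in percent: 100 * (E_base - E_prop) / E_base."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error
```

For example, under this definition a system that lowers a baseline error rate from 10% to 4% has a relative error reduction rate of 60%. These numbers are illustrative only and are not taken from the paper.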
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Robust speech recognition / distant-talking environments / dereverberation / position-dependent CMN / conventional CMN
Paper # SP2007-197
Date of Issue

Conference Information
Committee SP
Conference Date 2008/3/13 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)

Paper Information
Registration To Speech (SP)
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Robust Distant Speech Recognition by Combining Variable-term Spectrum Based Position-dependent CMN with Conventional CMN
Sub Title (in English)
Keyword(1) Robust speech recognition
Keyword(2) distant-talking environments
Keyword(3) dereverberation
Keyword(4) position-dependent CMN
Keyword(5) conventional CMN
1st Author's Name Longbiao WANG
1st Author's Affiliation Department of Information and Computer Sciences, Toyohashi University of Technology
2nd Author's Name Seiichi NAKAGAWA
2nd Author's Affiliation Department of Information and Computer Sciences, Toyohashi University of Technology
3rd Author's Name Norihide KITAOKA
3rd Author's Affiliation Department of Media Science, Nagoya University
Date 2008-03-20
Paper # SP2007-197
Volume (vol) vol.107
Number (no) 551
Page pp.-
#Pages 6