Presentation | 2008-03-20 Robust Distant Speech Recognition by Combining Variable-term spectrum Based Position-dependent CMN with Conventional CMN Longbiao WANG, Seiichi NAKAGAWA, Norihide KITAOKA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In a distant-talking environment, the channel impulse response is longer than the short-term spectral analysis window, so conventional short-term spectrum based Cepstral Mean Normalization (CMN) is not effective. In this paper, we propose a robust distant speech recognition method that combines a short-term spectrum based CMN with a long-term one. We assume that a static speech segment (such as a vowel) affected by reverberation can be modeled by a long-term cepstral analysis, so the effect of long reverberation on such a segment may be compensated by the long-term spectrum based CMN. The concept of combining short-term and long-term spectrum based CMN is then extended to an environmentally robust speech recognition method based on Position-Dependent CMN (PDCMN), which we call Variable-Term spectrum based PDCMN (VT-PDCMN). Since PDCMN/VT-PDCMN cannot normalize speaker variations, we also combine PDCMN/VT-PDCMN with conventional CMN. We evaluated the proposed method on limited-vocabulary (100 words) distant-talking isolated word recognition in a real environment. It achieved a relative error reduction rate of 60.9% over the conventional short-term spectrum based CMN and 30.6% over the short-term spectrum based PDCMN. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Robust speech recognition / distant-talking environments / dereverberation / position-dependent CMN / conventional CMN |
Paper # | SP2007-197 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2008/3/13 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Robust Distant Speech Recognition by Combining Variable-term spectrum Based Position-dependent CMN with Conventional CMN |
Sub Title (in English) | |
Keyword(1) | Robust speech recognition |
Keyword(2) | distant-talking environments |
Keyword(3) | dereverberation |
Keyword(4) | position-dependent CMN |
Keyword(5) | conventional CMN |
1st Author's Name | Longbiao WANG |
1st Author's Affiliation | Department of Information and Computer Sciences, Toyohashi University of Technology |
2nd Author's Name | Seiichi NAKAGAWA |
2nd Author's Affiliation | Department of Information and Computer Sciences, Toyohashi University of Technology |
3rd Author's Name | Norihide KITAOKA |
3rd Author's Affiliation | Department of Media Science, Nagoya University |
Date | 2008-03-20 |
Paper # | SP2007-197 |
Volume (vol) | vol.107 |
Number (no) | 551 |
Page | pp.- |
#Pages | 6 |
Date of Issue |
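
For orientation, the conventional CMN that the abstract uses as its baseline can be sketched as below. This is a minimal illustration of the standard utterance-level technique only, not the authors' PDCMN or VT-PDCMN. The function name, the array shape (frames by coefficients), and the synthetic data are assumptions made for the example. It shows why CMN handles a short channel: a fixed channel adds a constant offset to every cepstral frame, and subtracting the utterance mean removes that offset.

```python
import numpy as np


def cepstral_mean_normalization(cepstra: np.ndarray) -> np.ndarray:
    """Subtract the per-utterance mean of each cepstral coefficient.

    cepstra: array of shape (num_frames, num_coeffs), one row per frame.
    """
    return cepstra - cepstra.mean(axis=0, keepdims=True)


# Synthetic stand-in for cepstral features (hypothetical data, 13 coefficients).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))

# A short, time-invariant channel appears as a constant additive offset
# in the cepstral domain (hypothetical values).
channel_offset = np.linspace(-1.0, 1.0, 13)
observed = clean + channel_offset

normalized = cepstral_mean_normalization(observed)
```

After normalization, every coefficient has zero mean over the utterance, and the result is the same as normalizing the clean features directly. That equivalence rests on the channel being shorter than the analysis window, which is the assumption the abstract says fails in distant-talking environments.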