Presentation | 2003/12/11 Noise-robust speech recognition using band-dependent weighted likelihood Yoshitaka NISHIMURA, Takahiro SHINOZAKI, Koji IWANO, Sadaoki FURUI |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In most state-of-the-art automatic speech recognition (ASR) systems, speech is converted into a time sequence of MFCC (mel-frequency cepstral coefficient) vectors. However, the MFCC has the problem that noise effects spread over all the coefficients even when the noise is confined to a narrow frequency range. If a spectral feature is used directly, this problem can be avoided, and robustness against noise can therefore be expected to increase. Although various studies on spectral-domain features have been conducted, improved recognition performance has been reported only under limited noise conditions. This paper proposes a novel multi-band ASR method using a new log-spectral domain feature. Experimental results on speech with added babble noise show that the proposed method improves recognition performance compared with the MFCC-based method. The performance is further improved by a spectral-peak weighting technique. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | spectrum / MFCC / multi-band ASR / spectral-peak |
Paper # | NLC2003-53 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2003/12/11 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Noise-robust speech recognition using band-dependent weighted likelihood |
Sub Title (in English) | |
Keyword(1) | spectrum |
Keyword(2) | MFCC |
Keyword(3) | multi-band ASR |
Keyword(4) | spectral-peak |
1st Author's Name | Yoshitaka NISHIMURA |
1st Author's Affiliation | Graduate School of Information Science and Engineering, Tokyo Institute of Technology |
2nd Author's Name | Takahiro SHINOZAKI |
2nd Author's Affiliation | Graduate School of Information Science and Engineering, Tokyo Institute of Technology |
3rd Author's Name | Koji IWANO |
3rd Author's Affiliation | Graduate School of Information Science and Engineering, Tokyo Institute of Technology |
4th Author's Name | Sadaoki FURUI |
4th Author's Affiliation | Graduate School of Information Science and Engineering, Tokyo Institute of Technology |
Date | 2003/12/11 |
Paper # | NLC2003-53 |
Volume (vol) | vol.103 |
Number (no) | 517 |
Page | pp.- |
#Pages | 6 |
Date of Issue |