Presentation | 2002/4/19 Robust F_0 Extraction for Noisy Environments and Its Use for Speech Recognition Koji IWANO, Takahiro SEKI, Sadaoki FURUI |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper proposes a noise-robust speech recognition method using prosodic information. In Japanese, the fundamental frequency (F_0) contour represents phrase intonation and word accent information. Consequently, it conveys information about prosodic phrase and word boundaries. We have developed a robust F_0 extraction method using the Hough transform, which yields a high extraction rate under various noise conditions. In this paper, we propose a noise-robust speech recognition method using syllable HMMs that model both segmental spectral features and F_0 contour information. Speaker-independent experiments are conducted using connected digits uttered by 11 male speakers under various noise types and SNR conditions. Recognition accuracy improves in all noise conditions, and the best absolute improvement in digit accuracy is about 4.7%. This improvement results from more precise digit boundary detection enabled by the robust prosodic information. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Noise robust speech recognition / Prosodic information / Fundamental frequency (F_0) contour |
Paper # | SP2002-13 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2002/4/19 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Robust F_0 Extraction for Noisy Environments and Its Use for Speech Recognition |
Sub Title (in English) | |
Keyword(1) | Noise robust speech recognition |
Keyword(2) | Prosodic information |
Keyword(3) | Fundamental frequency (F_0) contour |
1st Author's Name | Koji IWANO |
1st Author's Affiliation | Department of Computer Science, Tokyo Institute of Technology |
2nd Author's Name | Takahiro SEKI |
2nd Author's Affiliation | Department of Computer Science, Tokyo Institute of Technology |
3rd Author's Name | Sadaoki FURUI |
3rd Author's Affiliation | Department of Computer Science, Tokyo Institute of Technology |
Date | 2002/4/19 |
Paper # | SP2002-13 |
Volume (vol) | vol.102 |
Number (no) | 35 |
Page | pp.- |
#Pages | 6 |
Date of Issue |