Presentation | 2000/12/14 A Robust End Point Detection by Speaker's Facial Image Kazumasa MURAI, Keisuke NOMA, Kenichi KUMATANI, Tomoko MATSUI, Satoshi NAKAMURA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In this paper, we propose a method for end point detection (EPD), that is, detecting the end points of speaking sections, using visual information. It is well known that the accuracy of EPD affects speech recognition accuracy. Detecting speech end points from a noisy audio signal is difficult because the speech is masked by the audio noise. We propose an EPD method that uses images of the speaker's facial motion, which are not affected by audio noise. Our method locates the skin area using color information and estimates the area that includes the speech organs. The end points are then detected from the rate at which the image in that area changes. An evaluation experiment also confirms that the proposed method is robust with respect to visual noise. With or without visual noise, its accuracy is 99.8%, whereas audio-based EPD (SNR 25 dB) achieves 97.5%. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Speech Recognition / Speaking Section / Facial Image / Skin Color / End Point Detection |
Paper # | NLC2000-39,SP2000-87 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2000/12/14 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A Robust End Point Detection by Speaker's Facial Image |
Sub Title (in English) | |
Keyword(1) | Speech Recognition |
Keyword(2) | Speaking Section |
Keyword(3) | Facial Image |
Keyword(4) | Skin Color |
Keyword(5) | End Point Detection |
1st Author's Name | Kazumasa MURAI |
1st Author's Affiliation | ATR Spoken Language Translation Research Laboratories : Graduate School of Information Science, Nara Institute of Science and Technology |
2nd Author's Name | Keisuke NOMA |
2nd Author's Affiliation | Graduate School of Information Science, Nara Institute of Science and Technology |
3rd Author's Name | Kenichi KUMATANI |
3rd Author's Affiliation | ATR Spoken Language Translation Research Laboratories : Graduate School of Information Science, Nara Institute of Science and Technology |
4th Author's Name | Tomoko MATSUI |
4th Author's Affiliation | ATR Spoken Language Translation Research Laboratories |
5th Author's Name | Satoshi NAKAMURA |
5th Author's Affiliation | ATR Spoken Language Translation Research Laboratories |
Date | 2000/12/14 |
Paper # | NLC2000-39,SP2000-87 |
Volume (vol) | vol.100 |
Number (no) | 520 |
Page | pp.- |
#Pages | 6 |
Date of Issue |
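
The English abstract outlines three steps: locate the skin area by color, estimate the area containing the speech organs, and detect end points from how fast the image changes. The following is a minimal sketch of one possible reading of those steps, not the authors' implementation. Everything specific in it is an assumption made for illustration: the RGB skin rule in `is_skin`, the choice of the lower third of the skin bounding box as the mouth area in `mouth_box`, the mean absolute luminance difference as the motion measure, and the `threshold` and `hangover` parameters. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]      # (r, g, b), 0..255
Frame = List[List[Pixel]]         # rows of pixels
Box = Tuple[int, int, int, int]   # top, left, bottom, right (inclusive)


def is_skin(p: Pixel) -> bool:
    # Assumed heuristic RGB skin rule; the paper's color model is not given.
    r, g, b = p
    return (r > 95 and g > 40 and b > 20 and r > g and r > b
            and max(p) - min(p) > 15 and r - g > 15)


def skin_box(frame: Frame) -> Optional[Box]:
    # Bounding box of all skin-colored pixels, or None if there are none.
    rows = [y for y, row in enumerate(frame) if any(is_skin(p) for p in row)]
    cols = [x for row in frame for x, p in enumerate(row) if is_skin(p)]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


def mouth_box(face: Box) -> Box:
    # Assumption: the speech organs lie in the lower third of the skin box.
    top, left, bottom, right = face
    third = max(1, (bottom - top + 1) // 3)
    return (bottom - third + 1, left, bottom, right)


def luminance(p: Pixel) -> float:
    r, g, b = p
    return 0.299 * r + 0.587 * g + 0.114 * b


def region_motion(prev: Frame, cur: Frame, box: Box) -> float:
    # Mean absolute luminance change inside the box between two frames.
    top, left, bottom, right = box
    total, count = 0.0, 0
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            total += abs(luminance(cur[y][x]) - luminance(prev[y][x]))
            count += 1
    return total / count


def motion_track(frames: List[Frame]) -> List[float]:
    # One motion value per consecutive frame pair; box taken from frame 0.
    face = skin_box(frames[0])
    if face is None:
        return [0.0] * (len(frames) - 1)
    box = mouth_box(face)
    return [region_motion(a, b, box) for a, b in zip(frames, frames[1:])]


def detect_segments(motion: List[float], threshold: float,
                    hangover: int) -> List[Tuple[int, int]]:
    # Speaking sections as (start, end) indices into `motion`; quiet gaps of
    # at most `hangover` values do not split a section.
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None
    last_active = -1
    for i, m in enumerate(motion):
        if m > threshold:
            if start is None:
                start = i
            last_active = i
        elif start is not None and i - last_active > hangover:
            segments.append((start, last_active))
            start = None
    if start is not None:
        segments.append((start, last_active))
    return segments
```

A typical call would be `detect_segments(motion_track(frames), threshold, hangover)`, with both parameters tuned on real video. The hangover keeps brief pauses in lip movement from splitting one utterance into several sections.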