Presentation | 2019-03-14 [Poster Presentation] Voice activity detection under high levels of noise using gated convolutional neural networks Li Li, Koshino Yuki, Matsumoto Mitsuo, Makino Shoji |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper addresses voice activity detection (VAD) in high-level noise environments where signal-to-noise ratios (SNRs) are lower than -5 dB. Many VAD approaches have been developed in recent decades and shown to be efficient and effective. However, these approaches tend to fail when SNRs become critically low in real situations, such as rescue robots operating in a disaster or navigation in a car moving at high speed. Deep learning techniques, on the other hand, have achieved state-of-the-art results in many difficult classification tasks and show high potential for solving difficult VAD tasks. To achieve accurate VAD results in high-level noise environments, this paper proposes a gated convolutional neural network-based approach that captures long- and short-term dependencies in time series as cues for detection. Experimental evaluations using high-level ego-noise of a hose-shaped rescue robot revealed that the proposed method achieved accurate VAD results on average in environments with SNRs in the range of -30 dB to -5 dB. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Voice activity detection / high-level noise environments / deep learning / convolutional neural networks / smoothing |
Paper # | EA2018-102,SIP2018-108,SP2018-64 |
Date of Issue | 2019-03-07 (EA, SIP, SP) |
Conference Information | |
Committee | EA / SIP / SP |
---|---|
Conference Date | 2019-03-14 (2 days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | i+Land Nagasaki (Nagasaki-shi) |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Engineering/Electro Acoustics, Signal Processing, Speech, and Related Topics |
Chair | Suehiro Shimauchi (Kanazawa Inst. of Tech.) / Shogo Muramatsu (Niigata Univ.) / Yoichi Yamashita (Ritsumeikan Univ.) |
Vice Chair | Kenichi Furuya (Oita Univ.) / Kanji Watanabe (Akita Pref. Univ.) / Naoyuki Aikawa (TUS) / Kazunori Hayashi (Osaka City Univ.) / Akinobu Ri (Nagoya Inst. of Tech.) |
Secretary | Kenichi Furuya (Shizuoka Inst. of Science and Tech.) / Kanji Watanabe (NHK) / Naoyuki Aikawa (Takushoku Univ.) / Kazunori Hayashi (Hiroshima Univ.) / Akinobu Ri (Kyoto Univ.) |
Assistant | Keisuke Imoto (Ritsumeikan Univ.) / Daisuke Morikawa (Toyama Pref. Univ.) / Katsumi Konishi (Hosei Univ.) / hyihsin (Takushoku Univ.) / Tomoki Koriyama (Tokyo Inst. of Tech.) / Satoshi Kobashikawa (NTT) |
Paper Information | |
Registration To | Technical Committee on Engineering Acoustics / Technical Committee on Signal Processing / Technical Committee on Speech |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | [Poster Presentation] Voice activity detection under high levels of noise using gated convolutional neural networks |
Sub Title (in English) | |
Keyword(1) | Voice activity detection |
Keyword(2) | high-level noise environments |
Keyword(3) | deep learning |
Keyword(4) | convolutional neural networks |
Keyword(5) | smoothing |
1st Author's Name | Li Li |
1st Author's Affiliation | University of Tsukuba (Univ. Tsukuba) |
2nd Author's Name | Koshino Yuki |
2nd Author's Affiliation | University of Tsukuba (Univ. Tsukuba) |
3rd Author's Name | Matsumoto Mitsuo |
3rd Author's Affiliation | University of Tsukuba (Univ. Tsukuba) |
4th Author's Name | Makino Shoji |
4th Author's Affiliation | University of Tsukuba (Univ. Tsukuba) |
Date | 2019-03-14 |
Volume (vol) | vol.118 |
Number (no) | EA-495,SIP-496,SP-497 |
Page | pp.19-24 (EA), pp.19-24 (SIP), pp.19-24 (SP) |
#Pages | 6 |