Presentation 2019-03-14
[Poster Presentation] Voice activity detection under high levels of noise using gated convolutional neural networks
Li Li, Koshino Yuki, Matsumoto Mitsuo, Makino Shoji
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper addresses voice activity detection (VAD) in high-noise environments where the signal-to-noise ratio (SNR) is lower than -5 dB. Many VAD approaches have been developed in recent decades and shown to be efficient and effective. However, these approaches tend to fail when the SNR becomes critically low in real situations, such as operating rescue robots at a disaster site or navigation in a car moving at high speed. Meanwhile, deep learning techniques have achieved state-of-the-art results in many difficult classification tasks and show high potential for solving difficult VAD tasks. To achieve accurate VAD under high-level noise, this paper proposes a gated convolutional neural network-based approach that can capture both long- and short-term dependencies in time series as cues for detection. Experimental evaluations using the high-level ego-noise of a hose-shaped rescue robot revealed that the proposed method achieved accurate VAD results on average in environments with SNRs ranging from -30 dB to -5 dB.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Voice activity detection / high-level noise environments / deep learning / convolutional neural networks / smoothing
Paper # EA2018-102,SIP2018-108,SP2018-64
Date of Issue 2019-03-07 (EA, SIP, SP)

Conference Information
Committee EA / SIP / SP
Conference Date 2019/3/14 (2 days)
Place (in Japanese) (See Japanese page)
Place (in English) i+Land nagasaki (Nagasaki-shi)
Topics (in Japanese) (See Japanese page)
Topics (in English) Engineering/Electro Acoustics, Signal Processing, Speech, and Related Topics
Chair Suehiro Shimauchi(Kanazawa Inst. of Tech.) / Shogo Muramatsu(Niigata Univ.) / Yoichi Yamashita(Ritsumeikan Univ.)
Vice Chair Kenichi Furuya(Oita Univ.) / Kanji Watanabe(Akita Pref. Univ.) / Naoyuki Aikawa(TUS) / Kazunori Hayashi(Osaka City Univ.) / Akinobu Ri(Nagoya Inst. of Tech.)
Secretary Kenichi Furuya(Shizuoka Inst. of Science and Tech.) / Kanji Watanabe(NHK) / Naoyuki Aikawa(Takushoku Univ.) / Kazunori Hayashi(Hiroshima Univ.) / Akinobu Ri(Kyoto Univ.)
Assistant Keisuke Imoto(Ritsumeikan Univ.) / Daisuke Morikawa(Toyama Pref. Univ.) / Katsumi Konishi(Hosei Univ.) / hyihsin(Takushoku Univ.) / Tomoki Koriyama(Tokyo Inst. of Tech.) / Satoshi Kobashikawa(NTT)

Paper Information
Registration To Technical Committee on Engineering Acoustics / Technical Committee on Signal Processing / Technical Committee on Speech
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) [Poster Presentation] Voice activity detection under high levels of noise using gated convolutional neural networks
Sub Title (in English)
Keyword(1) Voice activity detection
Keyword(2) high-level noise environments
Keyword(3) deep learning
Keyword(4) convolutional neural networks
Keyword(5) smoothing
1st Author's Name Li Li
1st Author's Affiliation University of Tsukuba(Univ. Tsukuba)
2nd Author's Name Koshino Yuki
2nd Author's Affiliation University of Tsukuba(Univ. Tsukuba)
3rd Author's Name Matsumoto Mitsuo
3rd Author's Affiliation University of Tsukuba(Univ. Tsukuba)
4th Author's Name Makino Shoji
4th Author's Affiliation University of Tsukuba(Univ. Tsukuba)
Date 2019-03-14
Paper # EA2018-102,SIP2018-108,SP2018-64
Volume (vol) vol.118
Number (no) EA-495,SIP-496,SP-497
Page pp.19-24(EA), pp.19-24(SIP), pp.19-24(SP)
#Pages 6
Date of Issue 2019-03-07 (EA, SIP, SP)