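Noisy speech recognition using an integrated method of voice activity detection and noise suppression (Acoustic Processing and Speaker Identification, 10th Spoken Language Symposium)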

Presentation	2008-12-09 Noisy speech recognition using integrated method of statistical model-based voice activity detection and noise suppression Masakiyo FUJIMOTO, Kentaro ISHIZUKA, Tomohiro NAKATANI
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	This paper addresses robust front-end processing for automatic speech recognition in noise. The proposed method integrates voice activity detection (VAD) and noise suppression, and consists of three core techniques: (1) statistical model sharing, (2) Wiener filter design using speech/non-speech probabilities, and (3) VAD improvement using enhanced speech. In addition, the proposed method can perform sequential processing without frame delay. In an evaluation, the proposed method significantly improves the accuracy of concatenated speech recognition without frame delay. We also investigate combining cepstrum mean normalization and sequential acoustic model adaptation with the proposed method.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	integrated front-end processing / voice activity detection / noise suppression / sequential processing / speech recognition
Paper #	NLC2008-26,SP2008-81
Date of Issue

Paper Information
Registration To	Natural Language Understanding and Models of Communication (NLC)
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Noisy speech recognition using integrated method of statistical model-based voice activity detection and noise suppression
Sub Title (in English)
Keyword(1)	integrated front-end processing
Keyword(2)	voice activity detection
Keyword(3)	noise suppression
Keyword(4)	sequential processing
Keyword(5)	speech recognition
1st Author's Name	Masakiyo FUJIMOTO
1st Author's Affiliation	NTT Communication Science Laboratories, NTT Corp.
2nd Author's Name	Kentaro ISHIZUKA
2nd Author's Affiliation	NTT Communication Science Laboratories, NTT Corp.
3rd Author's Name	Tomohiro NAKATANI
3rd Author's Affiliation	NTT Communication Science Laboratories, NTT Corp.
Date	2008-12-09
Paper #	NLC2008-26,SP2008-81
Volume (vol)	vol.108
Number (no)	337
Page	pp.-
#Pages	6