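Fusing Audio and Video Information toward Detection of Speech Events under Real Environments (Acoustics and Speech Processing, Speech Enhancement, Robust Speech Recognition)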

Presentation	2003/4/17 Fusing Audio and Video Information toward Detection of Speech Events under Real Environments Takashi YOSHIMURA, Futoshi ASANO, Youichi MOTOMURA, Hideki ASOH, Naoyuki ICHIMURA, Kiyoshi YAMAMOTO, Satoshi NAKAMURA
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	This paper proposes a method for detecting and separating speech events under multiple-sound-source conditions using audio and video information. To detect speech events, sound localization using a microphone array and human tracking using stereo vision are combined by means of a Bayesian network. The inference results of the Bayesian network give the time and location of speech events even when multiple sound sources are present. Based on the detected speech event information, a maximum-likelihood adaptive beamformer is constructed, and the speech signal is separated from background noise and interference.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Sound Localization / Human Tracking / Information Fusion / Bayesian Network
Paper #	EA2003-3,SP2003-3
Date of Issue

Paper Information
Registration To	Speech (SP)
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Fusing Audio and Video Information toward Detection of Speech Events under Real Environments
Sub Title (in English)
Keyword(1)	Sound Localization
Keyword(2)	Human Tracking
Keyword(3)	Information Fusion
Keyword(4)	Bayesian Network
1st Author's Name	Takashi YOSHIMURA
1st Author's Affiliation	AIST
2nd Author's Name	Futoshi ASANO
2nd Author's Affiliation	AIST
3rd Author's Name	Youichi MOTOMURA
3rd Author's Affiliation	AIST
4th Author's Name	Hideki ASOH
4th Author's Affiliation	AIST
5th Author's Name	Naoyuki ICHIMURA
5th Author's Affiliation	AIST
6th Author's Name	Kiyoshi YAMAMOTO
6th Author's Affiliation	University of Tsukuba
7th Author's Name	Satoshi NAKAMURA
7th Author's Affiliation	ATR Spoken Language Translation Research Laboratories
Date	2003/4/17
Paper #	EA2003-3,SP2003-3
Volume (vol)	vol.103
Number (no)	26
Page	pp.-
#Pages	6
Date of Issue