Neural beamformer with automatic detection of notable sounds for acoustic scene classification

Presentation	2022-06-17 Neural beamformer with automatic detection of notable sounds for acoustic scene classification Sota Ichikawa, Takeshi Yamada, Shoji Makino
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	Recently, acoustic scene classification using a beamformer with multi-channel signals as input has been proposed. Generally, prior information such as the direction of arrival of the target sound is necessary to generate the spatial filter of the beamformer. However, it is difficult to apply a beamformer as a pre-processing method because it is not obvious which sounds are of interest in a particular acoustic scene or from which direction they arrive. We have devised an approach in which the networks of a spatial filter generator and a classifier are concatenated and optimized simultaneously. The goal of this approach is to learn automatically which sounds to focus on and to generate spatial filters that emphasize them, without requiring any prior information such as the direction of arrival of the target sound or a reference signal. This paper proposes a loss function that incorporates the idea of the MVDR (Minimum Variance Distortionless Response) beamformer and verifies its effectiveness through experiments.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Acoustic Scene Classification / Neural Beamformer / Loss function / MVDR
Paper #	SP2022-10
Date of Issue	2022-06-10 (SP)

Conference Information
Committee	SP / IPSJ-MUS / IPSJ-SLP
Conference Date	2022-06-17 (2 days)
Place (in Japanese)	(See Japanese page)
Place (in English)	Online
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair	Tomoki Toda(Nagoya Univ.)
Vice Chair
Secretary	(NTT) / (Univ. of Electro-Comm.)
Assistant	Ryo Aihara(Mitsubishi Electric) / Daisuke Saito(Univ. of Tokyo)

Paper Information
Registration To	Technical Committee on Speech / Special Interest Group on Music and Computer / Special Interest Group on Spoken Language Processing
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Neural beamformer with automatic detection of notable sounds for acoustic scene classification
Sub Title (in English)
Keyword(1)	Acoustic Scene Classification
Keyword(2)	Neural Beamformer
Keyword(3)	Loss function
Keyword(4)	MVDR
1st Author's Name	Sota Ichikawa
1st Author's Affiliation	University of Tsukuba(Univ. of Tsukuba)
2nd Author's Name	Takeshi Yamada
2nd Author's Affiliation	University of Tsukuba(Univ. of Tsukuba)
3rd Author's Name	Shoji Makino
3rd Author's Affiliation	Waseda University/University of Tsukuba(Waseda Univ./Univ. of Tsukuba)
Date	2022-06-17
Paper #	SP2022-10
Volume (vol)	vol.122
Number (no)	SP-81
Page	pp.35-40(SP)
#Pages	6
Date of Issue	2022-06-10 (SP)