Presentation | 2022-06-17 Neural beamformer with automatic detection of notable sounds for acoustic scene classification Sota Ichikawa, Takeshi Yamada, Shoji Makino |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Recently, acoustic scene classification using a beamformer with multi-channel signals as input has been proposed. Generally, prior information such as the direction of arrival of the target sound is necessary to generate the spatial filter of the beamformer. However, it is difficult to apply a beamformer as a pre-processing method because it is not obvious which sounds are of interest in a particular acoustic scene or in which direction they are located. We have devised an approach in which the networks of a spatial filter generator and a classifier are concatenated and optimized simultaneously. The goal of this approach is to automatically learn which sounds to focus on and to generate spatial filters that emphasize them, without requiring any prior information such as the direction of arrival of the target sound or a reference signal. This paper proposes a loss function that incorporates the idea of the MVDR (Minimum Variance Distortionless Response) beamformer and verifies its effectiveness through experiments. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Acoustic Scene Classification / Neural Beamformer / Loss function / MVDR |
Paper # | SP2022-10 |
Date of Issue | 2022-06-10 (SP) |

Conference Information

Committee | SP / IPSJ-MUS / IPSJ-SLP |
---|---|
Conference Date | 2022/6/17(2days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Online |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Tomoki Toda(Nagoya Univ.) |
Vice Chair | |
Secretary | (NTT) / (Univ. of Electro-Comm.) |
Assistant | Ryo Aihara(Mitsubishi Electric) / Daisuke Saito(Univ. of Tokyo) |

Paper Information

Registration To | Technical Committee on Speech / Special Interest Group on Music and Computer / Special Interest Group on Spoken Language Processing |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Neural beamformer with automatic detection of notable sounds for acoustic scene classification |
Sub Title (in English) | |
Keyword(1) | Acoustic Scene Classification |
Keyword(2) | Neural Beamformer |
Keyword(3) | Loss function |
Keyword(4) | MVDR |
1st Author's Name | Sota Ichikawa |
1st Author's Affiliation | University of Tsukuba(Univ. of Tsukuba) |
2nd Author's Name | Takeshi Yamada |
2nd Author's Affiliation | University of Tsukuba(Univ. of Tsukuba) |
3rd Author's Name | Shoji Makino |
3rd Author's Affiliation | Waseda University/University of Tsukuba(Waseda Univ./Univ. of Tsukuba) |
Date | 2022-06-17 |
Paper # | SP2022-10 |
Volume (vol) | vol.122 |
Number (no) | SP-81 |
Page | pp.35-40(SP) |
#Pages | 6 |
Date of Issue | 2022-06-10 (SP) |