Presentation 2012-12-21
Reduction of cross spectrum for feature-domain sound source separation
Atsushi ANDO, Kenta NIWA, Norihide KITAOKA, Kazuya TAKEDA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Speech source separation is used to recognize simultaneous speech. Conventional source separation methods, especially blind source separation, are computationally expensive because they require iterative learning to estimate the separation filters. To reduce the number of separation filters that must be estimated, we instead separate sounds in the feature domain and use the separated features as inputs for speech recognition. This requires that the relationship between the sources and the recorded signals be linear in that domain. In this paper, we propose a method that reduces the cross spectrum between sources so that this linearity holds approximately. We prove that averaging the power spectra over multiple microphones can reduce the cross spectrum. Experimental results show that the proposed method reduces the cross spectrum and also improves the cepstrum distortion of the separated signals.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) source separation / speech recognition / filterbank outputs / cross spectrum
Paper # SP2012-93
Date of Issue
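
The abstract's central claim, that averaging power spectra over microphones reduces the cross spectrum, can be illustrated with a small numerical sketch. This is not the authors' implementation. It assumes a single frequency bin, two sources with random complex Gaussian spectra, and hypothetical unit-gain transfer functions whose phases are drawn at random for each microphone. The function name `cross_term_ratio` and all parameters are illustrative. Under these assumptions the power spectrum at each microphone is |S1|^2 + |S2|^2 plus a cross term, and averaging over microphones shrinks only the cross term.

```python
import cmath
import math
import random


def cross_term_ratio(num_mics, num_frames=200, seed=0):
    """Ratio of the mic-averaged cross term to the additive power |S1|^2 + |S2|^2."""
    rng = random.Random(seed)
    # Assumption: unit-gain transfer functions, only the phase differs per microphone.
    a1 = [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(num_mics)]
    a2 = [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(num_mics)]

    total_cross = 0.0
    total_additive = 0.0
    for _ in range(num_frames):
        # Assumption: source spectra in this bin are independent complex Gaussians.
        s1 = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        s2 = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))

        # Power spectrum of the mixture, averaged over all microphones.
        avg_power = sum(
            abs(a1[m] * s1 + a2[m] * s2) ** 2 for m in range(num_mics)
        ) / num_mics

        # What the power would be if sources combined linearly in the power domain.
        additive = abs(s1) ** 2 + abs(s2) ** 2

        # The residual is the microphone-averaged cross term.
        total_cross += abs(avg_power - additive)
        total_additive += additive

    return total_cross / total_additive


if __name__ == "__main__":
    for mics in (1, 4, 32):
        print(mics, cross_term_ratio(mics))
```

With these assumptions, the ratio for a larger array is expected to be smaller than for a single microphone, because the per-microphone phase differences partly cancel in the average. Real room transfer functions are not unit-gain or randomly phased, so the amount of reduction in practice would differ.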

Conference Information
Committee SP
Conference Date 2012/12/13 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Reduction of cross spectrum for feature-domain sound source separation
Sub Title (in English)
Keyword(1) source separation
Keyword(2) speech recognition
Keyword(3) filterbank outputs
Keyword(4) cross spectrum
1st Author's Name Atsushi ANDO
1st Author's Affiliation Graduate School of Information Science Nagoya University
2nd Author's Name Kenta NIWA
2nd Author's Affiliation Nippon Telegraph and Telephone Corporation/NTT Media Intelligence Laboratories
3rd Author's Name Norihide KITAOKA
3rd Author's Affiliation Graduate School of Information Science Nagoya University
4th Author's Name Kazuya TAKEDA
4th Author's Affiliation Graduate School of Information Science Nagoya University
Date 2012-12-21
Paper # SP2012-93
Volume (vol) vol.112
Number (no) 369
Page pp.-
#Pages 6
Date of Issue