Presentation | 2011-06-23 Speech recognition in mixed sound of speech and music by vector quantization and non-negative matrix factorization Shoichi NAKANO, Kazumasa YAMAMOTO, Seiichi NAKAGAWA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Speech recognition in the presence of noise requires reducing the effect of that noise. Spectral subtraction and Wiener-filter-based methods are common noise-removal techniques. Although they work for stationary noise, they are not effective for non-stationary noise. This paper describes a speech recognition method for mixed sound, consisting of speech and music, that removes only the music by using vector quantization and non-negative matrix factorization. For isolated word recognition using the clean speech model, an improvement of about 15% was obtained compared with the case where the music was not removed. Furthermore, a high recognition rate of about 90% was achieved, even under the 0 dB condition, using a model trained on the mixed sound after music removal. We also applied the proposed method to piano trio music and confirmed its effectiveness. Finally, we compared human performance in a listening test with machine recognition performance. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | speech recognition / mixed sound / music removal / piano trio / vector quantization / non-negative matrix factorization |
Paper # | SP2011-34 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2011/6/16 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Speech recognition in mixed sound of speech and music by vector quantization and non-negative matrix factorization |
Sub Title (in English) | |
Keyword(1) | speech recognition |
Keyword(2) | mixed sound |
Keyword(3) | music removal |
Keyword(4) | piano trio |
Keyword(5) | vector quantization |
Keyword(6) | non-negative matrix factorization |
1st Author's Name | Shoichi NAKANO |
1st Author's Affiliation | Toyohashi University of Technology |
2nd Author's Name | Kazumasa YAMAMOTO |
2nd Author's Affiliation | Toyohashi University of Technology |
3rd Author's Name | Seiichi NAKAGAWA |
3rd Author's Affiliation | Toyohashi University of Technology |
Date | 2011-06-23 |
Paper # | SP2011-34 |
Volume (vol) | vol.111 |
Number (no) | 97 |
Page | pp.- |
#Pages | 6 |
Date of Issue |