Presentation 2010-05-26
Improvement of speech recognition performance based on the conversion of Lombard features
Yuji UEMURA, Masanori MORISE, Takanobu NISHIURA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Automatic speech recognition (ASR) in noisy environments remains a challenging topic. In particular, speech produced in a noisy environment is considerably distorted compared with neutral speech produced in a quiet one. This distortion is called the Lombard effect, and it degrades ASR performance. The effect is expected to be strong when the speaker has no auditory feedback. Previous studies report that Lombard speech tends to show increased power, a higher fundamental frequency (F0), a flatter spectral envelope, and an upward shift of the first formant frequency (F1) and the second formant frequency (F2). Without any specific countermeasure, these features degrade ASR performance. To analyze Lombard features, we recorded Lombard speech and constructed a Lombard speech corpus. We then discriminated between neutral speech and Lombard speech using the analyzed features, in both a subjective evaluation and an objective evaluation. As a result, we confirmed a discrimination rate of over 80 % in both evaluations. In this paper, we propose a new approach based on voice conversion from Lombard speech toward neutral speech. In evaluation experiments, we confirmed that the proposed method improves ASR performance by up to 10 % for female speech and 4 % for male speech.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Lombard effects / ASR / recognition performance / Lombard features
Paper # EA2010-1,SIP2010-1,SP2010-1
Date of Issue

Conference Information
Committee SIP
Conference Date 2010/5/19 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Signal Processing (SIP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Improvement of speech recognition performance based on the conversion of Lombard features
Sub Title (in English)
Keyword(1) Lombard effects
Keyword(2) ASR
Keyword(3) recognition performance
Keyword(4) Lombard features
1st Author's Name Yuji UEMURA
1st Author's Affiliation Graduate School of Science and Engineering, Ritsumeikan University
2nd Author's Name Masanori MORISE
2nd Author's Affiliation College of Information Science and Engineering, Ritsumeikan University
3rd Author's Name Takanobu NISHIURA
3rd Author's Affiliation College of Information Science and Engineering, Ritsumeikan University
Date 2010-05-26
Volume (vol) vol.110
Number (no) 55
Page pp.-
#Pages 6