Paper Abstract and Keywords |
Presentation |
2008-03-20 15:15
[Poster Presentation]
Robust Distant Speech Recognition by Combining Variable-term spectrum Based Position-dependent CMN with Conventional CMN Longbiao Wang, Seiichi Nakagawa (Toyohashi Univ. of Tech.), Norihide Kitaoka (Nagoya Univ.) SP2007-197 |
Abstract |
(in Japanese) |
(See Japanese page) |
(in English) |
In a distant-talking environment, the channel impulse response is longer than the short-term spectral analysis window, so conventional Cepstral Mean Normalization (CMN) based on the short-term spectrum is not effective. In this paper, we propose a robust distant speech recognition method that combines short-term spectrum based CMN with long-term spectrum based CMN. We assume that a static speech segment (such as a vowel) affected by reverberation can be modeled by long-term cepstral analysis, so that the effect of long reverberation on such a segment may be compensated by long-term spectrum based CMN. We then extend the concept of combining short-term and long-term spectrum based CMN to an environmentally robust speech recognition method based on Position-Dependent CMN (PDCMN), which we call Variable-Term spectrum based PDCMN (VT-PDCMN). Since neither PDCMN nor VT-PDCMN can normalize speaker variations, we also combine them with conventional CMN. We evaluated the proposed method on a limited-vocabulary (100 words) distant-talking isolated word recognition task in a real environment. The proposed method achieved a relative error reduction rate of 60.9% over conventional short-term spectrum based CMN and 30.6% over short-term spectrum based PDCMN. |
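The normalization steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the plain-list frame representation, and the idea that the position-dependent mean is supplied by the caller (for example, estimated beforehand for a given speaker position) are all assumptions made for illustration. The analysis window length (short-term versus long-term) is likewise assumed to be handled upstream, when the cepstral frames are computed. The last helper shows the usual definition of a relative error reduction rate; the numbers in the usage lines are hypothetical and are not results from the paper.

```python
from typing import List, Sequence

Frame = List[float]


def cepstral_mean(frames: Sequence[Sequence[float]]) -> Frame:
    """Average each cepstral coefficient over all frames."""
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def subtract_mean(frames: Sequence[Sequence[float]],
                  mean: Sequence[float]) -> List[Frame]:
    """Subtract a cepstral mean vector from every frame."""
    return [[c - m for c, m in zip(f, mean)] for f in frames]


def conventional_cmn(frames: Sequence[Sequence[float]]) -> List[Frame]:
    """Conventional CMN: subtract the utterance's own cepstral mean."""
    return subtract_mean(frames, cepstral_mean(frames))


def position_dependent_cmn(frames: Sequence[Sequence[float]],
                           position_mean: Sequence[float]) -> List[Frame]:
    """PDCMN-style sketch (assumption): subtract a mean supplied for the
    speaker position instead of one computed from the utterance itself."""
    return subtract_mean(frames, position_mean)


def relative_error_reduction(baseline_error: float,
                             proposed_error: float) -> float:
    """Relative error reduction in percent: 100 * (E_base - E_new) / E_base."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Hypothetical two-frame, two-coefficient utterance.
frames = [[1.0, 2.0], [3.0, 6.0]]
normalized = conventional_cmn(frames)            # mean is [2.0, 4.0]
shifted = position_dependent_cmn(frames, [1.0, 1.0])
reduction = relative_error_reduction(10.0, 4.0)  # hypothetical error rates
```

Because the position-dependent mean does not depend on the utterance, it carries no information about the individual speaker, which is consistent with the abstract's remark that PDCMN alone cannot normalize speaker variations.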
Keyword |
(in Japanese) |
(See Japanese page) |
(in English) |
Robust speech recognition / distant-talking environments / dereverberation / position-dependent CMN / conventional CMN |
Reference Info. |
IEICE Tech. Rep., vol. 107, no. 551, SP2007-197, pp. 63-68, March 2008. |
Paper # |
SP2007-197 |
Date of Issue |
2008-03-13 (SP) |
ISSN |
Print edition: ISSN 0913-5685 Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Conference Information |
Committee |
SP |
Conference Date |
2008-03-20 - 2008-03-21 |
Place (in Japanese) |
(See Japanese page) |
Place (in English) |
Univ. Tokyo |
Topics (in Japanese) |
(See Japanese page) |
Topics (in English) |
International Workshop (Mar 20), Speech Production, Speech Perception, Hearing and Speech, etc. (Mar 21) |
Paper Information |
Registration To |
SP |
Conference Code |
2008-03-SP |
Language |
English |
Title (in Japanese) |
(See Japanese page) |
Sub Title (in Japanese) |
(See Japanese page) |
Title (in English) |
Robust Distant Speech Recognition by Combining Variable-term spectrum Based Position-dependent CMN with Conventional CMN |
Keyword(1) |
Robust speech recognition |
Keyword(2) |
distant-talking environments |
Keyword(3) |
dereverberation |
Keyword(4) |
position-dependent CMN |
Keyword(5) |
conventional CMN |
1st Author's Name |
Longbiao Wang |
1st Author's Affiliation |
Toyohashi University of Technology (Toyohashi Univ. of Tech.) |
2nd Author's Name |
Seiichi Nakagawa |
2nd Author's Affiliation |
Toyohashi University of Technology (Toyohashi Univ. of Tech.) |
3rd Author's Name |
Norihide Kitaoka |
3rd Author's Affiliation |
Nagoya University (Nagoya Univ.) |
Speaker |
Author-1 |
Date Time |
2008-03-20 15:15:00 |
Presentation Time |
90 minutes |
Registration for |
SP |
Volume (vol) |
vol.107 |
Number (no) |
no.551 |
Page |
pp.63-68 |
#Pages |
6 |
|