Presentation | 2001/12/13 Vocal Tract Length Normalization Using Linear Transformation based on Maximum Likelihood Estimation Jun ROKUI, Mitsuru NAKAI, Hiroshi SHIMODAIRA, Shigeki SAGAYAMA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Vocal tract length normalization (VTLN) is a popular speaker adaptation technique for speech recognition. This study proposes a new VTLN algorithm in which expectation-maximization (EM) based adaptation of HMM parameters to vocal tract length is performed in the mel-cepstral domain using a linear transformation model. Existing VTLN approaches based on the bilinear transformation apply a specific non-linear frequency warping function in the spectral domain and adapt the HMM parameters in the cepstral domain. In contrast, the proposed approach assumes linear frequency warping with a single scaling factor and models the equivalent operation in the mel-cepstral domain using a first-order Taylor series approximation. The proposed scheme significantly improves recognition performance in a speaker-independent word recognition task. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Vocal Tract Length Normalization / Linear Transformation / Maximum Likelihood Estimation / Speaker Adaptation / Speaker Normalization |
Paper # | NLC2001-52,SP2001-87 |
Date of Issue | |

Conference Information | |
---|---|
Committee | NLC |
Conference Date | 2001/12/13 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant | |

Paper Information | |
---|---|
Registration To | Natural Language Understanding and Models of Communication (NLC) |
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Vocal Tract Length Normalization Using Linear Transformation based on Maximum Likelihood Estimation |
Sub Title (in English) | |
Keyword(1) | Vocal Tract Length Normalization |
Keyword(2) | Linear Transformation |
Keyword(3) | Maximum Likelihood Estimation |
Keyword(4) | Speaker Adaptation |
Keyword(5) | Speaker Normalization |
1st Author's Name | Jun ROKUI |
1st Author's Affiliation | Japan Advanced Institute of Science and Technology, Hokuriku, Dept. of Information Science |
2nd Author's Name | Mitsuru NAKAI |
2nd Author's Affiliation | Japan Advanced Institute of Science and Technology, Hokuriku, Dept. of Information Science |
3rd Author's Name | Hiroshi SHIMODAIRA |
3rd Author's Affiliation | Japan Advanced Institute of Science and Technology, Hokuriku, Dept. of Information Science |
4th Author's Name | Shigeki SAGAYAMA |
4th Author's Affiliation | The University of Tokyo, Graduate School of Information Science and Technology |
Date | 2001/12/13 |
Paper # | NLC2001-52,SP2001-87 |
Volume (vol) | vol.101 |
Number (no) | 520 |
Page | pp.- |
#Pages | 6 |
Date of Issue | |