Presentation | 2003/8/15 Speaker adaptation using context clustering decision tree for HMM-based speech synthesis Junichi YAMAGISHI, Takashi MASUKO, Keiichi TOKUDA, Takao KOBAYASHI |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | To synthesize speech with arbitrary individualities and/or emotional expressions, segment-based features have to be used as well as frame-based features. In this paper, to realize MLLR (Maximum Likelihood Linear Regression) based speaker adaptation that reflects those segment-based features in HMM-based speech synthesis, we propose a technique that ties regression matrices using the context clustering decision trees constructed in the training stage. Because the set of questions used to construct the context clustering decision trees contains questions related to segment-based features such as position and length, segment-based features can be incorporated into the adaptation. We show that speech synthesized from the model adapted with the proposed technique can have segment-based features. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | HMM-based speech synthesis / Speaker adaptation / Maximum likelihood linear regression / Decision tree / Voice characteristics and prosodic features / Segment-based features |
Paper # | SP2003-79 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2003/8/15 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Speaker adaptation using context clustering decision tree for HMM-based speech synthesis |
Sub Title (in English) | |
Keyword(1) | HMM-based speech synthesis |
Keyword(2) | Speaker adaptation |
Keyword(3) | Maximum likelihood linear regression |
Keyword(4) | Decision tree |
Keyword(5) | Voice characteristics and prosodic features |
Keyword(6) | Segment-based features |
1st Author's Name | Junichi YAMAGISHI |
1st Author's Affiliation | Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology |
2nd Author's Name | Takashi MASUKO |
2nd Author's Affiliation | Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology |
3rd Author's Name | Keiichi TOKUDA |
3rd Author's Affiliation | Department of Computer Science, Nagoya Institute of Technology |
4th Author's Name | Takao KOBAYASHI |
4th Author's Affiliation | Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology |
Date | 2003/8/15 |
Paper # | SP2003-79 |
Volume (vol) | vol.103 |
Number (no) | 264 |
Page | pp.- |
#Pages | 6 |
Date of Issue |