Presentation | 2002/1/17 A Unit Selection Algorithm for Japanese Speech Synthesis Based on Both Phoneme Unit and Diphone Unit Tomoki TODA, Hisashi KAWAI, Minoru TSUZAKI, Kiyohiro SHIKANO |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This report proposes a novel unit selection algorithm for Japanese Text-to-Speech (TTS) systems. Since Japanese syllables consist of CV (C: consonant, V: vowel) or V, except when a vowel is devoiced, CV units are generally used in concatenative TTS systems for Japanese. However, synthetic speech based on CV-unit concatenation sometimes has discontinuities at V-V concatenation points. To alleviate such discontinuities, longer units (CV^* or non-uniform units) have been proposed, but concatenation between V and V remains unavoidable. To address this problem, we propose a novel unit selection algorithm that incorporates not only phoneme units but also diphone units. In the proposed algorithm, concatenation is performed at the vowel center as well as at the phoneme boundary. Results of evaluation experiments show that the proposed algorithm outperforms the conventional algorithm. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Japanese Text-to-Speech / unit selection / vowel sequence / diphone unit / cost function |
Paper # | 2001-SP-120 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2002/1/17 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A Unit Selection Algorithm for Japanese Speech Synthesis Based on Both Phoneme Unit and Diphone Unit |
Sub Title (in English) | |
Keyword(1) | Japanese Text-to-Speech |
Keyword(2) | unit selection |
Keyword(3) | vowel sequence |
Keyword(4) | diphone unit |
Keyword(5) | cost function |
1st Author's Name | Tomoki TODA |
1st Author's Affiliation | ATR Spoken Language Translation Research Laboratories : Graduate School of Information Science, Nara Institute of Science and Technology |
2nd Author's Name | Hisashi KAWAI |
2nd Author's Affiliation | ATR Spoken Language Translation Research Laboratories |
3rd Author's Name | Minoru TSUZAKI |
3rd Author's Affiliation | ATR Spoken Language Translation Research Laboratories |
4th Author's Name | Kiyohiro SHIKANO |
4th Author's Affiliation | Graduate School of Information Science, Nara Institute of Science and Technology |
Date | 2002/1/17 |
Paper # | 2001-SP-120 |
Volume (vol) | vol.101 |
Number (no) | 603 |
Page | pp.- |
#Pages | 8 |
Date of Issue |