Presentation 2002/1/17
A Unit Selection Algorithm for Japanese Speech Synthesis Based on Both Phoneme Unit and Diphone Unit
Tomoki TODA, Hisashi KAWAI, Minoru TSUZAKI, Kiyohiro SHIKANO
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This report proposes a novel unit selection algorithm for Japanese Text-to-Speech (TTS) systems. Since Japanese syllables consist of CV (C: consonant, V: vowel) or V, except when a vowel is devoiced, CV units are generally used in concatenative TTS systems for Japanese. However, synthetic speech based on CV-unit concatenation sometimes has discontinuities at V-V concatenation points. To alleviate such discontinuities, longer units (CV^* or non-uniform units) have been proposed. However, concatenation between V and V is still unavoidable. To address this problem, we propose a novel unit selection algorithm that incorporates not only phoneme units but also diphone units. Concatenation in the proposed algorithm is performed at the vowel center as well as at the phoneme boundary. Results of evaluation experiments show that the proposed algorithm outperforms the conventional algorithm.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Japanese Text-to-Speech / unit selection / vowel sequence / diphone unit / cost function
Paper # 2001-SP-120
Date of Issue

Conference Information
Committee SP
Conference Date 2002/1/17 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A Unit Selection Algorithm for Japanese Speech Synthesis Based on Both Phoneme Unit and Diphone Unit
Sub Title (in English)
Keyword(1) Japanese Text-to-Speech
Keyword(2) unit selection
Keyword(3) vowel sequence
Keyword(4) diphone unit
Keyword(5) cost function
1st Author's Name Tomoki TODA
1st Author's Affiliation ATR Spoken Language Translation Research Laboratories : Graduate School of Information Science, Nara Institute of Science and Technology
2nd Author's Name Hisashi KAWAI
2nd Author's Affiliation ATR Spoken Language Translation Research Laboratories
3rd Author's Name Minoru TSUZAKI
3rd Author's Affiliation ATR Spoken Language Translation Research Laboratories
4th Author's Name Kiyohiro SHIKANO
4th Author's Affiliation Graduate School of Information Science, Nara Institute of Science and Technology
Date 2002/1/17
Paper # 2001-SP-120
Volume (vol) vol.101
Number (no) 603
Page pp.-
#Pages 8
Date of Issue