Presentation 2010-01-21
A Study on Conversational Speech Synthesis Based on Average Voice Model
Tomoki KORIYAMA, Takashi NOSE, Takao KOBAYASHI,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper describes a conversational speech synthesis technique that uses an average voice model and model adaptation based on the hidden semi-Markov model (HSMM). In conversational speech, acoustic features are affected by various factors such as speaker individuality, speaking style, and the speaker's intention, so it is not easy to generate natural-sounding speech from a small amount of speech data of a target speaker. To overcome this problem, the proposed technique uses an average voice model trained in advance on speech data from multiple speakers and adapts it to the target speaker with a speaker adaptation technique. This makes it possible to generate synthetic speech even when the available speech data of the target speaker is very limited. In this study, we evaluate the performance of the proposed technique by objective measures. We use two types of average voice models: one trained with read speech and the other with conversational speech. The experimental results show that the distortion of spectral and pitch features between synthetic and original speech samples decreases when using the proposed technique.
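The abstract does not say which objective measures were used for the spectral and pitch distortion. As a minimal sketch, assuming the two measures commonly reported for HMM-based synthesis (mel-cepstral distortion in dB and RMSE of log F0 in cents), the comparison between synthetic and original speech could look like the following. The function names, the frame layout, and the assumption that the two sequences are already time-aligned are illustrative and not taken from the paper.

```python
import math


def mel_cepstral_distortion(ref, syn):
    """Mean mel-cepstral distortion (dB) over time-aligned frames.

    Each frame is a list of mel-cepstral coefficients. Assumption: the
    0th (energy) coefficient has already been dropped by the caller.
    """
    if len(ref) != len(syn) or not ref:
        raise ValueError("sequences must be non-empty and time-aligned")
    const = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for r, s in zip(ref, syn):
        total += const * math.sqrt(sum((a - b) ** 2 for a, b in zip(r, s)))
    return total / len(ref)


def log_f0_rmse_cents(ref_f0, syn_f0):
    """RMSE of log F0 in cents over frames voiced in both sequences.

    F0 values are in Hz; assumption: 0 marks an unvoiced frame.
    """
    errs = [
        1200.0 * math.log2(s / r)
        for r, s in zip(ref_f0, syn_f0)
        if r > 0 and s > 0
    ]
    if not errs:
        raise ValueError("no frames are voiced in both sequences")
    return math.sqrt(sum(e * e for e in errs) / len(errs))
```

Under these assumptions, lower values for an adapted model than for a model trained only on the target speaker's limited data would correspond to the decrease in distortion that the abstract reports.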
Keyword(in Japanese) (See Japanese page)
Keyword(in English) conversational speech / spontaneous speech / HMM-based speech synthesis / average voice model / speaker adaptation / style adaptation
Paper # CQ2009-61,PRMU2009-160,SP2009-101,MVE2009-83
Date of Issue

Conference Information
Committee CQ
Conference Date 2010/1/14 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Communication Quality (CQ)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A Study on Conversational Speech Synthesis Based on Average Voice Model
Sub Title (in English)
Keyword(1) conversational speech
Keyword(2) spontaneous speech
Keyword(3) HMM-based speech synthesis
Keyword(4) average voice model
Keyword(5) speaker adaptation
Keyword(6) style adaptation
1st Author's Name Tomoki KORIYAMA
1st Author's Affiliation Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
2nd Author's Name Takashi NOSE
2nd Author's Affiliation Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
3rd Author's Name Takao KOBAYASHI
3rd Author's Affiliation Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Date 2010-01-21
Volume (vol) vol.109
Number (no) 373
Page pp.-
#Pages 6