Presentation | 2010-01-21 A Study on Conversational Speech Synthesis Based on Average Voice Model Tomoki KORIYAMA, Takashi NOSE, Takao KOBAYASHI |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper describes a conversational speech synthesis technique that uses an average voice model and model adaptation based on the hidden semi-Markov model (HSMM). In conversational speech, acoustic features are affected by various factors such as speaker individuality, speaking style, and the speaker's intention, so it is not easy to generate natural-sounding speech from a small amount of speech data from a target speaker. To overcome this problem, the proposed technique uses an average voice model trained in advance on speech data from multiple speakers and adapts it to the target speaker with a speaker adaptation technique. This makes it possible to generate synthetic speech even when the available speech data of the target speaker is very limited. In this study, we evaluate the performance of the proposed technique using objective measures. We use two types of average voice models: one trained on read speech and the other on conversational speech. The experimental results show that the distortion of spectral and pitch features between synthetic and original speech samples decreases when the proposed technique is used. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | conversational speech / spontaneous speech / HMM-based speech synthesis / average voice model / speaker adaptation / style adaptation |
Paper # | CQ2009-61,PRMU2009-160,SP2009-101,MVE2009-83 |
Date of Issue |
Conference Information | |
Committee | CQ |
---|---|
Conference Date | 2010/1/14 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Communication Quality (CQ) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A Study on Conversational Speech Synthesis Based on Average Voice Model |
Sub Title (in English) | |
Keyword(1) | conversational speech |
Keyword(2) | spontaneous speech |
Keyword(3) | HMM-based speech synthesis |
Keyword(4) | average voice model |
Keyword(5) | speaker adaptation |
Keyword(6) | style adaptation |
1st Author's Name | Tomoki KORIYAMA |
1st Author's Affiliation | Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology |
2nd Author's Name | Takashi NOSE |
2nd Author's Affiliation | Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology |
3rd Author's Name | Takao KOBAYASHI |
3rd Author's Affiliation | Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology |
Date | 2010-01-21 |
Paper # | CQ2009-61,PRMU2009-160,SP2009-101,MVE2009-83 |
Volume (vol) | vol.109 |
Number (no) | 373 |
Page | pp.- |
#Pages | 6 |
Date of Issue |