A Study on Conversational Speech Synthesis Based on Average Voice Model (Theme Session, Cross-Modal)

Tomoki KORIYAMA; Takashi NOSE; Takao KOBAYASHI

Presentation	2010-01-21 A Study on Conversational Speech Synthesis Based on Average Voice Model Tomoki KORIYAMA, Takashi NOSE, Takao KOBAYASHI
Abstract(in English)	This paper describes a conversational speech synthesis technique that uses an average voice model and model adaptation based on the hidden semi-Markov model (HSMM). In conversational speech, acoustic features are affected by various factors such as speaker individuality, speaking style, and the speaker's intention, so it is not easy to generate natural-sounding speech from a small amount of a target speaker's speech data. To overcome this problem, the proposed technique trains an average voice model in advance on speech data from multiple speakers and then adapts that model to the target speaker using a speaker adaptation technique. This makes it possible to generate synthetic speech even when the available speech data of the target speaker is very limited. In this study, we evaluate the performance of the proposed technique by objective measures. We use two types of average voice models: one trained with read speech and the other with conversational speech. The experimental results show that the distortion of spectral and pitch features between synthetic and original speech samples decreases when using the proposed technique.
Keyword(in English)	conversational speech / spontaneous speech / HMM-based speech synthesis / average voice model / speaker adaptation / style adaptation
Paper #	CQ2009-61,PRMU2009-160,SP2009-101,MVE2009-83

Conference Information
Committee	CQ
Conference Date	2010/1/14 (1 day)

Paper Information
Registration To	Communication Quality (CQ)
Language	JPN
Title (in English)	A Study on Conversational Speech Synthesis Based on Average Voice Model
Keyword(1)	conversational speech
Keyword(2)	spontaneous speech
Keyword(3)	HMM-based speech synthesis
Keyword(4)	average voice model
Keyword(5)	speaker adaptation
Keyword(6)	style adaptation
1st Author's Name	Tomoki KORIYAMA
1st Author's Affiliation	Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
2nd Author's Name	Takashi NOSE
2nd Author's Affiliation	Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
3rd Author's Name	Takao KOBAYASHI
3rd Author's Affiliation	Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Date	2010-01-21
Paper #	CQ2009-61,PRMU2009-160,SP2009-101,MVE2009-83
Volume (vol)	vol.109
Number (no)	373
Page	pp.-
#Pages	6