Phone Duration Modeling Based on Ensemble Learning (Synthesis, Generation, Prosody, General)

Presentation	2005/8/19 Phone Duration Modeling Based on Ensemble Learning Junichi YAMAGISHI, Hisashi KAWAI, Toshio HIRAI, Takao KOBAYASHI
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	Phone duration, which controls the rhythm and tempo of synthetic speech, is one of the important acoustic features for text-to-speech synthesis. Controlling phone duration can be viewed as the problem of estimating a prediction function whose explanatory variables are phonetic features, prosodic features, and linguistic information. Methods based on multiple linear regression or regression trees have been applied to duration prediction. In this study, to improve the prediction accuracy of these methods, we use "ensemble learning", which combines several prediction models. "Gradient boosting" is examined as a way to efficiently improve the prediction accuracy of a regression tree. Gradient boosting is a recursive form of ensemble learning that uses the residual error of the prediction models, and it can improve accuracy with a small number of parameters. We apply the algorithm to duration prediction for Japanese and Chinese and discuss its effectiveness.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Phone duration / Ensemble learning / Regression tree / Boosting / Bagging
Paper #	SP2005-53
Date of Issue
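
The abstract describes gradient boosting as recursive ensemble learning on the residual error of regression trees. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes squared-error loss, depth-1 trees (stumps) on a single numeric feature, and a shrinkage factor. The function names, the parameters `n_rounds` and `learning_rate`, and the toy data are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree on one numeric feature by least squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # every split that leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lm, rm = mean(left), mean(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    if best is None:  # all feature values identical: fall back to a constant
        constant = mean(residuals)
        return lambda x: constant
    _, threshold, lm, rm = best
    return lambda x: lm if x <= threshold else rm


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boost stumps on residuals (the negative gradient of squared-error loss)."""
    base = mean(ys)  # initial model: the mean duration
    stumps = []
    preds = [base] * len(ys)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)  # each new tree models what is still unexplained
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict


def mse(model, xs, ys):
    """Mean squared error of a model on the given data."""
    return mean((y - model(x)) ** 2 for x, y in zip(xs, ys))


# Made-up toy data (NOT from the paper): a hypothetical numeric context
# feature and phone durations in milliseconds.
toy_x = [1, 2, 3, 4, 5, 6, 7, 8]
toy_ms = [62, 58, 70, 75, 90, 88, 110, 120]

model = gradient_boost(toy_x, toy_ms)
print(f"training MSE after boosting: {mse(model, toy_x, toy_ms):.2f}")
```

In practice, duration prediction uses many categorical and numeric features, so a full regression-tree learner would replace the single-feature stump used here.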

Paper Information
Registration To	Speech (SP)
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Phone Duration Modeling Based on Ensemble Learning
Sub Title (in English)
Keyword(1)	Phone duration
Keyword(2)	Ensemble learning
Keyword(3)	Regression tree
Keyword(4)	Boosting
Keyword(5)	Bagging
1st Author's Name	Junichi YAMAGISHI
1st Author's Affiliation	Spoken Language Communication Research Laboratories, Advanced Telecommunications Research Institute International:Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
2nd Author's Name	Hisashi KAWAI
2nd Author's Affiliation	Spoken Language Communication Research Laboratories, Advanced Telecommunications Research Institute International:KDDI R&D Laboratories
3rd Author's Name	Toshio HIRAI
3rd Author's Affiliation	Spoken Language Communication Research Laboratories, Advanced Telecommunications Research Institute International
4th Author's Name	Takao KOBAYASHI
4th Author's Affiliation	Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Date	2005/8/19
Paper #	SP2005-53
Volume (vol)	vol.105
Number (no)	253
Page	pp.-
#Pages	6