A study on statistical speech synthesis based on GP-DNN hybrid model

Presentation	2018-01-20 A study on statistical speech synthesis based on GP-DNN hybrid model Tomoki Koriyama, Takao Kobayashi
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	We propose a novel approach to Gaussian process regression (GPR)-based speech synthesis in this paper. Because conventional GPR-based speech synthesis relied on partitioning the data with a decision tree, the decision tree was a bottleneck for the quality of the synthetic speech. In contrast, we propose a hybrid model of a Gaussian process (GP) and a deep neural network (DNN). In the hybrid model, the DNN extracts context-derived features, and the output of the DNN is used as the input to the Gaussian process. The parameters of the DNN and the GP are optimized using a minibatch-based stochastic gradient descent method. The subjective evaluation results show that the proposed technique outperforms not only conventional GPR-based speech synthesis with decision trees but also DNN-based speech synthesis.
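The data flow described in the abstract (context features pass through a DNN, and the DNN output becomes the GP input) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a tiny tanh network with random, untrained weights, an RBF kernel, and exact GP regression in place of the stochastic variational inference and minibatch training named in the keywords. All names, dimensions, and hyperparameters below are hypothetical.

```python
import numpy as np


def mlp_features(x, weights):
    """Feed-forward network with tanh layers; returns the final layer output."""
    h = x
    for W, b in weights:
        h = np.tanh(h @ W + b)
    return h


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)


def gp_predict(z_train, y_train, z_test, noise=1e-2):
    """Exact GP posterior mean and variance (assumption: the paper uses SVI instead)."""
    K = rbf_kernel(z_train, z_train) + noise * np.eye(len(z_train))
    K_s = rbf_kernel(z_test, z_train)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = rbf_kernel(z_test, z_test).diagonal() - np.sum(v**2, axis=0)
    return mean, var


rng = np.random.default_rng(0)
context_dim, hidden_dim, feature_dim = 20, 16, 4  # hypothetical sizes

# Random, untrained weights: in the hybrid model these would be optimized
# jointly with the GP parameters by minibatch stochastic gradient descent.
weights = [
    (rng.normal(scale=0.3, size=(context_dim, hidden_dim)), np.zeros(hidden_dim)),
    (rng.normal(scale=0.3, size=(hidden_dim, feature_dim)), np.zeros(feature_dim)),
]

# Synthetic stand-ins for context vectors and one acoustic feature dimension.
x_train = rng.normal(size=(50, context_dim))
y_train = rng.normal(size=50)
x_test = rng.normal(size=(5, context_dim))

# The DNN output is used as the GP input.
z_train = mlp_features(x_train, weights)
z_test = mlp_features(x_test, weights)
mean, var = gp_predict(z_train, y_train, z_test)
```

In this sketch the GP operates entirely on the network's output space, so no decision-tree partition of the data is needed. A faithful version would replace the exact solve with a sparse variational approximation so that the objective decomposes over minibatches.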
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Gaussian process regression / stochastic variational inference / neural network / statistical parametric speech synthesis
Paper #	SP2017-67
Date of Issue	2018-01-13 (SP)

Conference Information
Committee	SP / ASJ-H
Conference Date	2018/1/20(2days)
Place (in Japanese)	(See Japanese page)
Place (in English)	The University of Tokyo
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair	Yoichi Yamashita(Ritsumeikan Univ.) / Tatsuya Hirahara(Toyama Prefectural Univ.)
Vice Chair	Hiroki Mori(Utsunomiya Univ.) / Seiji Nakagawa(Chiba Univ.)
Secretary	Hiroki Mori(Shizuoka Univ.) / Seiji Nakagawa(Meijo Univ.)
Assistant	Kei Hashimoto(Nagoya Inst. of Tech.) / Satoshi Kobashikawa(NTT)

Paper Information
Registration To	Technical Committee on Speech / Auditory Research Meeting
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	A study on statistical speech synthesis based on GP-DNN hybrid model
Sub Title (in English)
Keyword(1)	Gaussian process regression
Keyword(2)	stochastic variational inference
Keyword(3)	neural network
Keyword(4)	statistical parametric speech synthesis
1st Author's Name	Tomoki Koriyama
1st Author's Affiliation	Tokyo Institute of Technology(Tokyo Tech)
2nd Author's Name	Takao Kobayashi
2nd Author's Affiliation	Tokyo Institute of Technology(Tokyo Tech)
Date	2018-01-20
Paper #	SP2017-67
Volume (vol)	vol.117
Number (no)	SP-393
Page	pp.5-10(SP)
#Pages	6
Date of Issue	2018-01-13 (SP)