Presentation 2018-01-20
A study on statistical speech synthesis based on GP-DNN hybrid model
Tomoki Koriyama, Takao Kobayashi
Abstract(in Japanese) (See Japanese page)
Abstract(in English) We propose a novel approach to Gaussian process regression (GPR)-based speech synthesis in this paper. Because conventional GPR-based speech synthesis relies on partitioning the data with a decision tree, the decision tree has been a bottleneck for the quality of the synthetic speech. In contrast, we propose a hybrid model of a Gaussian process (GP) and a deep neural network (DNN). In the hybrid model, the DNN extracts context-derived features, and the output of the DNN is used as the input of the Gaussian process. The parameters of the DNN and the GP are optimized using a minibatch-based stochastic gradient descent method. The subjective evaluation results show that the proposed technique outperforms not only conventional GPR-based speech synthesis with decision trees but also DNN-based speech synthesis.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Gaussian process regression / stochastic variational inference / neural network / statistical parametric speech synthesis
Paper # SP2017-67
Date of Issue 2018-01-13 (SP)

Conference Information
Committee SP / ASJ-H
Conference Date 2018/1/20(2days)
Place (in Japanese) (See Japanese page)
Place (in English) The University of Tokyo
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Yoichi Yamashita(Ritsumeikan Univ.) / Tatsuya Hirahara(Toyama Prefectural Univ.)
Vice Chair Hiroki Mori(Utsunomiya Univ.) / Seiji Nakagawa(Chiba Univ.)
Secretary Hiroki Mori(Shizuoka Univ.) / Seiji Nakagawa(Meijo Univ.)
Assistant Kei Hashimoto(Nagoya Inst. of Tech.) / Satoshi Kobashikawa(NTT)

Paper Information
Registration To Technical Committee on Speech / Auditory Research Meeting
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A study on statistical speech synthesis based on GP-DNN hybrid model
Sub Title (in English)
Keyword(1) Gaussian process regression
Keyword(2) stochastic variational inference
Keyword(3) neural network
Keyword(4) statistical parametric speech synthesis
1st Author's Name Tomoki Koriyama
1st Author's Affiliation Tokyo Institute of Technology(Tokyo Tech)
2nd Author's Name Takao Kobayashi
2nd Author's Affiliation Tokyo Institute of Technology(Tokyo Tech)
Date 2018-01-20
Paper # SP2017-67
Volume (vol) vol.117
Number (no) SP-393
Page pp.5-10(SP)
#Pages 6
Date of Issue 2018-01-13 (SP)