Presentation | 2018-01-20 A study on statistical speech synthesis based on GP-DNN hybrid model Tomoki Koriyama, Takao Kobayashi |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | We propose a novel approach to Gaussian process regression (GPR)-based speech synthesis. Because conventional GPR-based speech synthesis relies on partitioning the data with a decision tree, the decision tree has been a bottleneck for the quality of the synthetic speech. To address this, we propose a hybrid model of a Gaussian process (GP) and a deep neural network (DNN). In the hybrid model, the DNN extracts context-derived features, and the output of the DNN is used as the input to the Gaussian process. The parameters of the DNN and the GP are optimized using a minibatch-based stochastic gradient descent method. Subjective evaluation results show that the proposed technique outperforms not only conventional GPR-based speech synthesis with decision trees but also DNN-based speech synthesis. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Gaussian process regression / stochastic variational inference / neural network / statistical parametric speech synthesis |
Paper # | SP2017-67 |
Date of Issue | 2018-01-13 (SP) |
Conference Information | |
Committee | SP / ASJ-H |
---|---|
Conference Date | 2018/1/20(2days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | The University of Tokyo |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Yoichi Yamashita(Ritsumeikan Univ.) / Tatsuya Hirahara(Toyama Prefectural Univ.) |
Vice Chair | Hiroki Mori(Utsunomiya Univ.) / Seiji Nakagawa(Chiba Univ.) |
Secretary | Hiroki Mori(Shizuoka Univ.) / Seiji Nakagawa(Meijo Univ.) |
Assistant | Kei Hashimoto(Nagoya Inst. of Tech.) / Satoshi Kobashikawa(NTT) |
Paper Information | |
Registration To | Technical Committee on Speech / Auditory Research Meeting |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A study on statistical speech synthesis based on GP-DNN hybrid model |
Sub Title (in English) | |
Keyword(1) | Gaussian process regression |
Keyword(2) | stochastic variational inference |
Keyword(3) | neural network |
Keyword(4) | statistical parametric speech synthesis |
1st Author's Name | Tomoki Koriyama |
1st Author's Affiliation | Tokyo Institute of Technology(Tokyo Tech) |
2nd Author's Name | Takao Kobayashi |
2nd Author's Affiliation | Tokyo Institute of Technology(Tokyo Tech) |
Date | 2018-01-20 |
Paper # | SP2017-67 |
Volume (vol) | vol.117 |
Number (no) | SP-393 |
Page | pp.5-10(SP) |
#Pages | 6 |
Date of Issue | 2018-01-13 (SP) |