Presentation | 2018-01-21 An investigation of multi-speaker WaveNet vocoder Tomoki Hayashi, Kazuhiro Kobayashi, Akira Tamamori, Kazuya Takeda, Tomoki Toda |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In this paper, we investigate a multi-speaker WaveNet vocoder. In our previous work, we demonstrated that our proposed speaker-dependent (SD) WaveNet vocoder, which is trained on a single speaker's speech data, is capable of modeling temporal waveform structure, such as phase information, and can generate more natural-sounding synthetic voices than the conventional high-quality vocoder STRAIGHT. However, it is still difficult to generate synthetic voices of various speakers with the SD WaveNet vocoder because of its speaker-dependent property. Towards the development of a speaker-independent WaveNet vocoder, we update the auxiliary features, introduce a noise shaping technique, and apply multi-speaker training techniques to the WaveNet vocoder, and we investigate their effectiveness. Moreover, we investigate the effect of the amount of training data. The experimental results demonstrate that 1) the multi-speaker WaveNet vocoder is comparable to the SD WaveNet vocoder in generating known speakers' voices, but slightly worse in generating unknown speakers' voices, 2) the multi-speaker WaveNet vocoder outperforms STRAIGHT in generating both known and unknown speakers' voices, and 3) the scores of the objective evaluation metrics improve in proportion to the amount of training data. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | speech synthesis / vocoder / WaveNet |
Paper # | SP2017-81 |
Date of Issue | 2018-01-13 (SP) |
Conference Information | |
Committee | SP / ASJ-H |
---|---|
Conference Date | 2018/1/20 (2 days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | The University of Tokyo |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Yoichi Yamashita(Ritsumeikan Univ.) / Tatsuya Hirahara(Toyama Prefectural Univ.) |
Vice Chair | Hiroki Mori(Utsunomiya Univ.) / Seiji Nakagawa(Chiba Univ.) |
Secretary | Hiroki Mori(Shizuoka Univ.) / Seiji Nakagawa(Meijo Univ.) |
Assistant | Kei Hashimoto(Nagoya Inst. of Tech.) / Satoshi Kobashikawa(NTT) |
Paper Information | |
Registration To | Technical Committee on Speech / Auditory Research Meeting |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | An investigation of multi-speaker WaveNet vocoder |
Sub Title (in English) | |
Keyword(1) | speech synthesis |
Keyword(2) | vocoder |
Keyword(3) | WaveNet |
1st Author's Name | Tomoki Hayashi |
1st Author's Affiliation | Nagoya University(Nagoya Univ.) |
2nd Author's Name | Kazuhiro Kobayashi |
2nd Author's Affiliation | Nagoya University(Nagoya Univ.) |
3rd Author's Name | Akira Tamamori |
3rd Author's Affiliation | Nagoya University(Nagoya Univ.) |
4th Author's Name | Kazuya Takeda |
4th Author's Affiliation | Nagoya University(Nagoya Univ.) |
5th Author's Name | Tomoki Toda |
5th Author's Affiliation | Nagoya University(Nagoya Univ.) |
Date | 2018-01-21 |
Paper # | SP2017-81 |
Volume (vol) | vol.117 |
Number (no) | SP-393 |
Page | pp.81-86(SP) |
#Pages | 6 |
Date of Issue | 2018-01-13 (SP) |