Presentation 2018-01-21
An investigation of multi-speaker WaveNet vocoder
Tomoki Hayashi, Kazuhiro Kobayashi, Akira Tamamori, Kazuya Takeda, Tomoki Toda
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In this paper, we investigate a multi-speaker WaveNet vocoder. In our previous work, we demonstrated that our proposed speaker-dependent (SD) WaveNet vocoder, which is trained with a single speaker's speech data, is capable of modeling temporal waveform structure, such as phase information, and makes it possible to generate more natural-sounding synthetic voices than the conventional high-quality vocoder, STRAIGHT. However, it is still difficult to generate synthetic voices of various speakers using the SD WaveNet vocoder due to its speaker-dependent property. Towards the development of a speaker-independent WaveNet vocoder, we update the auxiliary features, introduce a noise shaping technique, and apply multi-speaker training techniques to the WaveNet vocoder, and we investigate their effectiveness. Moreover, we investigate the effect of the amount of training data. The experimental results demonstrate that 1) the multi-speaker WaveNet vocoder is comparable to the SD WaveNet vocoder in generating known speakers' voices, but slightly worse in generating unknown speakers' voices, 2) the multi-speaker WaveNet vocoder outperforms STRAIGHT in generating both known and unknown speakers' voices, and 3) the scores of the objective evaluation metrics improve in proportion to the amount of training data.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) speech synthesis / vocoder / WaveNet
Paper # SP2017-81
Date of Issue 2018-01-13 (SP)

Conference Information
Committee SP / ASJ-H
Conference Date 2018/1/20(2days)
Place (in Japanese) (See Japanese page)
Place (in English) The University of Tokyo
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Yoichi Yamashita(Ritsumeikan Univ.) / Tatsuya Hirahara(Toyama Prefectural Univ.)
Vice Chair Hiroki Mori(Utsunomiya Univ.) / Seiji Nakagawa(Chiba Univ.)
Secretary Hiroki Mori(Shizuoka Univ.) / Seiji Nakagawa(Meijo Univ.)
Assistant Kei Hashimoto(Nagoya Inst. of Tech.) / Satoshi Kobashikawa(NTT)

Paper Information
Registration To Technical Committee on Speech / Auditory Research Meeting
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) An investigation of multi-speaker WaveNet vocoder
Sub Title (in English)
Keyword(1) speech synthesis
Keyword(2) vocoder
Keyword(3) WaveNet
1st Author's Name Tomoki Hayashi
1st Author's Affiliation Nagoya University(Nagoya Univ.)
2nd Author's Name Kazuhiro Kobayashi
2nd Author's Affiliation Nagoya University(Nagoya Univ.)
3rd Author's Name Akira Tamamori
3rd Author's Affiliation Nagoya University(Nagoya Univ.)
4th Author's Name Kazuya Takeda
4th Author's Affiliation Nagoya University(Nagoya Univ.)
5th Author's Name Tomoki Toda
5th Author's Affiliation Nagoya University(Nagoya Univ.)
Date 2018-01-21
Paper # SP2017-81
Volume (vol) vol.117
Number (no) SP-393
Page pp.81-86(SP)
#Pages 6
Date of Issue 2018-01-13 (SP)