An investigation of multi-speaker WaveNet vocoder

Tomoki Hayashi; Kazuhiro Kobayashi; Akira Tamamori; Kazuya Takeda; Tomoki Toda

Presentation	2018-01-21 An investigation of multi-speaker WaveNet vocoder Tomoki Hayashi, Kazuhiro Kobayashi, Akira Tamamori, Kazuya Takeda, Tomoki Toda
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	In this paper, we investigate a multi-speaker WaveNet vocoder. In our previous work, we demonstrated that our proposed speaker-dependent (SD) WaveNet vocoder, which is trained on a single speaker's speech data, can model temporal waveform structure such as phase information and can generate more natural-sounding synthetic voices than the conventional high-quality vocoder STRAIGHT. However, it is still difficult to generate synthetic voices of various speakers with the SD WaveNet vocoder because of its speaker-dependent property. Toward the development of a speaker-independent WaveNet vocoder, we update the auxiliary features, introduce a noise shaping technique, apply multi-speaker training techniques to the WaveNet vocoder, and investigate their effectiveness. Moreover, we investigate the effect of the amount of training data. The experimental results demonstrate that 1) the multi-speaker WaveNet vocoder is comparable to the SD WaveNet vocoder in generating known speakers' voices but slightly worse in generating unknown speakers' voices, 2) the multi-speaker WaveNet vocoder outperforms STRAIGHT in generating both known and unknown speakers' voices, and 3) the scores of the objective evaluation metrics improve in proportion to the amount of training data.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	speech synthesis / vocoder / WaveNet
Paper #	SP2017-81
Date of Issue	2018-01-13 (SP)

Conference Information
Committee	SP / ASJ-H
Conference Date	2018/1/20(2days)
Place (in Japanese)	(See Japanese page)
Place (in English)	The University of Tokyo
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair	Yoichi Yamashita(Ritsumeikan Univ.) / Tatsuya Hirahara(Toyama Prefectural Univ.)
Vice Chair	Hiroki Mori(Utsunomiya Univ.) / Seiji Nakagawa(Chiba Univ.)
Secretary	Hiroki Mori(Shizuoka Univ.) / Seiji Nakagawa(Meijo Univ.)
Assistant	Kei Hashimoto(Nagoya Inst. of Tech.) / Satoshi Kobashikawa(NTT)

Paper Information
Registration To	Technical Committee on Speech / Auditory Research Meeting
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	An investigation of multi-speaker WaveNet vocoder
Sub Title (in English)
Keyword(1)	speech synthesis
Keyword(2)	vocoder
Keyword(3)	WaveNet
1st Author's Name	Tomoki Hayashi
1st Author's Affiliation	Nagoya University(Nagoya Univ.)
2nd Author's Name	Kazuhiro Kobayashi
2nd Author's Affiliation	Nagoya University(Nagoya Univ.)
3rd Author's Name	Akira Tamamori
3rd Author's Affiliation	Nagoya University(Nagoya Univ.)
4th Author's Name	Kazuya Takeda
4th Author's Affiliation	Nagoya University(Nagoya Univ.)
5th Author's Name	Tomoki Toda
5th Author's Affiliation	Nagoya University(Nagoya Univ.)
Date	2018-01-21
Paper #	SP2017-81
Volume (vol)	vol.117
Number (no)	SP-393
Page	pp.81-86(SP)
#Pages	6
Date of Issue	2018-01-13 (SP)