Presentation | 2019-12-06 A comparison of neural vocoders in singing voice synthesis Sota Wada, Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In this study, we compare five types of vocoders based on neural networks (neural vocoders) for singing voice synthesis. In recent years, the WaveNet vocoder has been proposed as a neural vocoder. The WaveNet vocoder can model speech waveforms with high accuracy and generate natural-sounding speech. However, because of its autoregressive structure, it cannot synthesize speech in real time. Two approaches have been proposed to address this problem. The first is to simplify the structure of the autoregressive model, which makes sampling from the model more efficient and allows faster-than-real-time synthesis. The second is to synthesize multiple samples simultaneously by using flow-based generative models. The performance of these methods has so far been investigated only on normal speech utterances, not on singing voices. Therefore, in this paper, we compare the performance of five types of neural vocoders for singing voice synthesis. The results of subjective and objective evaluation experiments show that WaveRNN is an appropriate neural vocoder when naturalness is emphasized, and that WaveNet is appropriate when reproducibility of pitch and vibrato is emphasized. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | DNN / Singing voice synthesis / Neural vocoder / WaveNet |
Paper # | SP2019-42 |
Date of Issue | 2019-11-29 (SP) |
Conference Information | |
Committee | NLC / IPSJ-NL / SP / IPSJ-SLP |
---|---|
Conference Date | 2019/12/4 (3 days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | NHK Science & Technology Research Labs. |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | The 6th Natural Language Processing Symposium & The 21st Spoken Language Symposium |
Chair | Takeshi Sakaki(Hottolink) / / Hisashi Kawai(NICT) |
Vice Chair | Mitsuo Yoshida(Toyohashi Univ. of Tech.) / Kazutaka Shimada(Kyushu Inst. of Tech.) / / Akinobu Ri(Nagoya Inst. of Tech.) |
Secretary | Mitsuo Yoshida(Ryukoku Univ.) / Kazutaka Shimada(NTT) / / Akinobu Ri(Kyoto Univ.) / (Waseda Univ.) |
Assistant | Takeshi Kobayakawa(NHK) / Hiroki Sakaji(Univ. of Tokyo) / / Tomoki Koriyama(Univ. of Tokyo) / Yusuke Ijima(NTT) |
Paper Information | |
Registration To | Technical Committee on Natural Language Understanding and Models of Communication / Special Interest Group on Natural Language / Technical Committee on Speech / Special Interest Group on Spoken Language Processing |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A comparison of neural vocoders in singing voice synthesis |
Sub Title (in English) | |
Keyword(1) | DNN |
Keyword(2) | Singing voice synthesis |
Keyword(3) | Neural vocoder |
Keyword(4) | WaveNet |
1st Author's Name | Sota Wada |
1st Author's Affiliation | Nagoya Institute of Technology(Nagoya Inst. of Tech.) |
2nd Author's Name | Yukiya Hono |
2nd Author's Affiliation | Nagoya Institute of Technology(Nagoya Inst. of Tech.) |
3rd Author's Name | Shinji Takaki |
3rd Author's Affiliation | Nagoya Institute of Technology(Nagoya Inst. of Tech.) |
4th Author's Name | Kei Hashimoto |
4th Author's Affiliation | Nagoya Institute of Technology(Nagoya Inst. of Tech.) |
5th Author's Name | Keiichiro Oura |
5th Author's Affiliation | Nagoya Institute of Technology(Nagoya Inst. of Tech.) |
6th Author's Name | Yoshihiko Nankaku |
6th Author's Affiliation | Nagoya Institute of Technology(Nagoya Inst. of Tech.) |
7th Author's Name | Keiichi Tokuda |
7th Author's Affiliation | Nagoya Institute of Technology(Nagoya Inst. of Tech.) |
Date | 2019-12-06 |
Paper # | SP2019-42 |
Volume (vol) | vol.119 |
Number (no) | SP-321 |
Page | pp.85-90(SP) |
#Pages | 6 |
Date of Issue | 2019-11-29 (SP) |