Presentation 2016-03-28
[Poster Presentation] An evaluation of F0 transformation for statistical singing voice conversion based on spectral differential filtering
Kazuhiro Kobayashi, Tomoki Toda, Satoshi Nakamura
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In this report, we propose a technique for cross-gender statistical singing voice conversion (SVC) with direct waveform modification based on spectrum differential (DIFFSVC). SVC makes it possible to convert the voice timbre of a source singer into that of a target singer based on a statistical conversion function of acoustic features between these two singers. A traditional SVC framework usually degrades the speech quality of the converted singing voice compared to that of a natural singing voice, because waveform generation with a vocoder causes various errors. To address this issue, the DIFFSVC technique has been proposed as a high-quality SVC framework for within-gender conversion that directly uses an excitation signal of the input natural singing voice. To make it possible to also apply this SVC framework to cross-gender conversion, in this report we apply F0 transformation of the excitation signal based on direct waveform modification to DIFFSVC. The experimental results demonstrate that the proposed cross-gender DIFFSVC framework significantly improves speech quality while preserving the conversion accuracy of singer identity compared to the conventional SVC.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) statistical singing voice conversion / cross-gender conversion / direct waveform modification / spectral differential / F0 transformation
Paper # EA2015-84,SIP2015-133,SP2015-112
Date of Issue 2016-03-21 (EA, SIP, SP)

Conference Information
Committee EA / SP / SIP
Conference Date 2016/3/28(2days)
Place (in Japanese) (See Japanese page)
Place (in English) Beppu International Convention Center B-ConPlaza
Topics (in Japanese) (See Japanese page)
Topics (in English) Engineering/Electro Acoustics, Speech, Signal Processing, and Related Topics
Chair Yoichi Haneda(Univ. of Electro-Comm.) / Kazunori Mano(Shibaura Inst. of Tech.) / Osamu Houshuyama(NEC)
Vice Chair Yukio Iwaya(Tohoku Gakuin Univ.) / Mitsunori Mizumachi(Kyushu Inst. of Tech.) / Norihide Kitaoka(Tokushima Univ.) / Makoto Nakashizuka(Chiba Inst. of Tech.) / Masahiro Okuda(Univ. of Kitakyushu)
Secretary Yukio Iwaya(NTT) / Mitsunori Mizumachi(KDDI R&D Labs.) / Norihide Kitaoka(Tokyo City Univ.) / Makoto Nakashizuka(Kobe Univ.) / Masahiro Okuda(NEC)
Assistant Shoichi Koyama(Univ. of Tokyo) / Takashi Nose(Tohoku Univ.) / Taichi Asami(NTT) / Takamichi Miyata(Chiba Inst. of Tech.)

Paper Information
Registration To Technical Committee on Engineering Acoustics / Technical Committee on Speech / Technical Committee on Signal Processing
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) [Poster Presentation] An evaluation of F0 transformation for statistical singing voice conversion based on spectral differential filtering
Sub Title (in English)
Keyword(1) statistical singing voice conversion
Keyword(2) cross-gender conversion
Keyword(3) direct waveform modification
Keyword(4) spectral differential
Keyword(5) F0 transformation
1st Author's Name Kazuhiro Kobayashi
1st Author's Affiliation Nara Institute of Science and Technology(NAIST)
2nd Author's Name Tomoki Toda
2nd Author's Affiliation Nagoya University/Nara Institute of Science and Technology(Nagoya Univ./NAIST)
3rd Author's Name Satoshi Nakamura
3rd Author's Affiliation Nara Institute of Science and Technology(NAIST)
Date 2016-03-28
Paper # EA2015-84,SIP2015-133,SP2015-112
Volume (vol) vol.115
Number (no) EA-521,SIP-522,SP-523
Page pp.105-110(EA), pp.105-110(SIP), pp.105-110(SP)
#Pages 6
Date of Issue 2016-03-21 (EA, SIP, SP)