Presentation 2009-01-30
Low-delay voice conversion algorithm based on maximum likelihood estimation of spectral parameter trajectory
Takashi MURAMATSU, Yamato OHTANI, Tomoki TODA, Hiroshi SARUWATARI, Kiyohiro SHIKANO
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper aims to achieve high-quality, real-time voice conversion (VC) by addressing both the spectral conversion method and its post-processing. Two typical spectral conversion processes have been proposed for VC: 1) frame-based conversion, which converts spectral parameters frame by frame, and 2) trajectory-based conversion, which converts all spectral parameters over an utterance simultaneously. The former is capable of real-time conversion but sometimes causes inappropriate spectral movements. The latter yields converted spectral parameters with proper dynamic characteristics but is not capable of real-time conversion. To realize a real-time conversion process that takes spectral dynamic characteristics into account, we propose a time-recursive conversion algorithm based on maximum likelihood estimation of the spectral parameter trajectory. In addition, the converted trajectory is often excessively smoothed by the statistical processing. Although a maximum likelihood feature conversion method that considers global variance (GV) has been proposed, it is complicated to apply to low-delay conversion. We therefore propose a post-filter that takes the GV into account. Experimental results show that the proposed methods are effective.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) speech synthesis / voice conversion / Gaussian mixture model / maximum likelihood estimation / low-delay processing
Paper # SP2008-141
Date of Issue

Conference Information
Committee SP
Conference Date 2009/1/22 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Low-delay voice conversion algorithm based on maximum likelihood estimation of spectral parameter trajectory
Sub Title (in English)
Keyword(1) speech synthesis
Keyword(2) voice conversion
Keyword(3) Gaussian mixture model
Keyword(4) maximum likelihood estimation
Keyword(5) low-delay processing
1st Author's Name Takashi MURAMATSU
1st Author's Affiliation Nara Institute of Science and Technology
2nd Author's Name Yamato OHTANI
2nd Author's Affiliation Nara Institute of Science and Technology
3rd Author's Name Tomoki TODA
3rd Author's Affiliation Nara Institute of Science and Technology
4th Author's Name Hiroshi SARUWATARI
4th Author's Affiliation Nara Institute of Science and Technology
5th Author's Name Kiyohiro SHIKANO
5th Author's Affiliation Nara Institute of Science and Technology
Date 2009-01-30
Paper # SP2008-141
Volume (vol) vol.108
Number (no) 422
Page pp.-
#Pages 6
Date of Issue