Presentation | 2009-01-30 Low-delay voice conversion algorithm based on maximum likelihood estimation of spectral parameter trajectory Takashi MURAMATSU, Yamato OHTANI, Tomoki TODA, Hiroshi SARUWATARI, Kiyohiro SHIKANO |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In this paper, we aim to achieve high-quality, real-time voice conversion (VC) by addressing both the spectral conversion method and the post-processing of the converted spectra. Two typical spectral conversion processes have been proposed: 1) frame-based conversion, which converts spectral parameters frame by frame, and 2) trajectory-based conversion, which converts all spectral parameters over an utterance simultaneously. The former is capable of real-time conversion but sometimes causes inappropriate spectral movements. The latter yields converted spectral parameters with proper dynamic characteristics but is not capable of real-time conversion. To realize a real-time conversion process that considers spectral dynamic characteristics, we propose a time-recursive conversion algorithm based on maximum likelihood estimation of the spectral parameter trajectory. In addition, the converted trajectory is often excessively smoothed due to the statistical processing. Although a maximum likelihood feature conversion method that considers global variance (GV) has been proposed, it is complicated to apply to low-delay conversion. We therefore propose a post-filter technique that considers GV. Experimental results show that the proposed methods are effective. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | speech synthesis / voice conversion / Gaussian mixture model / maximum likelihood estimation / low-delay processing |
Paper # | SP2008-141 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2009/1/22 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Low-delay voice conversion algorithm based on maximum likelihood estimation of spectral parameter trajectory |
Sub Title (in English) | |
Keyword(1) | speech synthesis |
Keyword(2) | voice conversion |
Keyword(3) | Gaussian mixture model |
Keyword(4) | maximum likelihood estimation |
Keyword(5) | low-delay processing |
1st Author's Name | Takashi MURAMATSU |
1st Author's Affiliation | Nara Institute of Science and Technology |
2nd Author's Name | Yamato OHTANI |
2nd Author's Affiliation | Nara Institute of Science and Technology |
3rd Author's Name | Tomoki TODA |
3rd Author's Affiliation | Nara Institute of Science and Technology |
4th Author's Name | Hiroshi SARUWATARI |
4th Author's Affiliation | Nara Institute of Science and Technology |
5th Author's Name | Kiyohiro SHIKANO |
5th Author's Affiliation | Nara Institute of Science and Technology |
Date | 2009-01-30 |
Paper # | SP2008-141 |
Volume (vol) | vol.108 |
Number (no) | 422 |
Page | pp.- |
#Pages | 6 |
Date of Issue |