Presentation | 2008-12-10 Simultaneous Transformation of Duration and Spectrum Using Statistical Models Including Time-Sequence Matching Kaori YUTANI, Yoshihiko NANKAKU, Tomoki TODA, Keiichi TOKUDA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper describes a technique for simultaneously converting duration and spectrum, based on a statistical model including time-sequence matching. The conventional GMM-based approach cannot take speaking rate into account during spectral conversion because it assumes one-to-one frame matching between source and target features. However, speaker characteristics may also appear in speaking rate. To perform duration conversion, we attach duration models to a statistical model including time-sequence matching (DPGMM). Since DPGMM can directly represent two sequences of different lengths, the conversion of spectrum and duration can be performed within an integrated framework. In the proposed technique, each mixture component of DPGMM has its own duration transformation function, so durations are converted nonlinearly and depending on spectral information. In a subjective DMOS test, the proposed method was rated superior to the conventional method. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Voice conversion / Duration conversion / GMM |
Paper # | NLC2008-37,SP2008-92 |
Date of Issue |
Conference Information | |
---|---|
Committee | NLC |
Conference Date | 2008/12/2 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
---|---|
Registration To | Natural Language Understanding and Models of Communication (NLC) |
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Simultaneous Transformation of Duration and Spectrum Using Statistical Models Including Time-Sequence Matching |
Sub Title (in English) | |
Keyword(1) | Voice conversion |
Keyword(2) | Duration conversion |
Keyword(3) | GMM |
1st Author's Name | Kaori YUTANI |
1st Author's Affiliation | Department of Computer Science and Engineering, Nagoya Institute of Technology |
2nd Author's Name | Yoshihiko NANKAKU |
2nd Author's Affiliation | Department of Computer Science and Engineering, Nagoya Institute of Technology |
3rd Author's Name | Tomoki TODA |
3rd Author's Affiliation | Graduate School of Information Science, Nara Institute of Science and Technology |
4th Author's Name | Keiichi TOKUDA |
4th Author's Affiliation | Department of Computer Science and Engineering, Nagoya Institute of Technology |
Date | 2008-12-10 |
Paper # | NLC2008-37,SP2008-92 |
Volume (vol) | vol.108 |
Number (no) | 337 |
Page | pp.- |
#Pages | 6 |