Presentation | 2012-11-08 Effects of speaker adaptive training on arbitrary speaker conversion based on tensor representation Daisuke SAITO, Nobuaki MINEMATSU, Keikichi HIROSE |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In this paper, speaker adaptive training techniques are introduced to tensor-based arbitrary speaker conversion. In voice conversion (VC) studies, realizing conversion from/to an arbitrary speaker's voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC), which is based on an eigenvoice Gaussian mixture model (EV-GMM), was proposed. Although EVC can effectively construct the conversion model for arbitrary target speakers using only a few utterances, increasing the number of utterances used to construct the conversion model does not always improve the conversion performance. This is because the EV-GMM method has an inherent problem in its representation of GMM supervectors. We previously proposed a tensor-based speaker space as a solution to this problem and realized more flexible control of speaker characteristics. In this paper, aiming at a larger improvement in VC performance, speaker adaptive training and tensor-based speaker representation are integrated. The proposed method can construct a flexible and precise conversion model, and experimental results of one-to-many voice conversion demonstrate the effectiveness of the proposed approach. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Voice conversion / Gaussian mixture model / Tucker decomposition / speaker adaptive training |
Paper # | SP2012-72 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2012/11/1 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Effects of speaker adaptive training on arbitrary speaker conversion based on tensor representation |
Sub Title (in English) | |
Keyword(1) | Voice conversion |
Keyword(2) | Gaussian mixture model |
Keyword(3) | Tucker decomposition |
Keyword(4) | speaker adaptive training |
1st Author's Name | Daisuke SAITO |
1st Author's Affiliation | Graduate School of Information Science and Technology, The University of Tokyo |
2nd Author's Name | Nobuaki MINEMATSU |
2nd Author's Affiliation | Graduate School of Engineering, The University of Tokyo |
3rd Author's Name | Keikichi HIROSE |
3rd Author's Affiliation | Graduate School of Information Science and Technology, The University of Tokyo |
Date | 2012-11-08 |
Paper # | SP2012-72 |
Volume (vol) | vol.112 |
Number (no) | 281 |
Page | pp.- |
#Pages | 6 |
Date of Issue |