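Effects of speaker adaptive training on arbitrary speaker conversion based on tensor representation (General Session, Welfare and Speech Processing, General)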

齊藤 大輔; 峯松 信明; 広瀬 啓吉

Presentation	2012-11-08 Effects of speaker adaptive training on arbitrary speaker conversion based on tensor representation Daisuke SAITO, Nobuaki MINEMATSU, Keikichi HIROSE
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	In this paper, speaker adaptive training techniques are introduced into tensor-based arbitrary speaker conversion. In voice conversion (VC) studies, realizing conversion from or to an arbitrary speaker's voice is an important objective. For this purpose, eigenvoice conversion (EVC), which is based on an eigenvoice Gaussian mixture model (EV-GMM), was proposed. Although EVC can effectively construct a conversion model for an arbitrary target speaker using only a few utterances, increasing the number of utterances used to construct the model does not always improve conversion performance. This is because the EV-GMM method has an inherent problem in its representation of GMM supervectors. We previously proposed a tensor-based speaker space as a solution to this problem and realized more flexible control of speaker characteristics. In this paper, to further improve VC performance, speaker adaptive training and the tensor-based speaker representation are integrated. The proposed method can construct a flexible and precise conversion model, and experimental results on one-to-many voice conversion demonstrate the effectiveness of the proposed approach.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Voice conversion / Gaussian mixture model / Tucker decomposition / speaker adaptive training
Paper #	SP2012-72
Date of Issue

Conference Information
Committee	SP
Conference Date	2012/11/1 (1 day)
Place (in Japanese)	(See Japanese page)
Place (in English)
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To	Speech (SP)
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Effects of speaker adaptive training on arbitrary speaker conversion based on tensor representation
Sub Title (in English)
Keyword(1)	Voice conversion
Keyword(2)	Gaussian mixture model
Keyword(3)	Tucker decomposition
Keyword(4)	speaker adaptive training
1st Author's Name	Daisuke SAITO
1st Author's Affiliation	Graduate School of Information Science and Technology, The University of Tokyo
2nd Author's Name	Nobuaki MINEMATSU
2nd Author's Affiliation	Graduate School of Engineering, The University of Tokyo
3rd Author's Name	Keikichi HIROSE
3rd Author's Affiliation	Graduate School of Information Science and Technology, The University of Tokyo
Date	2012-11-08
Paper #	SP2012-72
Volume (vol)	vol.112
Number (no)	281
Page	pp.-
#Pages	6
Date of Issue