Presentation 2008-10-23
An MRHSMM-based voice quality control technique for synthetic speech using speaker adaptation from average voice model
Makoto TACHIBANA, Akifumi KOUNO, Takashi NOSE, Takao KOBAYASHI
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper describes a technique for controlling the voice quality of synthetic speech using a multiple-regression hidden semi-Markov model (MRHSMM). To achieve voice quality control with a small amount of training data, we incorporate speaker adaptation from an average voice model into MRHSMM-based voice quality control. In the proposed technique, we first adapt the average voice model to each training speaker using a small amount of adaptation data. Then, using the resulting speaker-adapted HSMMs and a low-dimensional voice quality control vector for each training speaker, we estimate the regression matrices of the MRHSMM by the least squares method and maximum likelihood estimation. We evaluate voice quality control of synthetic speech using data from 20 speakers, with 50 sentences per speaker. Subjective evaluation results show that the proposed technique can control several voice qualities of synthetic speech. Furthermore, we propose a model interpolation technique for MRHSMMs and report its evaluation results.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) HMM-based speech synthesis / voice quality control / multiple-regression HSMM / average voice model / speaker adaptation
Paper # SP2008-63
Date of Issue

Conference Information
Committee SP
Conference Date 2008/10/16 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) An MRHSMM-based voice quality control technique for synthetic speech using speaker adaptation from average voice model
Sub Title (in English)
Keyword(1) HMM-based speech synthesis
Keyword(2) voice quality control
Keyword(3) multiple-regression HSMM
Keyword(4) average voice model
Keyword(5) speaker adaptation
1st Author's Name Makoto TACHIBANA
1st Author's Affiliation Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
2nd Author's Name Akifumi KOUNO
2nd Author's Affiliation Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
3rd Author's Name Takashi NOSE
3rd Author's Affiliation Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
4th Author's Name Takao KOBAYASHI
4th Author's Affiliation Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Date 2008-10-23
Paper # SP2008-63
Volume (vol) vol.108
Number (no) 265
Page pp.-
#Pages 6