Presentation 2009-06-25
A mean F_0 speaker adaptation method for regression model-based F_0 contour generation
Hosana KAMIYAMA, Takahiro SHINOZAKI, Koji IWANO, Sadaoki FURUI
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper proposes a new speaker adaptation method for fundamental frequency (F_0) contour generation models based on Quantification Theory (Type I). In this method, models that produce natural F_0 contours for standard Japanese are trained on a large amount of speech data from many speakers, and contours that are both natural and speaker-specific are then generated by adapting the mean F_0 values using a small amount of speech data from a specific speaker. Objective evaluation results for models built with the proposed method confirm that around five sentences are enough for speaker adaptation. Subjective evaluation results confirm that the naturalness of speech synthesized using models adapted with 50 sentences is almost equivalent to that of speech synthesized using models trained on 450 sentences from a specific speaker. These results indicate that the proposed adaptation method can produce highly natural synthesized speech.
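The abstract does not give the model equations, so the following is only a rough illustration of the general idea, not the authors' implementation. It assumes that a Quantification Theory (Type I) model predicts a value as a bias term plus one score per categorical factor, and that mean F_0 adaptation amounts to replacing the bias with the mean log F_0 estimated from a few sentences of the target speaker. The factor names, the score values, and the helper functions `predict_f0` and `estimate_speaker_mean` are all hypothetical.

```python
from statistics import mean


def predict_f0(bias, scores, factors):
    """Type I style prediction: a bias plus one score per categorical factor."""
    return bias + sum(scores[name][category] for name, category in factors.items())


def estimate_speaker_mean(adaptation_log_f0):
    """Estimate the target speaker's mean log F_0 from a small adaptation set."""
    return mean(adaptation_log_f0)


# Hypothetical category scores, standing in for a model trained on many speakers.
scores = {
    "accent_type": {"0": -0.05, "1": 0.08},
    "position_in_phrase": {"initial": 0.10, "medial": 0.0, "final": -0.12},
}

# Hypothetical mean log F_0 of the multi-speaker training data.
average_speaker_mean = 5.0

# Hypothetical log F_0 observations from a few sentences of the target speaker.
target_mean = estimate_speaker_mean([5.28, 5.31, 5.25, 5.30, 5.26])

factors = {"accent_type": "1", "position_in_phrase": "initial"}

# Same category scores, different bias: only the mean is adapted.
unadapted = predict_f0(average_speaker_mean, scores, factors)
adapted = predict_f0(target_mean, scores, factors)
```

Under these assumptions the contour shape is shared across speakers and only its level moves, which is one way a handful of sentences could suffice for adaptation: a single mean is far easier to estimate than a full set of category scores.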
Keyword(in Japanese) (See Japanese page)
Keyword(in English) HMM-based Speech Synthesis / Quantification Theory (Type I) / F_0 Contour Generation / Prosody Control / Speaker Adaptation
Paper # SP2009-38
Date of Issue

Conference Information
Committee SP
Conference Date 2009/6/17 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A mean F_0 speaker adaptation method for regression model-based F_0 contour generation
Sub Title (in English)
Keyword(1) HMM-based Speech Synthesis
Keyword(2) Quantification Theory (Type I)
Keyword(3) F_0 Contour Generation
Keyword(4) Prosody Control
Keyword(5) Speaker Adaptation
1st Author's Name Hosana KAMIYAMA
1st Author's Affiliation Department of Computer Science, Tokyo Institute of Technology
2nd Author's Name Takahiro SHINOZAKI
2nd Author's Affiliation Department of Computer Science, Tokyo Institute of Technology
3rd Author's Name Koji IWANO
3rd Author's Affiliation Faculty of Environmental and Information Studies, Tokyo City University
4th Author's Name Sadaoki FURUI
4th Author's Affiliation Department of Computer Science, Tokyo Institute of Technology
Date 2009-06-25
Paper # SP2009-38
Volume (vol) vol.109
Number (no) 99
Page pp.-
#Pages 6