Presentation 2014-12-16
Prosody Correction Preserving Speaker Individuality in English-Read-By-Japanese Speech Synthesis Based on HMM
Yuji OSHIMA, Shinnosuke TAKAMICHI, Tomoki TODA, Graham NEUBIG, Sakriani SAKTI, Satoshi NAKAMURA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) To build an English acoustic model that captures the speaker individuality of each Japanese speaker well, a framework using English-Read-by-Japanese (ERJ) voices is effective because it enables direct modeling of speaker-dependent acoustic characteristics. However, the naturalness of English speech synthesized by such an ERJ acoustic model is significantly degraded, because it is directly affected by the prosodic differences and pronunciation errors that often arise from differences between the Japanese and English language systems. To synthesize more natural English speech while preserving the speaker individuality of each Japanese speaker, we propose a technique that corrects the prosody of ERJ voices based on that of a native English speaker. The duration and power of the native English speaker are used to develop the ERJ acoustic model for each Japanese speaker by means of model adaptation techniques in HMM-based speech synthesis. Experimental results show that the proposed method significantly improves the naturalness of ERJ synthetic speech while preserving its speaker individuality.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) English-Read-by-Japanese (ERJ) / HMM-based speech synthesis / prosody correction / speaker individuality / model adaptation
Paper # SP2014-112
Date of Issue

Conference Information
Committee SP
Conference Date 2014/12/8 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Prosody Correction Preserving Speaker Individuality in English-Read-By-Japanese Speech Synthesis Based on HMM
Sub Title (in English)
Keyword(1) English-Read-by-Japanese (ERJ)
Keyword(2) HMM-based speech synthesis
Keyword(3) prosody correction
Keyword(4) speaker individuality
Keyword(5) model adaptation
1st Author's Name Yuji OSHIMA
1st Author's Affiliation Graduate School of Information Science, Nara Institute of Science and Technology
2nd Author's Name Shinnosuke TAKAMICHI
2nd Author's Affiliation Graduate School of Information Science, Nara Institute of Science and Technology
3rd Author's Name Tomoki TODA
3rd Author's Affiliation Graduate School of Information Science, Nara Institute of Science and Technology
4th Author's Name Graham NEUBIG
4th Author's Affiliation Graduate School of Information Science, Nara Institute of Science and Technology
5th Author's Name Sakriani SAKTI
5th Author's Affiliation Graduate School of Information Science, Nara Institute of Science and Technology
6th Author's Name Satoshi NAKAMURA
6th Author's Affiliation Graduate School of Information Science, Nara Institute of Science and Technology
Date 2014-12-16
Paper # SP2014-112
Volume (vol) vol.114
Number (no) 365
Page pp.-
#Pages 6
Date of Issue