Presentation | 2014-12-16 Prosody Correction Preserving Speaker Individuality in English-Read-By-Japanese Speech Synthesis Based on HMM Yuji OSHIMA, Shinnosuke TAKAMICHI, Tomoki TODA, Graham NEUBIG, Sakriani SAKTI, Satoshi NAKAMURA |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | To build an English acoustic model that captures the speaker individuality of each Japanese speaker well, a framework using English-Read-by-Japanese (ERJ) voices is effective because it enables direct modeling of speaker-dependent acoustic characteristics. However, the naturalness of English speech synthesized by such an ERJ acoustic model is significantly degraded, as it is directly affected by prosodic differences and pronunciation errors often caused by differences between the Japanese and English language systems. To synthesize more natural English speech while preserving the speaker individuality of each Japanese speaker, we propose a technique to correct the prosody of ERJ voices based on that of a native English speaker. The duration and power of the native English speaker are effectively used to develop the ERJ acoustic model for each Japanese speaker by means of model adaptation techniques in HMM-based speech synthesis. The experimental results show that the proposed method significantly improves the naturalness of ERJ synthetic speech while preserving its speaker individuality. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | English-Read-by-Japanese (ERJ) / HMM-based speech synthesis / prosody correction / speaker individuality / model adaptation |
Paper # | SP2014-112 |
Date of Issue |
Conference Information | |
Committee | SP |
---|---|
Conference Date | 2014/12/8 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Speech (SP) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Prosody Correction Preserving Speaker Individuality in English-Read-By-Japanese Speech Synthesis Based on HMM |
Sub Title (in English) | |
Keyword(1) | English-Read-by-Japanese (ERJ) |
Keyword(2) | HMM-based speech synthesis |
Keyword(3) | prosody correction |
Keyword(4) | speaker individuality |
Keyword(5) | model adaptation |
1st Author's Name | Yuji OSHIMA |
1st Author's Affiliation | Graduate School of Information Science, Nara Institute of Science and Technology |
2nd Author's Name | Shinnosuke TAKAMICHI |
2nd Author's Affiliation | Graduate School of Information Science, Nara Institute of Science and Technology |
3rd Author's Name | Tomoki TODA |
3rd Author's Affiliation | Graduate School of Information Science, Nara Institute of Science and Technology |
4th Author's Name | Graham NEUBIG |
4th Author's Affiliation | Graduate School of Information Science, Nara Institute of Science and Technology |
5th Author's Name | Sakriani SAKTI |
5th Author's Affiliation | Graduate School of Information Science, Nara Institute of Science and Technology |
6th Author's Name | Satoshi NAKAMURA |
6th Author's Affiliation | Graduate School of Information Science, Nara Institute of Science and Technology |
Date | 2014-12-16 |
Paper # | SP2014-112 |
Volume (vol) | vol.114 |
Number (no) | 365 |
Page | pp.- |
#Pages | 6 |
Date of Issue |