Presentation 2020-10-22
[Invited Talk] NHK's activities on Japanese end-to-end speech synthesis
Kiyoshi Kurihara
Abstract(in Japanese) (See Japanese page)
Abstract(in English) The main business of NHK (Japan Broadcasting Corporation) is the production and broadcasting of programs. Many programs are produced daily, and a considerable amount of work goes into the production of speech content by many people, including announcers, directors, and engineers. To support this work and to provide new speech services, we have been researching speech synthesis using Deep Neural Networks (DNNs). DNN speech synthesis requires a large amount of training data, so we are also researching end-to-end speech synthesis to reduce the cost of obtaining this training data and to generate high-quality speech. To achieve end-to-end speech synthesis in Japanese, we adapted sequence-to-sequence speech synthesis with attention (seq2seq speech synthesis), which has proven results in English, to Japanese, and proposed a speech synthesis technique that takes as input character strings consisting of kana (phonetic) text and prosodic symbols based on JEITA IT-4006, "Symbols for Japanese Text-to-Speech Synthesizer". We also developed a technique that enables control of speaking style by adding tags that express speaking style to the input data of seq2seq speech synthesis. We are developing applications for a speech synthesis system that incorporates these techniques and studying their use in a variety of scenarios. This talk describes these NHK activities in speech synthesis and introduces NHK’s efforts in universal services now being researched and developed at NHK Science & Technology Research Laboratories.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Statistical parametric speech synthesis / End-to-end speech synthesis / Speaking Style / Encoder-Decoder model
Paper # SP2020-11,WIT2020-12
Date of Issue 2020-10-15 (SP, WIT)

Conference Information
Committee WIT / SP / IPSJ-SLP
Conference Date 2020/10/22(2days)
Place (in Japanese) (See Japanese page)
Place (in English) Online
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Daisuke Wakatsuki(Tsukuba Univ. of Tech.) / Hisashi Kawai(NICT) / Norihide Kitaoka(Toyohashi Univ. of Tech.)
Vice Chair Shinji Sakou(Nagoya Inst. of Tech.)
Secretary Shinji Sakou(Saitama Industrial Tech. Center) / (Teikyo Univ.) / (Univ. of Tokyo)
Assistant Manabi Miyagi(Tsukuba Univ. of Tech.) / Minako Hosono(AIST) / Aki Sugano(Nagoya Univ.) / Yusuke Ijima(NTT)

Paper Information
Registration To Technical Committee on Well-being Information Technology / Technical Committee on Speech / Special Interest Group on Spoken Language Processing
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) [Invited Talk] NHK's activities on Japanese end-to-end speech synthesis
Sub Title (in English)
Keyword(1) Statistical parametric speech synthesis
Keyword(2) End-to-end speech synthesis
Keyword(3) Speaking Style
Keyword(4) Encoder-Decoder model
1st Author's Name Kiyoshi Kurihara
1st Author's Affiliation NHK (Japan Broadcasting Corporation)(NHK)
Date 2020-10-22
Paper # SP2020-11,WIT2020-12
Volume (vol) vol.120
Number (no) SP-197,WIT-198
Page pp.19-20(SP), pp.19-20(WIT)
#Pages 2
Date of Issue 2020-10-15 (SP, WIT)