Presentation 2021-03-03
[Poster Presentation] Investigation of DNN-based speech synthesis utilizing oral reading skills obtained from large scale subjective evaluation
Shun Akui, Yusuke Ijima, Daisuke Saito, Nobuaki Minematsu
Abstract(in Japanese) (See Japanese page)
Abstract(in English) We have previously proposed an `oral reading skill' value, derived from a listening evaluation experiment, as a quantitative index of how much a reading voice sounds like that of a professional narrator. In this paper, we attempt to exploit this skill information in DNN-based speech synthesis by adding the oral reading skill value to the input of a multi-speaker DNN speech synthesis model. This is expected to make it possible to manipulate the reading skill of the synthesized voice without changing its speaker individuality. We examined several configurations that differ in which hidden layers the oral reading skill value is added to. For each configuration, we assessed, through objective evaluation and a subjective listening experiment, whether the reading skill of the synthesized voice changes as expected while its naturalness and individuality are preserved.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) speech synthesis / deep neural network / oral reading skill
Paper # EA2020-71,SIP2020-102,SP2020-36
Date of Issue 2021-02-24 (EA, SIP, SP)

Conference Information
Committee EA / US / SP / SIP / IPSJ-SLP
Conference Date 2021/3/3(2days)
Place (in Japanese) (See Japanese page)
Place (in English) Online
Topics (in Japanese) (See Japanese page)
Topics (in English) Speech, Engineering/Electro Acoustics, Signal Processing, Ultrasonics, and Related Topics
Chair Kenichi Furuya(Oita Univ.) / Hikaru Miura(Nihon Univ.) / Hisashi Kawai(NICT) / Kazunori Hayashi(Kyoto Univ.) / Norihide Kitaoka(Toyohashi Univ. of Tech.)
Vice Chair Yoshinobu Kajikawa(Kansai Univ.) / Kentaro Matsui(NHK) / Jun Kondo(Shizuoka Univ.) / Yoshikazu Koike(Shibaura Inst. of Tech.) / / Yukihiro Bandou(NTT) / Toshihisa Tanaka(Tokyo Univ. Agri.&Tech.)
Secretary Yoshinobu Kajikawa(Univ. of Tokyo) / Kentaro Matsui(NTT) / Jun Kondo(Doshisha Univ.) / Yoshikazu Koike(Tohoku Univ.) / (Univ. of Tokyo) / Yukihiro Bandou(Waseda Univ.) / Toshihisa Tanaka(Hosei Univ.) / (Waseda Univ.)
Assistant Yukou Wakabayashi(Tokyo Metropolitan Univ.) / Tatsuya Komatsu(LINE) / Shinnosuke Hirata(Tokyo Inst. of Tech.) / Yusuke Ijima(NTT) / Yuichi Tanaka(Tokyo Univ. Agri.&Tech.)

Paper Information
Registration To Technical Committee on Engineering Acoustics / Technical Committee on Ultrasonics / Technical Committee on Speech / Technical Committee on Signal Processing / Special Interest Group on Spoken Language Processing
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) [Poster Presentation] Investigation of DNN-based speech synthesis utilizing oral reading skills obtained from large scale subjective evaluation
Sub Title (in English)
Keyword(1) speech synthesis
Keyword(2) deep neural network
Keyword(3) oral reading skill
1st Author's Name Shun Akui
1st Author's Affiliation The University of Tokyo(UTokyo)
2nd Author's Name Yusuke Ijima
2nd Author's Affiliation Nippon Telegraph and Telephone Corporation(NTT)
3rd Author's Name Daisuke Saito
3rd Author's Affiliation The University of Tokyo(UTokyo)
4th Author's Name Nobuaki Minematsu
4th Author's Affiliation The University of Tokyo(UTokyo)
Date 2021-03-03
Paper # EA2020-71,SIP2020-102,SP2020-36
Volume (vol) vol.120
Number (no) EA-397,SIP-398,SP-399
Page pp.68-73(EA), pp.68-73(SIP), pp.68-73(SP)
#Pages 6
Date of Issue 2021-02-24 (EA, SIP, SP)