Presentation | 2023-12-03 [Poster Presentation] Self-supervised learning model based emotion transfer and intensity control technology for expressive speech synthesis Wei Li, Nobuaki Minematsu, Daisuke Saito |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Emotion transfer techniques, which transfer the speaking style from a reference speech to the target speech, are widely used in speech synthesis. However, previous methods that use an emotion classifier to disentangle the emotion components fail to transfer the correct emotion to the target speech in some contexts. To solve this problem, we introduce a self-supervised learning model to improve the capability of emotion feature extraction. In addition, we utilize the relative attributes method to obtain intensity labels for our emotional speech dataset. Experimental results indicate that our method can improve the performance of an emotional speech synthesis model. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Emotion Transfer, Intensity Control, Self-supervised Learning Model, Speech Synthesis |
Paper # | NLC2023-21,SP2023-41 |
Date of Issue | 2023-11-25 (NLC, SP) |
Conference Information | |
Committee | SP / NLC / IPSJ-SLP / IPSJ-NL |
---|---|
Conference Date | 2023/12/2(3days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Kikai-Shinko-Kaikan Bldg. |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Tomoki Toda(Nagoya Univ.) / Mitsuo Yoshida(Univ. of Tsukuba) / Tomoki Toda(Nagoya Univ.) / Katsuhito Sudoh(Nara Inst. of Sci. and Tech.) |
Vice Chair | / Hiroki Sakaji(Univ. of Tokyo) / Takeshi Kobayakawa(NHK) |
Secretary | (NTT) / Hiroki Sakaji(Nagoya Inst. of Tech.) / Takeshi Kobayakawa(rinna) / (Hiroshima Univ. of Economics) / (NTT) |
Assistant | Ryo Aihara(Mitsubishi Electric) / Daisuke Saito(Univ. of Tokyo) / Kanjin Takahashi(Sansan) / Yasuhiro Ogawa(Nagoya Univ.) |
Paper Information | |
Registration To | Technical Committee on Speech / Technical Committee on Natural Language Understanding and Models of Communication / Special Interest Group on Spoken Language Processing / Special Interest Group on Natural Language |
---|---|
Language | ENG-JTITLE |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | [Poster Presentation] Self-supervised learning model based emotion transfer and intensity control technology for expressive speech synthesis |
Sub Title (in English) | |
Keyword(1) | Emotion Transfer, Intensity Control, Self-supervised Learning Model, Speech Synthesis |
1st Author's Name | Wei Li |
1st Author's Affiliation | the University of Tokyo(Univ. of Tokyo) |
2nd Author's Name | Nobuaki Minematsu |
2nd Author's Affiliation | the University of Tokyo(Univ. of Tokyo) |
3rd Author's Name | Daisuke Saito |
3rd Author's Affiliation | the University of Tokyo(Univ. of Tokyo) |
Date | 2023-12-03 |
Paper # | NLC2023-21,SP2023-41 |
Volume (vol) | vol.123 |
Number (no) | NLC-291,SP-292 |
Page | pp.43-48(NLC), pp.43-48(SP) |
#Pages | 6 |
Date of Issue | 2023-11-25 (NLC, SP) |