Presentation | 2018-07-02 Transforming the Emotion in Speech using CycleGAN Kenji Yasuda, Ryohei Orihara, Yuichi Sei, Yasuyuki Tahara, Akihiko Ohsuga |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In domain transfer tasks, deep learning makes it possible to generate more natural and highly accurate output. In particular, with the advent of GANs (Generative Adversarial Networks), learning transfers between unspecified domains has become possible. Voice conversion is an example of domain transfer for speech. It can be regarded as a transformation of the speaker domain, on which many studies have been done. However, few studies have focused on transforming attributes other than the speaker. To achieve more natural speech synthesis, such transformations also need to be studied. Therefore, in this research, we use a model called CycleGAN to perform voice conversion on emotions. We selected "ANG (anger)", "JOY (joy)", and "SAD (sadness)" as conversion targets. Evaluation experiments show that the model performs well on conversion to "ANG (anger)" and on conversion from "JOY (joy)". |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Deep Learning / Domain Transfer / Generative Adversarial Network / Voice Conversion / Speech Processing |
Paper # | AI2018-11 |
Date of Issue | 2018-06-25 (AI) |
Conference Information | |
Committee | AI |
---|---|
Conference Date | 2018/7/2 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Tsunenori Mine(Kyushu Univ.) |
Vice Chair | Daisuke Katagami(Tokyo Polytechnic Univ.) / Naoki Fukuta(Shizuoka Univ.) |
Secretary | Daisuke Katagami(Ritsumeikan Univ.) / Naoki Fukuta(Univ. of Electro-Comm.) |
Assistant | Yuko Sakurai(AIST) |
Paper Information | |
Registration To | Technical Committee on Artificial Intelligence and Knowledge-Based Processing |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Transforming the Emotion in Speech using CycleGAN |
Sub Title (in English) | |
Keyword(1) | Deep Learning |
Keyword(2) | Domain Transfer |
Keyword(3) | Generative Adversarial Network |
Keyword(4) | Voice Conversion |
Keyword(5) | Speech Processing |
1st Author's Name | Kenji Yasuda |
1st Author's Affiliation | The University of Electro-Communications(UEC) |
2nd Author's Name | Ryohei Orihara |
2nd Author's Affiliation | The University of Electro-Communications(UEC) |
3rd Author's Name | Yuichi Sei |
3rd Author's Affiliation | The University of Electro-Communications(UEC) |
4th Author's Name | Yasuyuki Tahara |
4th Author's Affiliation | The University of Electro-Communications(UEC) |
5th Author's Name | Akihiko Ohsuga |
5th Author's Affiliation | The University of Electro-Communications(UEC) |
Date | 2018-07-02 |
Paper # | AI2018-11 |
Volume (vol) | vol.118 |
Number (no) | AI-116 |
Page | pp.61-66(AI) |
#Pages | 6 |
Date of Issue | 2018-06-25 (AI) |