Presentation 2018-07-02
Transforming the Emotion in Speech using CycleGAN
Kenji Yasuda, Ryohei Orihara, Yuichi Sei, Yasuyuki Tahara, Akihiko Ohsuga
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In domain transfer tasks, deep learning makes it possible to generate more natural and more accurate output. In particular, with the advent of the GAN (Generative Adversarial Network), it has become possible to learn transfers between unspecified domains. Voice conversion is an example of domain transformation for speech. It can be regarded as a transformation of the speaker domain, on which many studies have been done. However, few studies have focused on transformations of attributes other than the speaker. To achieve more natural speech synthesis, it is necessary to study such transformations. Therefore, in this research, we use a model called CycleGAN to perform voice conversion on emotions. We selected "ANG (anger)", "JOY (joy)", and "SAD (sadness)" as conversion targets. Evaluation experiments showed that the model performs well on conversion to "ANG (anger)" and on conversion from "JOY (joy)".
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Deep Learning / Domain Transfer / Generative Adversarial Network / Voice Conversion / Speech Processing
Paper # AI2018-11
Date of Issue 2018-06-25 (AI)

Conference Information
Committee AI
Conference Date 2018/7/2 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Tsunenori Mine(Kyushu Univ.)
Vice Chair Daisuke Katagami(Tokyo Polytechnic Univ.) / Naoki Fukuta(Shizuoka Univ.)
Secretary Daisuke Katagami(Ritsumeikan Univ.) / Naoki Fukuta(Univ. of Electro-Comm.)
Assistant Yuko Sakurai(AIST)

Paper Information
Registration To Technical Committee on Artificial Intelligence and Knowledge-Based Processing
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Transforming the Emotion in Speech using CycleGAN
Sub Title (in English)
Keyword(1) Deep Learning
Keyword(2) Domain Transfer
Keyword(3) Generative Adversarial Network
Keyword(4) Voice Conversion
Keyword(5) Speech Processing
1st Author's Name Kenji Yasuda
1st Author's Affiliation The University of Electro-Communications(UEC)
2nd Author's Name Ryohei Orihara
2nd Author's Affiliation The University of Electro-Communications(UEC)
3rd Author's Name Yuichi Sei
3rd Author's Affiliation The University of Electro-Communications(UEC)
4th Author's Name Yasuyuki Tahara
4th Author's Affiliation The University of Electro-Communications(UEC)
5th Author's Name Akihiko Ohsuga
5th Author's Affiliation The University of Electro-Communications(UEC)
Date 2018-07-02
Paper # AI2018-11
Volume (vol) vol.118
Number (no) AI-116
Page pp.61-66(AI)
#Pages 6
Date of Issue 2018-06-25 (AI)