Non-native speech conversion with consistency-aware recursive network and generative adversarial network

Presentation	2017-03-02 Non-native speech conversion with consistency-aware recursive network and generative adversarial network Keisuke Oyamada, Hirokazu Kameoka, Takuhiro Kaneko, Hiroyasu Ando, Kaoru Hiramatsu, Kunio Kashino
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	This paper addresses the problem of automatically modifying the pronunciation of non-native speech. Because the pronunciation characteristics of non-native speakers tend to depend heavily on context (such as words), conversion rules must be learned from and applied to a sequence of features rather than a single-frame feature. We propose a neural network that takes a feature sequence as input, produces a feature sequence as output, and guarantees consistency between the features generated within overlapping segments. We further propose applying a recently proposed generative adversarial network (GAN)-based postfilter to the generated feature sequence, with the aim of synthesizing natural-sounding speech. Subjective and quantitative evaluations confirmed that the proposed method outperforms a conventional NN approach in conversion quality.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)
Paper #	EA2016-139,SIP2016-194,SP2016-134
Date of Issue	2017-02-22 (EA, SIP, SP)

Conference Information
Committee	SP / SIP / EA
Conference Date	2017/3/1 (2 days)
Place (in Japanese)	(See Japanese page)
Place (in English)	Okinawa Industry Support Center
Topics (in Japanese)	(See Japanese page)
Topics (in English)	Speech, Engineering/Electro Acoustics, Signal Processing, and Related Topics
Chair	Kazunori Mano(Shibaura Inst. of Tech.) / Makoto Nakashizuka(Chiba Inst. of Tech.) / Mitsunori Mizumachi(Kyushu Inst. of Tech.)
Vice Chair	Hiroki Mori(Utsunomiya Univ.) / Masahiro Okuda(Univ. of Kitakyushu) / Shogo Muramatsu(Niigata Univ.) / Yoichi Haneda(Univ. of Electro-Comm.) / Suehiro Shimauchi(NTT)
Secretary	Hiroki Mori(Kobe Univ.) / Masahiro Okuda(Shizuoka Univ.) / Shogo Muramatsu(Ritsumeikan Univ.) / Yoichi Haneda(Chiba Inst. of Tech.) / Suehiro Shimauchi(KDDI R&D Labs.)
Assistant	Taichi Asami(NTT) / Kei Hashimoto(Nagoya Inst. of Tech.) / Osamu Watanabe(Takushoku Univ.) / Shigeto Takeoka(Shizuoka Inst. of Science and Tech.) / TREVINO Jorge(Tohoku Univ.)

Paper Information
Registration To	Technical Committee on Speech / Technical Committee on Signal Processing / Technical Committee on Engineering Acoustics
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Non-native speech conversion with consistency-aware recursive network and generative adversarial network
Sub Title (in English)
Keyword(1)
Keyword(2)
Keyword(3)
Keyword(4)
1st Author's Name	Keisuke Oyamada
1st Author's Affiliation	University of Tsukuba(Univ. of Tsukuba)
2nd Author's Name	Hirokazu Kameoka
2nd Author's Affiliation	Nippon Telegraph and Telephone Corporation(NTT)
3rd Author's Name	Takuhiro Kaneko
3rd Author's Affiliation	Nippon Telegraph and Telephone Corporation(NTT)
4th Author's Name	Hiroyasu Ando
4th Author's Affiliation	University of Tsukuba(Univ. of Tsukuba)
5th Author's Name	Kaoru Hiramatsu
5th Author's Affiliation	Nippon Telegraph and Telephone Corporation(NTT)
6th Author's Name	Kunio Kashino
6th Author's Affiliation	Nippon Telegraph and Telephone Corporation(NTT)
Date	2017-03-02
Paper #	EA2016-139,SIP2016-194,SP2016-134
Volume (vol)	vol.116
Number (no)	EA-475,SIP-476,SP-477
Page	pp.315-320(EA), pp.315-320(SIP), pp.315-320(SP)
#Pages	6
Date of Issue	2017-02-22 (EA, SIP, SP)