Presentation 2019-03-18
Talking Head Generation with Deep Phoneme and Viseme Representation and Generative Adversarial Networks
Takaaki Yasui, Yuta Nakashima, Noboru Babaguchi
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In this paper, we propose a method for generating a talking head from an audio input. Some existing methods generate photorealistic talking heads by collecting a large amount of utterance video of the target speaker, but these methods are not applicable when only a small amount of video is available. Our method generates a talking head for arbitrary speech of the target speaker from a small amount of utterance video. To do this, we first extract phoneme features and viseme features from the utterance video and map these features into a common space. We then train a generative adversarial network (GAN) to generate a talking head from the phoneme features in the common space. These networks are trained with less than 2 min of utterance video.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) talking head generation / phoneme / viseme / generative adversarial network
Paper # BioX2018-53,PRMU2018-157
Date of Issue 2019-03-10 (BioX, PRMU)

Conference Information
Committee PRMU / BioX
Conference Date 2019/3/17(2days)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Shinichi Sato(NII) / Kazuhiko Sumi(AGU)
Vice Chair Yoshihisa Ijiri(Omron) / Toru Tamaki(Hiroshima Univ.) / Hitoshi Imaoka(NEC) / Tetsushi Ohki(Shizuoka Univ.)
Secretary Yoshihisa Ijiri(NEC) / Toru Tamaki(Osaka Univ.) / Hitoshi Imaoka(Fujitsu Labs.) / Tetsushi Ohki(Univ. of Electro-Comm.)
Assistant Go Irie(NTT) / Yoshitaka Ushiku(Univ. of Tokyo) / Norihiro Okui(KDDI Research) / Daishi Watabe(Saitama Inst. of Tech.)

Paper Information
Registration To Technical Committee on Pattern Recognition and Media Understanding / Technical Committee on Biometrics
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Talking Head Generation with Deep Phoneme and Viseme Representation and Generative Adversarial Networks
Sub Title (in English)
Keyword(1) talking head generation
Keyword(2) phoneme
Keyword(3) viseme
Keyword(4) generative adversarial network
1st Author's Name Takaaki Yasui
1st Author's Affiliation Osaka University(Osaka Univ.)
2nd Author's Name Yuta Nakashima
2nd Author's Affiliation Osaka University(Osaka Univ.)
3rd Author's Name Noboru Babaguchi
3rd Author's Affiliation Osaka University(Osaka Univ.)
Date 2019-03-18
Paper # BioX2018-53,PRMU2018-157
Volume (vol) vol.118
Number (no) BioX-512,PRMU-513
Page pp.143-148(BioX), pp.143-148(PRMU)
#Pages 6
Date of Issue 2019-03-10 (BioX, PRMU)