Study of Photo-realistic Face Moving Image Generation from the Text Using the Facial Feature

Kazuki Sato; Takashi Nose; Akinori Ito

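Presentation	2016-05-19 Study of Photo-realistic Face Moving Image Generation from the Text Using the Facial Feature. Kazuki Sato (Tohoku Univ.), Takashi Nose (Tohoku Univ.), Akinori Ito (Tohoku Univ.)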
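Abstract (translated from Japanese)	This paper proposes a face video synthesis method based on hidden Markov models (HMMs) that uses facial features obtained with Kinect, aimed at realizing photo-realistic dialogue agents. Opportunities for humans to converse with computers, for example through interactive agents, have been increasing in recent years, and agents that respond in a more human-like manner are expected to be in demand. This study therefore focuses on appearance as one aspect of an agent's human-likeness and aims to synthesize photo-realistic face video synchronized with the spoken content. Conventional methods apply the HMM speech synthesis framework to face video synthesis, but they suffer from low quality of the synthesized face video and a high cost of preparing training data. The proposed method uses as features the Animation Units (AUs) obtainable with Kinect, which represent the state of each part of the face, and synthesizes face images by converting the facial features generated by HMM-based face image synthesis into sequences of luminance values with a deep neural network (DNN). We evaluate the parameter generation performance of the HMM and the DNN in the proposed method, examine the optimal training conditions, and then present the synthesized face video sequences.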
Abstract (English)	In this paper, we propose a face video synthesis technique based on the hidden Markov model (HMM) that uses facial features, as a step toward photo-realistic interactive agents. In recent years, it has become increasingly common for humans to interact with computers through human-like interfaces such as interactive agents, and more human-like agents are needed. Focusing on giving an agent a human-like appearance, we therefore investigate a method for synthesizing photo-realistic face video synchronized with the speech content. The conventional method applies the HMM speech synthesis framework to face video synthesis; however, it suffers from a high cost of preparing training data and low quality of the synthesized face images. In the proposed method, we use Animation Units (AUs), which can be obtained with Kinect, as the features for face image generation. The AUs are parameters that express the state of each part of the face. Using the AUs, we synthesize the face video by converting the facial features generated by the HMM face image synthesizer into pixel intensities with a deep neural network (DNN). In this paper, we investigate the optimal learning conditions for the HMM an
Keywords (Japanese)	顔画像合成 / フォトリアリスティック / HMM / DNN / Kinect
Keywords (English)	face image synthesis / photo-realistic / HMM / DNN / Kinect
Report number	IT2016-8, EMM2016-8
Date of issue	2016-05-12 (IT, EMM)

Conference information
Committee	IT / EMM
Conference date	2016/5/19 (held for 2 days from this date)
Venue (Japanese)	小樽経済センター
Venue (English)	Otaru Economic Center
Topics (Japanese)	情報セキュリティ，情報理論，情報ハイディング，一般
Topics (English)	Information Security, Information Theory, Information Hiding, etc.
Chair (Japanese)	大濱靖匡(電通大) / 伊藤彰則(東北大)
Chair (English)	Yasutada Oohama (Univ. of Electro-Comm.) / Akinori Ito (Tohoku Univ.)
Vice chair (Japanese)	和田山正(名工大) / 鵜木祐史(北陸先端大) / 川村正樹(山口大)
Vice chair (English)	Tadashi Wadayama (Nagoya Inst. of Tech.) / Masashi Unoki (JAIST) / Masaki Kawamura (Yamaguchi Univ.)
Secretary (Japanese)	岩本貢(電通大) / 葛岡成晃(和歌山大) / 市野将嗣(電通大) / 薗田光太郎(長崎大)
Secretary (English)	Mitsugu Iwamoto (Univ. of Electro-Comm.) / Shigeaki Kuzuoka (Wakayama Univ.) / Masatsugu Ichino (Univ. of Electro-Comm.) / Kotaro Sonoda (Nagasaki Univ.)
Assistant secretary (Japanese)	日下卓也(岡山大) / 岩田基(阪府大) / 河野和宏(関西大)
Assistant secretary (English)	Takuya Kusaka (Okayama Univ.) / Motoi Iwata (Osaka Pref. Univ.) / Kazuhiro Kohno (Kansai Univ.)

Paper information details
Submitted to	Technical Committee on Information Theory / Technical Committee on Enriched MultiMedia
Language of the text	JPN
Title (Japanese)	顔特徴量を用いたテキストからのフォトリアリスティック顔動画像生成の検討
Subtitle (Japanese)
Title (English)	Study of Photo-realistic Face Moving Image Generation from the Text Using the Facial Feature
Subtitle (English)
Keyword (1) (Japanese/English)	顔画像合成 / face image synthesis
Keyword (2) (Japanese/English)	フォトリアリスティック / photo-realistic
Keyword (3) (Japanese/English)	HMM / HMM
Keyword (4) (Japanese/English)	DNN / DNN
Keyword (5) (Japanese/English)	Kinect / Kinect
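First author (Japanese/English)	佐藤一樹 / Kazuki Sato
First author affiliation (Japanese/English)	東北大学 (abbrev.: 東北大) / Tohoku University (abbrev.: Tohoku Univ.)
Second author (Japanese/English)	能勢隆 / Takashi Nose
Second author affiliation (Japanese/English)	東北大学 (abbrev.: 東北大) / Tohoku University (abbrev.: Tohoku Univ.)
Third author (Japanese/English)	伊藤彰則 / Akinori Ito
Third author affiliation (Japanese/English)	東北大学 (abbrev.: 東北大) / Tohoku University (abbrev.: Tohoku Univ.)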
Presentation date	2016-05-19
Report number	IT2016-8, EMM2016-8
Volume (vol)	vol.116
Number (no)	IT-33, EMM-34
Page range	pp.43-48 (IT), pp.43-48 (EMM)
Number of pages	6
Date of issue	2016-05-12 (IT, EMM)