Lip Movement Synthesis from Speech and Text (テキスト及び音声からの唇動画像の自動生成)

Presentation	1998/6/1 Lip Movement Synthesis from Speech and Text, Masatsune Tamura, Takashi Masuko, Takao Kobayashi
Abstract (Japanese)
Abstract (English)	This paper presents an HMM-based technique for synthesizing lip movements synchronized with a given utterance. In the training stage, speech unit HMMs are trained on audio and visual parameter vector sequences that represent speech and mouth shapes. Each speech unit HMM is then split into a speech part and a visual parameter part. In the recognition stage, input speech is converted into a transcription and a state sequence using the speech part of the HMMs. In the synthesis stage, a sentence HMM is constructed by concatenating the visual parameter parts of the HMMs corresponding to the transcription of the given speech. The optimum parameter vector sequence in the maximum likelihood (ML) sense is then obtained from the sentence HMM. Because the generated parameter sequence reflects the statistics of both static and dynamic features, the synthetic lip animation is smooth and natural.
Keywords (Japanese)
Keywords (English)	lip movement synthesis / hidden Markov model / lip synchronization / multimodal interface
Report number	MVE98-38
Publication date
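
The last step of the abstract, obtaining an ML-optimal parameter sequence that respects both static and dynamic feature statistics, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single one-dimensional visual parameter (for example, mouth opening), a state sequence that is already fixed so that each frame has a Gaussian mean and variance for the static and delta features, and a first-difference delta window `delta[t] = c[t] - c[t-1]`; the window and all function names are hypothetical. Under these assumptions the static and delta observations are a linear function `W c` of the trajectory `c`, and the likelihood is maximized by solving `(W' P W) c = W' P mu`, where `P` holds the inverse variances.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def generate_trajectory(static_mean, static_var, delta_mean, delta_var):
    """Return the trajectory c that maximizes the Gaussian likelihood of [c; delta(c)].

    Assumption (not from the source): delta[t] = c[t] - c[t-1] for t >= 1.
    Frame 0 has no delta constraint, so delta_mean[0] and delta_var[0] are ignored.
    """
    n = len(static_mean)
    rows, means, precisions = [], [], []
    for t in range(n):  # static rows of W: c[t] itself
        row = [0.0] * n
        row[t] = 1.0
        rows.append(row)
        means.append(static_mean[t])
        precisions.append(1.0 / static_var[t])
    for t in range(1, n):  # delta rows of W: c[t] - c[t-1]
        row = [0.0] * n
        row[t], row[t - 1] = 1.0, -1.0
        rows.append(row)
        means.append(delta_mean[t])
        precisions.append(1.0 / delta_var[t])
    # Normal equations: (W' P W) c = W' P mu
    a = [[sum(p * r[i] * r[j] for r, p in zip(rows, precisions)) for j in range(n)]
         for i in range(n)]
    b = [sum(p * r[i] * mu for r, p, mu in zip(rows, precisions, means))
         for i in range(n)]
    return solve(a, b)


if __name__ == "__main__":
    # Step-shaped static means, as if two states were concatenated.
    mouth_opening = generate_trajectory(
        static_mean=[0, 0, 0, 1, 1, 1], static_var=[1.0] * 6,
        delta_mean=[0.0] * 6, delta_var=[0.01] * 6)
    print(mouth_opening)
```

With a small delta variance the step between the two states is smoothed into a gradual transition; with a very large delta variance the result falls back to the static means. This is the sense in which the dynamic features make the generated motion smooth.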

Presentation paper details
Technical committee	Media Experience and Virtual Environment (MVE)
Language of text	JPN
Title (Japanese)	テキスト及び音声からの唇動画像の自動生成
Subtitle (Japanese)
Title (English)	LIP MOVEMENT SYNTHESIS FROM SPEECH AND TEXT
Subtitle (English)
Keyword (1) (Japanese/English)	/ lip movement synthesis
First author name (Japanese/English)	田村正統 / Masatsune Tamura
First author affiliation (Japanese/English)	東京工業大学総合理工学研究科 / Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Second author name (Japanese/English)	益子貴史 / Takashi Masuko
Second author affiliation (Japanese/English)	東京工業大学総合理工学研究科 / Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Third author name (Japanese/English)	小林隆夫 / Takao Kobayashi
Third author affiliation (Japanese/English)	東京工業大学総合理工学研究科 / Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Presentation date	1998/6/1
Report number	MVE98-38
Volume (vol)	vol.98
Issue (no)	97
Page range	pp.-
Number of pages	6
Publication date