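Word lip reading from scenes of speaker's utterance profile based on mouth-shape-code approach (Network Processors, Signal Processing for Communications, Wireless LAN/PAN, General)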

Shinsuke Okita; Yuki Sato; Yuki Sugata; Takuro Tasaka; Nozomu Hamada

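Presentation	2012-03-08 Word lip reading from scenes of speaker's utterance profile based on mouth-shape-code approach (Network Processors, Signal Processing for Communications, Wireless LAN/PAN, General) Shinsuke Okita, Yuki Sato, Yuki Sugata, Takuro Tasaka, Nozomu Hamada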
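Abstract (translated from Japanese)	In this study, we apply the mouth-shape-code method, one lip-reading technique, to time series of shape features obtained from profile (side-view) scenes of a speaker's utterances, and propose a method for automatically detecting consonant key-frames in addition to the conventional vowel key-frames. By detecting consonant key-frames from the temporal change of a profile shape feature, defined as the difference between the distance between the upper and lower lips and the lower-lip projection length, we extend the conventional vowel-only representation of mouth-shape-code time-series transitions. Mouth-shape recognition at key-frames uses five features: upper-lip height, lower-lip height, upper-lip projection length, lower-lip projection length, and lip angle. DP matching is performed between the resulting word code string and the code strings of the candidate words, and the nearest candidate word is taken as the uttered word. Two recognition experiments using target word groups consisting of 27 commonly used words and 10 pairs of similar words yielded high recognition rates of 90.4% and 86.7%, respectively.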
Abstract (English)	In this paper, we apply the mouth-shape-code approach to profile (side-view) scenes of a Japanese speaker's utterances for lip reading. The novel point is the automatic detection of consonant key-frames. Consonant key-frames are detected from the time series of a profile shape feature, defined as the difference between the distance between the lips and the projection length of the lower lip. This approach extends the mouth-shape-code time series. Mouth-shape recognition at key-frames uses five profile shape features: the heights of the upper and lower lips, the projection lengths of the upper and lower lips, and the lip angle. We apply DP matching to the recognized word code string and each candidate word code string, and take the nearest candidate word as the result. Two recognition experiments were conducted, one using a set of 27 words commonly used in daily conversation and one adding 10 pairs of similar words. The proposed method attained recognition rates of 90.4% and 86.7% for these word sets, respectively.
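The final matching step described in the abstract (DP matching between a recognized code string and candidate word code strings, followed by a nearest-candidate decision) can be illustrated with a minimal sketch. The abstract does not give the cost function, the code alphabet, or the word list, so everything specific here is an assumption: unit costs for substitution, insertion, and deletion (plain edit distance), the placeholder code symbols, the lexicon entries `word_a` to `word_c`, and the function names `dp_distance` and `nearest_word`.

```python
from typing import Dict, List, Sequence, Tuple


def dp_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """DP matching cost between two code sequences (assumed unit-cost edit distance)."""
    # prev[j] is the cost of aligning the already-processed prefix of `a` with b[:j].
    prev = list(range(len(b) + 1))
    for i, code_a in enumerate(a, start=1):
        curr = [i]  # aligning a[:i] with an empty prefix costs i deletions
        for j, code_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if code_a == code_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def nearest_word(observed: Sequence[str],
                 lexicon: Dict[str, List[str]]) -> Tuple[str, int]:
    """Return the candidate whose code string is closest to the observed one."""
    best = min(lexicon, key=lambda word: dp_distance(observed, lexicon[word]))
    return best, dp_distance(observed, lexicon[best])


# Hypothetical lexicon: placeholder words and placeholder mouth-shape codes.
LEXICON: Dict[str, List[str]] = {
    "word_a": ["I", "A", "U"],
    "word_b": ["X", "A", "I"],
    "word_c": ["U", "E", "O", "X"],
}

# Hypothetical recognized key-frame codes containing one spurious key-frame.
observed_codes = ["X", "A", "U", "I"]
print(nearest_word(observed_codes, LEXICON))
```

Under these assumptions, a single spurious or missed key-frame costs one edit, so a candidate can still be selected when key-frame detection is imperfect. A weighted cost reflecting how easily two mouth shapes are confused would be a natural variation, but the abstract does not say whether the authors use one.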
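Keywords (translated from Japanese)	word lip reading / mouth-shape-code method / key-frame detection / profile / image processing
Keywords (English)	lip reading / mouth-shape-code / key-frame / profile / image processing
Report number	CAS2011-112,SIP2011-132,CS2011-104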

Technical committee information
Technical committee	CAS
Meeting dates	2012/3/1 (held for 1 day from this date)

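Paper details
Submitting committee	Circuits and Systems (CAS)
Language of text	JPN
Title (translated from Japanese)	Word lip reading from scenes of speaker's utterance profile based on mouth-shape-code approach (Network Processors, Signal Processing for Communications, Wireless LAN/PAN, General)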
Title (English)	Word lip reading from scenes of speaker's utterance profile based on mouth-shape-code approach
Keyword (1)	lip reading
Keyword (2)	mouth-shape-code
Keyword (3)	key-frame
Keyword (4)	profile
Keyword (5)	image processing
1st author	Shinsuke OKITA
1st author affiliation	Department of System Design Engineering, Faculty of Science and Technology, Keio University
2nd author	Yuki SATO
2nd author affiliation	Signal Processing Lab, School of Integrated Design Engineering, Keio University
3rd author	Yuki SUGATA
3rd author affiliation	Signal Processing Lab, School of Integrated Design Engineering, Keio University
4th author	Takuro TASAKA
4th author affiliation	Signal Processing Lab, School of Integrated Design Engineering, Keio University
5th author	Nozomu HAMADA
5th author affiliation	Department of System Design Engineering, Faculty of Science and Technology, Keio University
Presentation date	2012-03-08
Report number	CAS2011-112,SIP2011-132,CS2011-104
Volume (vol)	vol.111
Number (no)	465
Page range	pp.-
Number of pages	6