[Poster Presentation] An evaluation of acoustic-to-articulatory inversion mapping with latent trajectory Gaussian mixture model

Patrick Lumban Tobing; Tomoki Toda; Hirokazu Kameoka; Satoshi Nakamura

Presentation	2016-03-28 [Poster Presentation] An evaluation of acoustic-to-articulatory inversion mapping with latent trajectory Gaussian mixture model Patrick Lumban Tobing (NAIST), Tomoki Toda (Nagoya Univ./NAIST), Hirokazu Kameoka (NTT), Satoshi Nakamura (NAIST)
Abstract (English)	In this report, we present an evaluation of acoustic-to-articulatory inversion mapping based on the latent trajectory Gaussian mixture model (LTGMM). In a conventional GMM-based inversion mapping system, the GMM parameters are optimized by maximizing the likelihood of the joint static and dynamic features of acoustic-articulatory data. In the mapping process, given the acoustic data, smoothly varying articulatory parameter trajectories are estimated by maximizing the conditional likelihood of their static features only, where the inter-frame correlation is taken into account by imposing the explicit relationship between static and dynamic features. Because the training and mapping criteria differ from each other, the trained GMM is not optimal for the mapping process. A trajectory training method has been proposed to address this inconsistency problem [1]. However, this method has difficulty optimizing some parameters, such as the covariance matrices and the mixture component sequence. In this report, as another way to address the inconsistency problem, we propose an inversion mapping method based on the latent trajectory GMM, inspired by the latent trajectory hidden Markov model [2]. The proposed method makes it possible to apply the EM algorithm to model parameter optimization, which is difficult in the conventional trajectory training method. The experimental results demonstrate that the proposed LTGMM method outperforms the conventional GMM on the acoustic-to-articulatory inversion mapping task, with lower root-mean-square error and higher correlation coefficients.
Keywords (English)	acoustic-to-articulatory inversion mapping / Gaussian mixture model / trajectory training / inter-frame correlation / EM algorithm
Report number	EA2015-85, SIP2015-134, SP2015-113
Issue date	2016-03-21 (EA, SIP, SP)
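
The mapping step of the conventional system described in the abstract (maximizing the conditional likelihood of the static features under an explicit static/dynamic relationship) can be illustrated with a minimal sketch. It is not the authors' implementation and it does not implement LTGMM. It assumes a single articulatory dimension, a fixed mixture component sequence that has already yielded per-frame conditional means and variances, diagonal covariances, and a simple delta window 0.5 * (y[t+1] - y[t-1]). All function names are hypothetical. The two evaluation measures named in the abstract, root-mean-square error and correlation coefficient, are included.

```python
import numpy as np

def build_window_matrix(T):
    """Return the (2T x T) matrix W with W @ y = interleaved [static, delta] features.

    Assumption: delta_t = 0.5 * (y[t+1] - y[t-1]), with indices clamped at the edges.
    """
    W = np.zeros((2 * T, T))
    for t in range(T):
        W[2 * t, t] = 1.0                        # static row picks y[t]
        W[2 * t + 1, min(t + 1, T - 1)] += 0.5   # delta row, next frame
        W[2 * t + 1, max(t - 1, 0)] -= 0.5       # delta row, previous frame
    return W

def estimate_trajectory(means, variances):
    """Static trajectory maximizing the Gaussian likelihood of W @ y.

    means, variances: (T, 2) arrays holding the conditional mean and variance
    of the [static, delta] features at each frame (assumed given).
    Solves (W^T P W) y = W^T P m, where P is the diagonal precision matrix.
    """
    T = means.shape[0]
    W = build_window_matrix(T)
    m = means.reshape(-1)                # interleaved static/delta means
    p = 1.0 / variances.reshape(-1)      # diagonal precisions
    A = W.T @ (p[:, None] * W)
    b = W.T @ (p * m)
    return np.linalg.solve(A, b)

def rmse(estimate, reference):
    """Root-mean-square error between two trajectories."""
    return float(np.sqrt(np.mean((estimate - reference) ** 2)))

def correlation(estimate, reference):
    """Pearson correlation coefficient between two trajectories."""
    return float(np.corrcoef(estimate, reference)[0, 1])

if __name__ == "__main__":
    T = 20
    y_true = np.sin(np.linspace(0.0, 3.0, T))
    # Noisy static means and zero delta means: the delta term smooths the result.
    rng = np.random.default_rng(0)
    means = np.stack([y_true + 0.1 * rng.standard_normal(T), np.zeros(T)], axis=1)
    variances = np.stack([np.full(T, 0.01), np.full(T, 0.1)], axis=1)
    y_hat = estimate_trajectory(means, variances)
    print("RMSE:", rmse(y_hat, y_true), "correlation:", correlation(y_hat, y_true))
```

In this sketch the static and delta constraints are combined in one weighted least-squares solve, which is how the inter-frame correlation enters the estimate even though only the static trajectory is returned.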

Meeting information
Technical committee	EA / SP / SIP
Dates	2016/3/28 (held for 2 days)
Venue (Japanese)	別府国際コンベンションセンター B-ConPlaza
Venue (English)	Beppu International Convention Center B-ConPlaza
Theme (Japanese)	応用／電気音響，音声，信号処理，一般
Theme (English)	Engineering/Electro Acoustics, Speech, Signal Processing, and Related Topics
Chair (Japanese)	羽田陽一(電通大) / 間野一則(芝浦工大) / 宝珠山治(NEC)
Chair (English)	Yoichi Haneda (Univ. of Electro-Comm.) / Kazunori Mano (Shibaura Inst. of Tech.) / Osamu Houshuyama (NEC)
Vice chair (Japanese)	岩谷幸雄(東北学院大) / 水町光徳(九工大) / 北岡教英(徳島大) / 中静真(千葉工大) / 奥田正浩(北九州市大)
Vice chair (English)	Yukio Iwaya (Tohoku Gakuin Univ.) / Mitsunori Mizumachi (Kyushu Inst. of Tech.) / Norihide Kitaoka (Tokushima Univ.) / Makoto Nakashizuka (Chiba Inst. of Tech.) / Masahiro Okuda (Univ. of Kitakyushu)
Secretary (Japanese)	島内末廣(NTT) / 堀内俊治(KDDI研) / 岩野公司(東京都市大) / 滝口哲也(神戸大) / 辻川剛範(NEC) / 平林晃(立命館大)
Secretary (English)	Suehiro Shimauchi (NTT) / Toshiharu Horiuchi (KDDI R&D Labs.) / Koji Iwano (Tokyo City Univ.) / Tetsuya Takiguchi (Kobe Univ.) / Masanori Tsujikawa (NEC) / Akira Hirabayashi (Ritsumeikan Univ.)
Assistant secretary (Japanese)	小山翔一(東大) / 能勢隆(東北大) / 浅見太一(NTT) / 宮田高道(千葉工大)
Assistant secretary (English)	Shoichi Koyama (Univ. of Tokyo) / Takashi Nose (Tohoku Univ.) / Taichi Asami (NTT) / Takamichi Miyata (Chiba Inst. of Tech.)

Paper details
Submitting committee	Technical Committee on Engineering Acoustics / Technical Committee on Speech / Technical Committee on Signal Processing
Language of text	ENG
Title (Japanese)
Subtitle (Japanese)
Title (English)	[Poster Presentation] An evaluation of acoustic-to-articulatory inversion mapping with latent trajectory Gaussian mixture model
Subtitle (Japanese)
Keyword (1) (Japanese/English)	acoustic-to-articulatory inversion mapping / acoustic-to-articulatory inversion mapping
Keyword (2) (Japanese/English)	Gaussian mixture model / Gaussian mixture model
Keyword (3) (Japanese/English)	trajectory training / trajectory training
Keyword (4) (Japanese/English)	inter-frame correlation / inter-frame correlation
Keyword (5) (Japanese/English)	EM algorithm / EM algorithm
1st author name (Japanese/English)	Patrick Lumban Tobing / Patrick Lumban Tobing
1st author affiliation (Japanese/English)	Nara Institute of Science and Technology (abbrev.: 奈良先端大) / Nara Institute of Science and Technology (abbrev.: NAIST)
2nd author name (Japanese/English)	Tomoki Toda / Tomoki Toda
2nd author affiliation (Japanese/English)	Nagoya University/Nara Institute of Science and Technology (abbrev.: 名大/奈良先端大) / Nagoya University/Nara Institute of Science and Technology (abbrev.: Nagoya Univ./NAIST)
3rd author name (Japanese/English)	Hirokazu Kameoka / Hirokazu Kameoka
3rd author affiliation (Japanese/English)	Nippon Telegraph and Telephone Corporation (abbrev.: NTT) / Nippon Telegraph and Telephone Corporation (abbrev.: NTT)
4th author name (Japanese/English)	Satoshi Nakamura / Satoshi Nakamura
4th author affiliation (Japanese/English)	Nara Institute of Science and Technology (abbrev.: 奈良先端大) / Nara Institute of Science and Technology (abbrev.: NAIST)
Presentation date	2016-03-28
Report number	EA2015-85, SIP2015-134, SP2015-113
Volume (vol)	vol.115
Issue number (no)	EA-521, SIP-522, SP-523
Page range	pp.111-116 (EA), pp.111-116 (SIP), pp.111-116 (SP)
Number of pages	6
Issue date	2016-03-21 (EA, SIP, SP)