Presentation title | 2016-03-28 [Poster Presentation] An evaluation of acoustic-to-articulatory inversion mapping with latent trajectory Gaussian mixture model Patrick Lumban Tobing (NAIST), Tomoki Toda (Nagoya Univ./NAIST), Hirokazu Kameoka (NTT), Satoshi Nakamura (NAIST) |
---|---|
Abstract (English) | In this report, we present an evaluation of acoustic-to-articulatory inversion mapping based on the latent trajectory Gaussian mixture model (LTGMM). In a conventional GMM-based inversion mapping system, the GMM parameters are optimized by maximizing the likelihood of the joint static and dynamic features of acoustic-articulatory data. In the mapping process, given the acoustic data, smoothly varying articulatory parameter trajectories are estimated by maximizing the conditional likelihood of their static features only, where the inter-frame correlation is taken into account by imposing the explicit relationship between static and dynamic features. Because the training and mapping criteria differ from each other, the trained GMM is not optimal for the mapping process. A trajectory training method has been proposed to address this inconsistency problem [1]. However, this method has difficulty optimizing some parameters, such as the covariance matrices and the mixture component sequence. In this report, as another way to address the inconsistency problem, we propose an inversion mapping method based on the latent trajectory GMM, inspired by the latent trajectory hidden Markov model [2]. The proposed method makes it possible to apply the EM algorithm to model parameter optimization, which is difficult in the conventional trajectory training method. The experimental results demonstrate that the proposed LTGMM method outperforms the conventional GMM on the acoustic-to-articulatory inversion mapping task, with lower root-mean-square error and higher correlation coefficients. |
Keywords (Japanese) | acoustic-to-articulatory inversion mapping / Gaussian mixture model / trajectory training / inter-frame correlation / EM algorithm |
Keywords (English) | acoustic-to-articulatory inversion mapping / Gaussian mixture model / trajectory training / inter-frame correlation / EM algorithm |
Report number | EA2015-85, SIP2015-134, SP2015-113 |
Publication date | 2016-03-21 (EA, SIP, SP) |
Technical committee information | |
Technical committee | EA / SP / SIP |
---|---|
Dates | 2016/3/28 (held for 2 days from this date) |
Venue (Japanese) | 別府国際コンベンションセンター B-ConPlaza |
Venue (English) | Beppu International Convention Center B-ConPlaza |
Theme (Japanese) | 応用/電気音響,音声,信号処理,一般 |
Theme (English) | Engineering/Electro Acoustics, Speech, Signal Processing, and Related Topics |
Chair (Japanese) | 羽田 陽一(電通大) / 間野 一則(芝浦工大) / 宝珠山 治(NEC) |
Chair (English) | Yoichi Haneda (Univ. of Electro-Comm.) / Kazunori Mano (Shibaura Inst. of Tech.) / Osamu Houshuyama (NEC) |
Vice Chair (Japanese) | 岩谷 幸雄(東北学院大) / 水町 光徳(九工大) / 北岡 教英(徳島大) / 中静 真(千葉工大) / 奥田 正浩(北九州市大) |
Vice Chair (English) | Yukio Iwaya (Tohoku Gakuin Univ.) / Mitsunori Mizumachi (Kyushu Inst. of Tech.) / Norihide Kitaoka (Tokushima Univ.) / Makoto Nakashizuka (Chiba Inst. of Tech.) / Masahiro Okuda (Univ. of Kitakyushu) |
Secretary (Japanese) | 島内 末廣(NTT) / 堀内 俊治(KDDI研) / 岩野 公司(東京都市大) / 滝口 哲也(神戸大) / 辻川 剛範(NEC) / 平林 晃(立命館大) |
Secretary (English) | Suehiro Shimauchi (NTT) / Toshiharu Horiuchi (KDDI R&D Labs.) / Koji Iwano (Tokyo City Univ.) / Tetsuya Takiguchi (Kobe Univ.) / Masanori Tsujikawa (NEC) / Akira Hirabayashi (Ritsumeikan Univ.) |
Assistant Secretary (Japanese) | 小山 翔一(東大) / 能勢 隆(東北大) / 浅見 太一(NTT) / 宮田 高道(千葉工大) |
Assistant Secretary (English) | Shoichi Koyama (Univ. of Tokyo) / Takashi Nose (Tohoku Univ.) / Taichi Asami (NTT) / Takamichi Miyata (Chiba Inst. of Tech.) |
Paper information details | |
Committee applied to | Technical Committee on Engineering Acoustics / Technical Committee on Speech / Technical Committee on Signal Processing |
---|---|
Paper language | ENG |
Title (Japanese) | |
Subtitle (Japanese) | |
Title (English) | [Poster Presentation] An evaluation of acoustic-to-articulatory inversion mapping with latent trajectory Gaussian mixture model |
Subtitle (English) | |
Keyword (1) (Japanese/English) | acoustic-to-articulatory inversion mapping / acoustic-to-articulatory inversion mapping |
Keyword (2) (Japanese/English) | Gaussian mixture model / Gaussian mixture model |
Keyword (3) (Japanese/English) | trajectory training / trajectory training |
Keyword (4) (Japanese/English) | inter-frame correlation / inter-frame correlation |
Keyword (5) (Japanese/English) | EM algorithm / EM algorithm |
1st author name (Japanese/English) | Patrick Lumban Tobing / Patrick Lumban Tobing |
1st author affiliation (Japanese/English) | Nara Institute of Science and Technology (abbrev.: 奈良先端大) / Nara Institute of Science and Technology (abbrev.: NAIST) |
2nd author name (Japanese/English) | Tomoki Toda / Tomoki Toda |
2nd author affiliation (Japanese/English) | Nagoya University/Nara Institute of Science and Technology (abbrev.: 名大/奈良先端大) / Nagoya University/Nara Institute of Science and Technology (abbrev.: Nagoya Univ./NAIST) |
3rd author name (Japanese/English) | Hirokazu Kameoka / Hirokazu Kameoka |
3rd author affiliation (Japanese/English) | Nippon Telegraph and Telephone Corporation (abbrev.: NTT) / Nippon Telegraph and Telephone Corporation (abbrev.: NTT) |
4th author name (Japanese/English) | Satoshi Nakamura / Satoshi Nakamura |
4th author affiliation (Japanese/English) | Nara Institute of Science and Technology (abbrev.: 奈良先端大) / Nara Institute of Science and Technology (abbrev.: NAIST) |
Presentation date | 2016-03-28 |
Report number | EA2015-85, SIP2015-134, SP2015-113 |
Volume (vol) | vol.115 |
Issue (no) | EA-521, SIP-522, SP-523 |
Page range | pp.111-116 (EA), pp.111-116 (SIP), pp.111-116 (SP) |
Number of pages | 6 |
Publication date | 2016-03-21 (EA, SIP, SP) |