Presentation 2015-06-18
Phone Labeling Based on Gaussian Mixture Model for Dysarthric Speech Recognition
Yuki Takashima, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper investigates speech recognition for a person with an articulation disorder resulting from athetoid cerebral palsy. In our previous work, we proposed a feature extraction method using a convolutional neural network and showed its effectiveness. Training the network with back-propagation requires a teaching signal, and the previous method obtained it by HMM-based forced alignment of the speech data. However, because dysarthric speech fluctuates from utterance to utterance and its phone boundaries are ambiguous, it is difficult to obtain a correct alignment, and the network may not be adequately trained when the alignment is wrong. Therefore, we propose a phone labeling method using the Gaussian distribution. We report experimental results of speech recognition using networks trained on the phone alignments calculated by the proposed method.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) articulation disorders / feature extraction / convolutional neural network / bottleneck feature / phoneme labeling
Paper # PRMU2015-44,SP2015-13,WIT2015-13
Date of Issue 2015-06-11 (PRMU, SP, WIT)

Conference Information
Committee WIT / SP / ASJ-H / PRMU
Conference Date 2015/6/18(2days)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Kiyohiko Nunokawa(Tokyo International Univ.) / Kazunori Mano(Shibaura Inst. of Tech.) / Masato Akagi(JAIST) / Eisaku Maeda(NTT)
Vice Chair Chikamune Wada(Kyushu Inst. of Tech.) / Norihide Kitaoka(Tokushima Univ.) / Shigeto Furukawa(NTT) / Shuji Senda(NEC) / Seiichi Uchida(Kyushu Univ.)
Secretary Chikamune Wada(Nagoya Inst. of Tech.) / Norihide Kitaoka(AIST) / Shigeto Furukawa(Tsukuba Univ. of Tech.) / Shuji Senda(Tokyo City Univ.) / Seiichi Uchida(Kobe Univ.)
Assistant Tomohiro Amemiya(NTT) / Takeaki Shionome(Tsukuba Univ. of Tech.) / Manabi Miyagi(Tsukuba Univ. of Tech.) / Takashi Nose(Tohoku Univ.) / Taichi Asami(NTT) / / Kazuaki Kondo(Kyoto Univ.) / Akisato Kimura(NTT)

Paper Information
Registration To Technical Committee on Well-being Information Technology / Technical Committee on Speech / * / Technical Committee on Pattern Recognition and Media Understanding
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Phone Labeling Based on Gaussian Mixture Model for Dysarthric Speech Recognition
Sub Title (in English)
Keyword(1) articulation disorders
Keyword(2) feature extraction
Keyword(3) convolutional neural network
Keyword(4) bottleneck feature
Keyword(5) phoneme labeling
1st Author's Name Yuki Takashima
1st Author's Affiliation Kobe University(Kobe Univ.)
2nd Author's Name Toru Nakashika
2nd Author's Affiliation The University of Electro-Communications(UEC)
3rd Author's Name Tetsuya Takiguchi
3rd Author's Affiliation Kobe University(Kobe Univ.)
4th Author's Name Yasuo Ariki
4th Author's Affiliation Kobe University(Kobe Univ.)
Date 2015-06-18
Paper # PRMU2015-44,SP2015-13,WIT2015-13
Volume (vol) vol.115
Number (no) PRMU-98,SP-99,WIT-100
Page pp.71-76(PRMU), pp.71-76(SP), pp.71-76(WIT)
#Pages 6
Date of Issue 2015-06-11 (PRMU, SP, WIT)