Presentation | 2015-06-18 Phone Labeling Based on Gaussian Mixture Model for Dysarthric Speech Recognition Yuki Takashima, Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In this paper, we investigate speech recognition for a person with an articulation disorder resulting from athetoid cerebral palsy. In our previous work, we proposed a feature extraction method using a convolutional neural network and showed its effectiveness. Training the network by back-propagation requires a teaching signal, and the previous method obtained it by HMM-based forced alignment of the speech data. However, because dysarthric speech fluctuates from utterance to utterance, it is difficult to obtain a correct alignment, and the network is considered to be inadequately trained as a result of the wrong alignment. Moreover, phone boundaries in dysarthric speech are ambiguous, which also makes it difficult to provide a correct alignment. Therefore, we propose a phone labeling method using the Gaussian distribution. We report experimental results of speech recognition using networks trained on the phone alignments computed by the proposed method. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | articulation disorders / feature extraction / convolutional neural network / bottleneck feature / phoneme labeling |
Paper # | PRMU2015-44,SP2015-13,WIT2015-13 |
Date of Issue | 2015-06-11 (PRMU, SP, WIT) |
Conference Information | |
Committee | WIT / SP / ASJ-H / PRMU |
---|---|
Conference Date | 2015/6/18(2days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Kiyohiko Nunokawa(Tokyo International Univ.) / Kazunori Mano(Shibaura Inst. of Tech.) / Masato Akagi(JAIST) / Eisaku Maeda(NTT) |
Vice Chair | Chikamune Wada(Kyushu Inst. of Tech.) / Norihide Kitaoka(Tokushima Univ.) / Shigeto Furukawa(NTT) / Shuji Senda(NEC) / Seiichi Uchida(Kyushu Univ.) |
Secretary | Chikamune Wada(Nagoya Inst. of Tech.) / Norihide Kitaoka(AIST) / Shigeto Furukawa(Tsukuba Univ. of Tech.) / Shuji Senda(Tokyo City Univ.) / Seiichi Uchida(Kobe Univ.) |
Assistant | Tomohiro Amemiya(NTT) / Takeaki Shionome(Tsukuba Univ. of Tech.) / Manabi Miyagi(Tsukuba Univ. of Tech.) / Takashi Nose(Tohoku Univ.) / Taichi Asami(NTT) / / Kazuaki Kondo(Kyoto Univ.) / Akisato Kimura(NTT) |
Paper Information | |
Registration To | Technical Committee on Well-being Information Technology / Technical Committee on Speech / * / Technical Committee on Pattern Recognition and Media Understanding |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Phone Labeling Based on Gaussian Mixture Model for Dysarthric Speech Recognition |
Sub Title (in English) | |
Keyword(1) | articulation disorders |
Keyword(2) | feature extraction |
Keyword(3) | convolutional neural network |
Keyword(4) | bottleneck feature |
Keyword(5) | phoneme labeling |
1st Author's Name | Yuki Takashima |
1st Author's Affiliation | Kobe University(Kobe Univ.) |
2nd Author's Name | Toru Nakashika |
2nd Author's Affiliation | The University of Electro-Communications(UEC) |
3rd Author's Name | Tetsuya Takiguchi |
3rd Author's Affiliation | Kobe University(Kobe Univ.) |
4th Author's Name | Yasuo Ariki |
4th Author's Affiliation | Kobe University(Kobe Univ.) |
Date | 2015-06-18 |
Paper # | PRMU2015-44,SP2015-13,WIT2015-13 |
Volume (vol) | vol.115 |
Number (no) | PRMU-98,SP-99,WIT-100 |
Page | pp.71-76(PRMU), pp.71-76(SP), pp.71-76(WIT) |
#Pages | 6 |
Date of Issue | 2015-06-11 (PRMU, SP, WIT) |