Presentation 2020-06-04
An experimental comparison of CNN- and CRNN-CTC for automatic phrase speech recognition systems using a children's speech database
Yunzhe Wang, Yu Tian, Yoshikazu Miyanaga, Hiroshi Tsutsui
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Children's speech recognition remains a challenging problem. For children's speech, the accuracy of conventional phrase speech recognition approaches is significantly low, mainly owing to the high variability of pronunciation patterns due to children's physical activity. Motivated by this, we present a phrase speech recognition system using neural networks. We use a convolutional neural network (CNN) and its recurrent neural network (RNN) variant, the convolutional recurrent neural network (CRNN). Both approaches use a connectionist temporal classification (CTC) loss function, which allows the networks to be trained without any prior alignment. Through experiments using a children's speech database, we compare the CNN-CTC and CRNN-CTC approaches.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Children's speech recognition / convolutional recurrent neural network (CRNN) / connectionist temporal classification (CTC)
Paper # SIS2020-9
Date of Issue 2020-05-27 (SIS)

Conference Information
Committee SIS / IPSJ-AVM / ITE-3DIT
Conference Date 2020/6/3(2days)
Place (in Japanese) (See Japanese page)
Place (in English) Online
Topics (in Japanese) (See Japanese page)
Topics (in English) Intelligent Multimedia Systems, Applied Embedded Systems, Three-Dimensional Image Technology (3DIT), etc.
Chair Noriaki Suetake(Yamaguchi Univ.) / Sei Naito(KDDI Research, Inc.) / Shiro Suyama(Tokushima Univ.)
Vice Chair Tomoaki Kimura(Kanagawa Inst. of Tech.) / Naoto Sasaoka(Tottori Univ.)
Secretary Tomoaki Kimura(Kindai Univ.) / Naoto Sasaoka(National Inst. of Tech., Ube College) / (NTT) / (Waseda Univ.)
Assistant Yukihiro Bandoh(NTT) / Soh Yoshida(Kansai Univ.)

Paper Information
Registration To Technical Committee on Smart Info-Media Systems / Special Interest Group on Audio Visual and Multimedia Information Processing / Technical Group on Three-Dimensional Image Technology
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) An experimental comparison of CNN- and CRNN-CTC for automatic phrase speech recognition systems using a children's speech database
Sub Title (in English)
Keyword(1) Children's speech recognition / convolutional recurrent neural network (CRNN) / connectionist temporal classification (CTC)
1st Author's Name Yunzhe Wang
1st Author's Affiliation Hokkaido University(Hokkaido Univ.)
2nd Author's Name Yu Tian
2nd Author's Affiliation Hokkaido University(Hokkaido Univ.)
3rd Author's Name Yoshikazu Miyanaga
3rd Author's Affiliation Chitose Institute of Science and Technology(CIST)
4th Author's Name Hiroshi Tsutsui
4th Author's Affiliation Hokkaido University(Hokkaido Univ.)
Date 2020-06-04
Paper # SIS2020-9
Volume (vol) vol.120
Number (no) SIS-51
Page pp.49-54(SIS)
#Pages 6
Date of Issue 2020-05-27 (SIS)