Presentation | 2020-06-04 An experimental comparison of CNN- and CRNN-CTC for automatic phrase speech recognition systems using a children's speech database. Yunzhe Wang, Yu Tian, Yoshikazu Miyanaga, Hiroshi Tsutsui |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Children's speech recognition remains a challenging problem. For children's speech, the accuracy of conventional phrase speech recognition approaches is significantly low. This is mainly owing to the high variability of pronunciation patterns due to children's physical activity. Motivated by this, in this paper, we present a phrase speech recognition system using neural networks. We use a convolutional neural network (CNN) and its recurrent variant, the convolutional recurrent neural network (CRNN). Both approaches use a connectionist temporal classification (CTC) loss function, which allows the networks to be trained without any prior alignment. Through experiments using a children's speech database, we compare the results of the CNN-CTC and CRNN-CTC approaches. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Children's speech recognition / convolutional recurrent neural network (CRNN) / connectionist temporal classification (CTC) |
Paper # | SIS2020-9 |
Date of Issue | 2020-05-27 (SIS) |

Conference Information

Committee | SIS / IPSJ-AVM / ITE-3DIT |
---|---|
Conference Date | 2020/6/3(2days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Online |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Intelligent Multimedia Systems, Applied Embedded Systems, Three-Dimensional Image Technology (3DIT), etc. |
Chair | Noriaki Suetake(Yamaguchi Univ.) / Sei Naito(KDDI Research, Inc.) / Shiro Suyama(Tokushima Univ.) |
Vice Chair | Tomoaki Kimura(Kanagawa Inst. of Tech.) / Naoto Sasaoka(Tottori Univ.) |
Secretary | Tomoaki Kimura(Kindai Univ.) / Naoto Sasaoka(National Inst. of Tech., Ube College) / (NTT) / (Waseda Univ.) |
Assistant | Yukihiro Bandoh(NTT) / Soh Yoshida(Kansai Univ.) |

Paper Information

Registration To | Technical Committee on Smart Info-Media Systems / Special Interest Group on Audio Visual and Multimedia Information Processing / Technical Group on Three-Dimensional Image Technology |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | An experimental comparison of CNN- and CRNN-CTC for automatic phrase speech recognition systems using a children's speech database |
Sub Title (in English) | |
Keyword(1) | Children's speech recognition / convolutional recurrent neural network (CRNN) / connectionist temporal classification (CTC) |
1st Author's Name | Yunzhe Wang |
1st Author's Affiliation | Hokkaido University(Hokkaido Univ.) |
2nd Author's Name | Yu Tian |
2nd Author's Affiliation | Hokkaido University(Hokkaido Univ.) |
3rd Author's Name | Yoshikazu Miyanaga |
3rd Author's Affiliation | Chitose Institute of Science and Technology(CIST) |
4th Author's Name | Hiroshi Tsutsui |
4th Author's Affiliation | Hokkaido University(Hokkaido Univ.) |
Date | 2020-06-04 |
Paper # | SIS2020-9 |
Volume (vol) | vol.120 |
Number (no) | SIS-51 |
Page | pp.49-54(SIS) |
#Pages | 6 |
Date of Issue | 2020-05-27 (SIS) |