Presentation | 2022-12-01 A Japanese Automatic Speech Recognition System on the Next-Gen Kaldi Framework Wen Shen Teo, Yasuhiro Minami |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | 2021 saw the introduction of Next-Gen Kaldi, the cutting-edge successor to the Kaldi speech processing toolkit. Using the Next-Gen Kaldi family of modules, we built a streaming RNN-Transducer Japanese ASR system trained on the Corpus of Spontaneous Japanese (CSJ). Our E2E model shows a definite Character Error Rate (CER) improvement over Kaldi, but still falls short of state-of-the-art benchmarks from other frameworks that are enhanced by external language models trained on large amounts of language data. In this paper, we first explain our experimental setup and present our results. Then, in the pursuit of an end-to-end ASR system, we raise several points of discussion where the performance of our ASR model suffered. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Next-Gen Kaldi / Corpus of Spontaneous Japanese / RNN-Transducer / Automatic Speech Recognition |
Paper # | NLC2022-16,SP2022-36 |
Date of Issue | 2022-11-22 (NLC, SP) |
Conference Information | |
Committee | NLC / IPSJ-NL / SP / IPSJ-SLP |
---|---|
Conference Date | 2022/11/29(3days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Mitsuo Yoshida(Univ. of Tsukuba) / Katsuhito Sudoh(Nara Institute of Science and Technology) / Tomoki Toda(Nagoya Univ.) / Tomoki Toda(Nagoya Univ.) |
Vice Chair | Hiroki Sakaji(Univ. of Tokyo) / Takeshi Kobayakawa(NHK) |
Secretary | Hiroki Sakaji(NTT) / Takeshi Kobayakawa(Hiroshima Univ. of Economics) / (Denso IT Laboratory, Inc.) / (Hokkai-Gakuen Univ.) / (Tokyo Univ. of Agriculture and Technology) |
Assistant | Kanjin Takahashi(Sansan) / Yasuhiro Ogawa(Nagoya Univ.) / / Ryo Aihara(Mitsubishi Electric) / Daisuke Saito(Univ. of Tokyo) |
Paper Information | |
Registration To | Technical Committee on Natural Language Understanding and Models of Communication / Special Interest Group on Natural Language / Technical Committee on Speech / Special Interest Group on Spoken Language Processing |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A Japanese Automatic Speech Recognition System on the Next-Gen Kaldi Framework |
Sub Title (in English) | |
Keyword(1) | Next-Gen Kaldi |
Keyword(2) | Corpus of Spontaneous Japanese |
Keyword(3) | RNN-Transducer |
Keyword(4) | Automatic Speech Recognition |
1st Author's Name | Wen Shen Teo |
1st Author's Affiliation | The University of Electro-Communications(UEC) |
2nd Author's Name | Yasuhiro Minami |
2nd Author's Affiliation | The University of Electro-Communications(UEC) |
Date | 2022-12-01 |
Paper # | NLC2022-16,SP2022-36 |
Volume (vol) | vol.122 |
Number (no) | NLC-287,SP-288 |
Page | pp.39-44(NLC), pp.39-44(SP) |
#Pages | 6 |
Date of Issue | 2022-11-22 (NLC, SP) |