Presentation | 2009-12-21 Recent Evaluations of a WFST-Based Speech Recognition Decoder Paul R. DIXON, Josef R. NOVAK, Tasuku OONISHI, Sadaoki FURUI |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper describes the latest performance evaluations of the Tokyo Tech Transducer-based (T^3) speech decoder. The evaluations focus on two tasks: a large-vocabulary continuous speech transcription system with a 460k-word vocabulary evaluated on the JNAS corpus, and a voice search system developed for an all-Japan train timetable task. The paper gives a detailed explanation of the steps taken to construct a large integrated network that achieves high recognition performance, based on an exhaustive comparison of different construction strategies. Furthermore, in the context of the voice search task, the paper compares the performance of two widely used acoustic model toolkits, HTK and SphinxTrain, in the unified context of the T^3 decoder. In particular, these results indicate that there is a significant advantage to employing the log semiring for all WFST construction operations. The results also further verify the flexibility and speed of the T^3 decoder on a variety of tasks. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Speech Recognition / WFST / LVCSR |
Paper # | NLC2009-14,SP2009-78 |
Date of Issue | |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2009/12/14 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant | |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Recent Evaluations of a WFST-Based Speech Recognition Decoder |
Sub Title (in English) | |
Keyword(1) | Speech Recognition |
Keyword(2) | WFST |
Keyword(3) | LVCSR |
1st Author's Name | Paul R. DIXON |
1st Author's Affiliation | Tokyo Institute of Technology, Department of Computer Science |
2nd Author's Name | Josef R. NOVAK |
2nd Author's Affiliation | Tokyo Institute of Technology, Department of Computer Science |
3rd Author's Name | Tasuku OONISHI |
3rd Author's Affiliation | Tokyo Institute of Technology, Department of Computer Science |
4th Author's Name | Sadaoki FURUI |
4th Author's Affiliation | Tokyo Institute of Technology, Department of Computer Science |
Date | 2009-12-21 |
Paper # | NLC2009-14,SP2009-78 |
Volume (vol) | vol.109 |
Number (no) | 355 |
Page | pp.- |
#Pages | 6 |
Date of Issue | |