Presentation | 2004/12/15 Efficient Generation of high-order context-dependent Weighted Finite State Transducers for Speech Recognition Mike SCHUSTER, Takaaki HORI |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper describes an algorithm for efficiently building Weighted Finite State Transducers for speech recognition when high-order context-dependent models of order K > 3 (i.e., higher than triphones) with tied states are used. After discussing some inefficiencies of the standard compilation method, which make the use of high-order context-dependent models cumbersome and sometimes even impossible because of memory constraints, we show how an algorithm that builds part of the needed composed transducers directly from the decision trees, combined with an improved compilation process, can lead to much faster, simpler and more memory-efficient compilation. In our case it also resulted in substantially smaller final networks. With the described algorithm it is simple to use high-order full cross-word models with little overhead directly within a one-pass time-synchronous search, which we test by comparing resulting final network sizes, recognition rates and speed on a large, spontaneous Japanese speech database. Using the proposed algorithm it is possible to do real-time recognition using full cross-word quinphones with a large acoustic model in about 125 MB of memory at about 9% search error. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Speech recognition / search / weighted finite state transducers |
Paper # | NLC2004-83,SP2004-123 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2004/12/15 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Efficient Generation of high-order context-dependent Weighted Finite State Transducers for Speech Recognition |
Sub Title (in English) | |
Keyword(1) | Speech recognition |
Keyword(2) | search |
Keyword(3) | weighted finite state transducers |
1st Author's Name | Mike SCHUSTER |
1st Author's Affiliation | Nippon Telegraph and Telephone Corporation, NTT Communication Science Laboratories |
2nd Author's Name | Takaaki HORI |
2nd Author's Affiliation | Nippon Telegraph and Telephone Corporation, NTT Communication Science Laboratories |
Date | 2004/12/15 |
Paper # | NLC2004-83,SP2004-123 |
Volume (vol) | vol.104 |
Number (no) | 540 |
Page | pp.- |
#Pages | 5 |
Date of Issue |