Presentation 2002/12/12
FINITE-STATE TRANSDUCER BASED PHONOLOGY AND MORPHOLOGY MODELING WITH APPLICATIONS TO HUNGARIAN LVCSR
Mate Szarvas, Sadaoki Furui
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This article introduces a novel approach to modeling phonology and morphosyntax in morpheme-unit-based speech recognizers. The proposed method is evaluated in our recent Hungarian large vocabulary continuous speech recognition (LVCSR) system. The architecture of the recognition system is based on the weighted finite-state transducer (WFST) paradigm. The task domain is the recognition of fluently read sentences selected from a major daily newspaper. The vocabulary units used in the system are morpheme-based in order to provide sufficient coverage of the large number of word forms resulting from affixation and compounding. Besides the basic pronunciation model and the morpheme N-gram language model, we evaluate a novel phonology model and a novel stochastic morphosyntactic language model (SMLM). Thanks to the flexible transducer-based architecture of the system, these new components are integrated seamlessly with the basic modules, with no need to modify the decoder itself. The proposed phonology model reduced the error rate by 8.32%, and the stochastic morphosyntactic language model decreased the error rate by 17.9%, in relative terms compared to the baseline systems. The morpheme error rate of the best configuration is 14.75% in a 1350 morpheme Hungarian dictation task.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) phonology modeling / language modeling / morphology modeling / finite state transducer / speech recognition / Hungarian
Paper # SP2002-144
Date of Issue

Conference Information
Committee SP
Conference Date 2002/12/12 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) FINITE-STATE TRANSDUCER BASED PHONOLOGY AND MORPHOLOGY MODELING WITH APPLICATIONS TO HUNGARIAN LVCSR
Sub Title (in English)
Keyword(1) phonology modeling
Keyword(2) language modeling
Keyword(3) morphology modeling
Keyword(4) finite state transducer
Keyword(5) speech recognition
Keyword(6) Hungarian
1st Author's Name Mate Szarvas
1st Author's Affiliation Department of Computer Science Tokyo Institute of Technology
2nd Author's Name Sadaoki Furui
2nd Author's Affiliation Department of Computer Science Tokyo Institute of Technology
Date 2002/12/12
Paper # SP2002-144
Volume (vol) vol.102
Number (no) 529
Page pp.-
#Pages 6
Date of Issue