Presentation | 2006/12/15 Speech Recognition with Out-of-Vocabulary Word Processing Using a Variable-Length Sub-Word HMM Shinichi HOMMA, Akio KOBAYASHI, Kazuo ONOE, Shoei SATO, Toru IMAI, Tohru TAKAGI |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Conventional LVCSR cannot recognize Out-Of-Vocabulary (OOV) words because recognition is limited to the registered vocabulary. In this paper, we propose a novel approach that recognizes any OOV word as a Kana character string formed by connecting variable-length sub-words. We estimate the output probabilities of the sub-word patterns by maximum likelihood estimation, using a general HMM that emits one unit symbol at a time. To reduce the number of sub-words, we select them based on the MDL criterion and re-estimate their output probabilities. During speech recognition, the HMM for OOV words is used together with a language model built from the vocabulary words, and it outputs Kana character strings for input speech segments that include OOV words. In a recognition experiment on a broadcast documentary program about nature, the word error rate on evaluation data in which every sentence included OOV words was reduced from 26.7% to 18.4%. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | LVCSR / Language Model / OOV / HMM / MDL Criterion |
Paper # | NLC2006-67,SP2006-123 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2006/12/15 (1 day)
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Speech Recognition with Out-of-Vocabulary Word Processing Using a Variable-Length Sub-Word HMM |
Sub Title (in English) | |
Keyword(1) | LVCSR |
Keyword(2) | Language Model |
Keyword(3) | OOV |
Keyword(4) | HMM |
Keyword(5) | MDL Criterion |
1st Author's Name | Shinichi HOMMA |
1st Author's Affiliation | NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories
2nd Author's Name | Akio KOBAYASHI |
2nd Author's Affiliation | NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories |
3rd Author's Name | Kazuo ONOE |
3rd Author's Affiliation | NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories |
4th Author's Name | Shoei SATO |
4th Author's Affiliation | NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories |
5th Author's Name | Toru IMAI |
5th Author's Affiliation | NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories |
6th Author's Name | Tohru TAKAGI |
6th Author's Affiliation | NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories |
Date | 2006/12/15 |
Paper # | NLC2006-67,SP2006-123 |
Volume (vol) | vol.106 |
Number (no) | 442 |
Page | pp.-
#Pages | 6 |
Date of Issue |