Presentation 2006/12/15
Speech Recognition with Out-of-Vocabulary Word Processing Using a Variable-Length Sub-Word HMM
Shinichi HOMMA, Akio KOBAYASHI, Kazuo ONOE, Shoei SATO, Toru IMAI, Tohru TAKAGI
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Conventional large-vocabulary continuous speech recognition (LVCSR) cannot recognize out-of-vocabulary (OOV) words, because it can output only words registered in its vocabulary. In this paper, we propose a novel approach that recognizes any OOV word as a Kana character string formed by connecting variable-length sub-words. We estimate the output probabilities of the sub-word patterns by maximum likelihood estimation, using a general HMM that emits one unit symbol at a time. To reduce the number of sub-words, we select sub-words based on the minimum description length (MDL) criterion and re-estimate their output probabilities. During speech recognition, the HMM for OOV words is used together with a language model built from in-vocabulary words, and it outputs Kana character strings for the input speech segments that contain OOV words. In a recognition experiment on a broadcast documentary program about nature, the word error rate on evaluation data in which every sentence contained OOV words was reduced from 26.7% to 18.4%.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) LVCSR / Language Model / OOV / HMM / MDL Criterion
Paper # NLC2006-67,SP2006-123
Date of Issue

Conference Information
Committee NLC
Conference Date 2006/12/15 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Speech Recognition with Out-of-Vocabulary Word Processing Using a Variable-Length Sub-Word HMM
Sub Title (in English)
Keyword(1) LVCSR
Keyword(2) Language Model
Keyword(3) OOV
Keyword(4) HMM
Keyword(5) MDL Criterion
1st Author's Name Shinichi HOMMA
1st Author's Affiliation NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories
2nd Author's Name Akio KOBAYASHI
2nd Author's Affiliation NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories
3rd Author's Name Kazuo ONOE
3rd Author's Affiliation NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories
4th Author's Name Shoei SATO
4th Author's Affiliation NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories
5th Author's Name Toru IMAI
5th Author's Affiliation NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories
6th Author's Name Tohru TAKAGI
6th Author's Affiliation NHK (Japan Broadcasting Corporation) Science and Technical Research Laboratories
Date 2006/12/15
Paper # NLC2006-67,SP2006-123
Volume (vol) vol.106
Number (no) 442
Page pp.-
#Pages 6
Date of Issue