Normalizing the Acoustic Qualities of Monophones in an Utterance

Muhammad GHULAM; Takashi FUKUDA (福田 隆); Tsuneo NITTA (新田 恒雄)

Presentation	2002/12/12 Normalizing the Acoustic Qualities of Monophones in an Utterance. Muhammad GHULAM, Takashi FUKUDA, Tsuneo NITTA
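Abstract (translated from Japanese)	In this report, we apply our previously proposed HMM-SM speech recognition method to connected digit speech and compare the effect of normalizing the acoustic quality of the monophones within an utterance against several HMM-based methods: utterance-level normalization, word-level normalization, monophone-level normalization, and no normalization. In the proposed HMM-SM method, an HMM classifier first produces the N-best word candidates together with all phoneme boundaries within each candidate word. An SM matcher then computes the similarity between the feature-vector sequence that makes up each monophone and the sets of eigenvectors of all monophones. Finally, a similarity-to-likelihood conversion (the normalization step) that reflects differences in the acoustic quality of the monophones across the utterance is applied, and the result is combined with the likelihood from the HMM classifier to decide the content of the utterance. In experiments on connected digit speech, compared with a standard HMM method (utterance-level normalization), the proposed method substantially improved the word correct rate from 96.3% to 98.7% and the word accuracy from 95.7% to 98.2%. It also substantially outperformed HMM methods with various other improved normalizations.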
Abstract (English)	In this paper, we extend our previously proposed HMM-SM-based speech recognition system to a connected digit recognition task. We examine the effect of normalizing the acoustic qualities of the monophones in an utterance and compare the system with several HMM-based systems that use utterance-level normalization, word-level normalization, monophone-level normalization, or no normalization. In the proposed HMM-SM-based system, an HMM-based classifier produces the N-best hypotheses (word candidates), and an SM (Subspace Method)-based verifier then tests the hypotheses after applying monophone score normalization. In experiments on a connected digit recognition task, the proposed method significantly improved the word correct rate from 96.3% to 98.7% and the word accuracy from 95.7% to 98.2%, compared with the conventional HMM-based classifier with utterance-level normalization. The proposed method also outperformed the other HMM-based systems included in the comparison.
Keywords	Speech Recognition / HMM / Normalization of Acoustic Quality / Subspace Method
Report number	SP2002-122
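
The pipeline outlined in the abstract (N-best hypotheses from an HMM classifier, a subspace-method similarity per monophone segment, a similarity-to-likelihood conversion, and combination with the HMM likelihood) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The similarity is the standard subspace-method measure (squared projection onto an orthonormal basis, divided by the squared norm of the frame). The conversion used here, a log ratio against the best-scoring monophone for the same segment, is only an assumed stand-in, because the abstract does not give the actual formula. The names `subspace_similarity`, `normalized_log_score`, `rescore`, the `weight` parameter, and the toy data are all hypothetical.

```python
import math

EPS = 1e-12  # floor to keep log() and division well defined


def subspace_similarity(frames, basis):
    """Mean subspace-method similarity of a segment, in [0, 1].

    frames: list of non-zero feature vectors for one monophone segment.
    basis:  orthonormal basis vectors (e.g. leading eigenvectors) of one
            monophone's subspace.
    """
    total = 0.0
    for x in frames:
        norm_sq = sum(v * v for v in x)
        proj_sq = sum(sum(a * b for a, b in zip(x, phi)) ** 2 for phi in basis)
        total += proj_sq / max(norm_sq, EPS)
    return total / len(frames)


def normalized_log_score(frames, label, subspaces):
    """Assumed similarity-to-likelihood conversion (not from the paper):
    log ratio of the hypothesized monophone's similarity to the best
    similarity any monophone achieves on this segment."""
    sims = {m: subspace_similarity(frames, b) for m, b in subspaces.items()}
    best = max(sims.values())
    return math.log(max(sims[label], EPS) / max(best, EPS))


def rescore(hypotheses, subspaces, weight=1.0):
    """Pick the hypothesis with the best HMM log-likelihood plus weighted
    sum of normalized subspace scores over its monophone segments."""
    def combined(h):
        sm = sum(normalized_log_score(f, m, subspaces) for m, f in h["segments"])
        return h["hmm_loglik"] + weight * sm
    return max(hypotheses, key=combined)


# Toy data: two 1-dimensional subspaces in a 2-dimensional feature space.
subspaces = {"a": [[1.0, 0.0]], "b": [[0.0, 1.0]]}
segment = [[0.9, 0.1]]  # one frame, lying close to the subspace of "a"
hypotheses = [
    {"text": "a", "hmm_loglik": -10.0, "segments": [("a", segment)]},
    {"text": "b", "hmm_loglik": -9.5, "segments": [("b", segment)]},
]

best = rescore(hypotheses, subspaces)
print(best["text"])
```

In this toy case the HMM score alone slightly prefers the second hypothesis, while the normalized subspace score penalizes it because the frame lies far from the subspace of "b", so the combined score selects the first. In a real system the bases would come from an eigen-decomposition of each monophone's training data, and the weight would be tuned on held-out speech.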

Technical committee information
Technical committee	SP
Dates	2002/12/12 (1 day)

Paper details
Submitted to	Speech (SP)
Language of text	ENG
Title (Japanese)	発話中における単音の音響的品質正規化の検討
Title (English)	Normalizing the Acoustic Qualities of Monophones in an Utterance
Keyword (1)	Speech Recognition
Keyword (2)	HMM
Keyword (3)	Normalization of Acoustic Quality
Keyword (4)	Subspace Method
First author	Muhammad GHULAM
First author affiliation	Graduate School of Engineering, Toyohashi University of Technology
Second author	福田隆 / Takashi FUKUDA
Second author affiliation	Graduate School of Engineering, Toyohashi University of Technology
Third author	新田恒雄 / Tsuneo NITTA
Third author affiliation	Graduate School of Engineering, Toyohashi University of Technology
Presentation date	2002/12/12
Report number	SP2002-122
Volume	vol.102
Issue	529
Page range	pp.-
Number of pages	6