Speaker normalized spectral subband parameters for noise robust speech recognition

Satoru Tsuge; Toshiaki Fukada; Harald Singer

Presentation	1998/12/10 Speaker normalized spectral subband parameters for noise robust speech recognition, Satoru Tsuge, Toshiaki Fukada, Harald Singer
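Abstract (translated from Japanese)	This paper proposes speaker-normalized SSCs (spectral subband centroids) as supplementary features for speech recognition in noise. An SSC is defined as the centroid frequency of the speech power spectrum contained in a subband. Because this feature roughly captures the frequencies of spectral peaks (formants), which vary relatively little even in noisy environments, it is considered robust to noise. However, since SSCs depend on the frequencies of spectral peaks, the distribution of SSCs computed from multiple speakers with different spectral shapes is expected to spread out, producing large overlaps between the distributions of different phonemes. To reduce this overlap, we propose speaker-normalized SSCs, which incorporate a speaker normalization technique into the SSC computation. In continuous speech recognition experiments on spontaneous speech, using speaker-normalized SSCs as supplementary features yielded an error reduction rate of 20.3% (SNR = 15 dB). Compared with SSCs computed without speaker normalization, an error reduction rate of 14.3% (SNR = 15 dB) was obtained.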
Abstract (English)	This paper proposes speaker-normalized spectral subband centroids (SSCs) as supplementary features for speech recognition in noisy environments. SSCs are computed as the frequency centroid of each subband of the power spectrum of the speech signal. This feature can be obtained reliably even under noisy conditions because SSCs are determined mainly by spectral peaks such as formants, whose positions are almost unchanged in a noisy environment. Since conventional SSCs depend on the formant frequencies of a speaker, the distributions of SSCs computed from a large number of speakers will overlap heavily between different phones. Therefore, we introduce a speaker normalization technique into the SSC computation to reduce speaker variability. Experimental results on spontaneous speech recognition show that speaker-normalized SSCs are more useful than conventional SSCs as supplementary features for improving recognition performance. At SNR = 15 dB, we observed error rate reductions of 20.3% from adding speaker-normalized SSCs to the conventional features and 14.3% from incorporating the speaker normalization technique into the conventional SSCs.
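The SSC definition in the abstract (the centroid frequency of the power spectrum within each subband) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `spectral_subband_centroids`, the Hamming window, the rectangular subbands, and the band edges in the usage lines are all assumptions made for illustration. The speaker normalization step is not sketched, since the abstract does not specify the technique.

```python
import numpy as np

def spectral_subband_centroids(frame, sample_rate, band_edges_hz):
    """Return one centroid frequency (Hz) per subband for a single frame.

    Assumptions (not from the abstract): Hamming window, rectangular
    subbands delimited by consecutive entries of band_edges_hz.
    """
    windowed = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2               # power spectrum
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)  # bin frequencies (Hz)

    centroids = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        in_band = (freqs >= lo) & (freqs < hi)
        total = power[in_band].sum()
        if total > 0.0:
            # Power-weighted mean frequency inside the subband.
            centroids.append(float((freqs[in_band] * power[in_band]).sum() / total))
        else:
            # Empty or silent subband: fall back to the band midpoint.
            centroids.append(0.5 * (lo + hi))
    return np.array(centroids)

# Usage: a 1 kHz tone sampled at 8 kHz, four hypothetical subbands.
t = np.arange(256) / 8000.0
tone = np.sin(2.0 * np.pi * 1000.0 * t)
print(spectral_subband_centroids(tone, 8000, [0, 500, 1500, 2500, 4000]))
```

In the subband that contains the tone, the centroid sits close to the spectral peak, which is the property the abstract relies on: additive noise that leaves the peak position roughly unchanged moves the centroid only slightly.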
Keywords (Japanese)	スペクトルサブバンドセントロイド / 雑音環境 / 話者正規化 / 音声認識
Keywords (English)	Spectral subband centroids / Noise environment / Speaker normalization / Speech recognition
Report number	NLC98-40, SP98-104

Technical committee information
Technical committee	SP
Meeting dates	1998/12/10 (1-day meeting)

Paper details
Submitted to committee	Speech (SP)
Language of text	JPN
Title (Japanese)	話者正規化スペクトルサブバンドパラメータを用いた雑音下での音声認識
Title (English)	Speaker normalized spectral subband parameters for noise robust speech recognition
Keyword 1 (Japanese/English)	スペクトルサブバンドセントロイド / Spectral subband centroids
Keyword 2 (Japanese/English)	雑音環境 / Noise environment
Keyword 3 (Japanese/English)	話者正規化 / Speaker normalization
Keyword 4 (Japanese/English)	音声認識 / Speech recognition
1st author (Japanese/English)	柘植覚 / Satoru Tsuge
1st author affiliation (Japanese/English)	ATR音声翻訳通信研究所:徳島大学 / ATR Interpreting Telecommunications Research Laboratories:Tokushima University
2nd author (Japanese/English)	深田俊明 / Toshiaki Fukada
2nd author affiliation (Japanese/English)	ATR音声翻訳通信研究所 / ATR Interpreting Telecommunications Research Laboratories
3rd author (Japanese/English)	シンガーハラルド / Harald Singer
3rd author affiliation (Japanese/English)	ATR音声翻訳通信研究所 / ATR Interpreting Telecommunications Research Laboratories
Presentation date	1998/12/10
Report number	NLC98-40, SP98-104
Volume (vol)	vol.98
Issue (no)	462
Page range	pp.-
Number of pages	6