A Study on Context Clustering Techniques and Speaker Adaptive Training for Average Voice Model

Junichi Yamagishi; Masatsune Tamura; Takashi Masuko; Takao Kobayashi; Keiichi Tokuda

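Presentation title	2002/8/23 A Study on Context Clustering Techniques and Speaker Adaptive Training for Average Voice Model. Junichi Yamagishi, Masatsune Tamura, Takashi Masuko, Takao Kobayashi, Keiichi Tokuda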
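Abstract (Japanese)	This paper examines training methods for an average voice model intended for speaker adaptation. The amount of training data behind each distribution of the average voice model is not uniform across training speakers, so the distributions can be biased toward particular speakers or genders. To reduce the influence of speaker-dependent variation, the proposed method introduces shared decision tree context clustering for tying the distributions of the spectrum, F_0, and state duration parts, and speaker adaptive training for re-estimating the parameters after the spectrum and F_0 distributions have been tied. Subjective evaluation tests show that the average voice model trained with the proposed method synthesizes a more natural-sounding average voice than the conventional method, and that synthetic speech after speaker adaptation is closer to the target speaker and more natural than with the conventional method.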
Abstract (English)	This paper describes a new method for training an average voice model for speech synthesis using speaker adaptation. When the amount of training data is limited, the distributions of the average voice model can be biased by speaker and/or gender. To reduce the influence of speaker dependence, the proposed method incorporates a context clustering technique called shared decision tree context clustering, together with speaker adaptive training, into the training procedure of the average voice model. Subjective test results show that the average voice model trained using the proposed method generates more natural-sounding speech than the conventional average voice model. Moreover, the voice characteristics of synthetic speech generated from the model adapted using the proposed method are closer to the target speaker's voice than those obtained with the conventional method.
Keywords	HMM-based speech synthesis / average voice model / decision tree based context clustering / speaker adaptive training / speaker adaptation
Report number	SP2002-72

Technical committee information
Technical committee	SP
Dates held	2002/8/23 (1 day)

Paper details
Technical committee (submitted to)	Speech (SP)
Language of text	JPN
Title (Japanese)	平均声モデル構築におけるコンテキストクラスタリングと話者適応学習の検討
Title (English)	A Study on Context Clustering Techniques and Speaker Adaptive Training for Average Voice Model
Keyword (1)	HMM-based speech synthesis
Keyword (2)	average voice model
Keyword (3)	decision tree based context clustering
Keyword (4)	speaker adaptive training
Keyword (5)	speaker adaptation
Author 1 name	山岸順一 / Junichi YAMAGISHI
Author 1 affiliation	Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Author 2 name	田村正統 / Masatsune TAMURA
Author 2 affiliation	Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology (presently with Corporate Research & Development Center, Toshiba)
Author 3 name	益子貴史 / Takashi MASUKO
Author 3 affiliation	Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Author 4 name	小林隆夫 / Takao KOBAYASHI
Author 4 affiliation	Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
Author 5 name	徳田恵一 / Keiichi TOKUDA
Author 5 affiliation	Department of Computer Science, Nagoya Institute of Technology
Presentation date	2002/8/23
Report number	SP2002-72
Volume	vol.102
Issue number	292
Page range	pp.-
Number of pages	6