Acoustic Modeling Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis and Voice Conversion

Presentation title	2013-12-20 Acoustic Modeling Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis and Voice Conversion
Abstract (Japanese)
Abstract (English)	This paper summarizes our previous work on spectral modeling using restricted Boltzmann machines (RBMs) and deep belief networks (DBNs) for statistical parametric speech synthesis and voice conversion. The approach improves on conventional methods in two ways. First, the raw spectral envelopes extracted by the STRAIGHT vocoder are used as the features for spectral modeling. Second, instead of a single Gaussian distribution, RBMs or DBNs represent the distribution of the envelopes at each HMM state or GMM mixture component. Our experimental results show that the proposed method improves the naturalness and similarity of the generated speech.
Keywords (Japanese)
Keywords (English)	Speech synthesis / voice conversion / restricted Boltzmann machine / deep belief network / hidden Markov model / Gaussian mixture model
Report number	SP2013-90
Issue date

Detailed paper information
Technical committee	Speech (SP)
Language of text	ENG
Title (Japanese)
Subtitle (Japanese)
Title (English)	Acoustic Modeling Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis and Voice Conversion
Subtitle (Japanese)
Keyword (1) (Japanese/English)	/ Speech synthesis
First author name (Japanese/English)	/ Zhen-Hua Ling
First author affiliation (Japanese/English)	National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China
Presentation date	2013-12-20
Report number	SP2013-90
Volume (vol)	vol.113
Issue number (no)	366
Page range	pp.-
Number of pages	6
Issue date