Presentation title 2013-12-20
A UNIFIED TRAJECTORY TILING APPROACH TO HIGH QUALITY SPEECH RENDERING
Abstract (Japanese)
Abstract (English) In human-machine speech communication, it is challenging to make the machine talk as naturally as a human in order to facilitate "frictionless" human-machine interaction. In this paper, we introduce a "trajectory tiling" based approach to high quality speech rendering, in which speech parameter trajectories, extracted from natural, processed, or synthesized speech, guide the search for the best sequence of waveform "tiles" stored in a pre-recorded speech database. Good "tile" candidates at the phone, state, or frame level are used to construct a lattice-like "sausage". The best path of concatenated tiles through the sausage is then found with the Viterbi algorithm, and the resulting string of concatenated waveform segments (tiles) is output as the final rendered speech. The proposed trajectory tiling approach has been tested on three tasks: TTS synthesis, cross-lingual voice transformation for personalized speech-to-speech translation, and mixed-code TTS synthesis. Experimental results show that the approach yields speech that is natural and highly intelligible. The perceived high quality of the rendered speech is confirmed in both objective and subjective evaluations.
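The search step described in the abstract can be illustrated with a minimal sketch. It assumes each slot of the sausage holds a list of candidate tiles, and that two cost functions are available: `target_cost` (how far a tile is from the guiding trajectory at that slot) and `concat_cost` (how badly two adjacent tiles join). Both functions, their names, and the function `viterbi_sausage` are hypothetical; the abstract does not specify the costs actually used by the authors.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Tile = TypeVar("Tile")


def viterbi_sausage(
    slots: Sequence[Sequence[Tile]],
    target_cost: Callable[[int, Tile], float],   # hypothetical: distance to guide trajectory
    concat_cost: Callable[[Tile, Tile], float],  # hypothetical: join cost between tiles
) -> Tuple[float, List[int]]:
    """Return (total_cost, path), where path[i] indexes the tile chosen in slots[i]."""
    if not slots or any(len(s) == 0 for s in slots):
        raise ValueError("every slot needs at least one candidate tile")

    # best[j]: lowest cost of any path ending at candidate j of the current slot
    best = [target_cost(0, t) for t in slots[0]]
    back: List[List[int]] = []  # back[i - 1][j]: best predecessor of candidate j in slot i

    for i in range(1, len(slots)):
        prev = slots[i - 1]
        new_best, pointers = [], []
        for t in slots[i]:
            # Pick the predecessor that minimises accumulated cost plus join cost.
            k = min(range(len(prev)), key=lambda k: best[k] + concat_cost(prev[k], t))
            new_best.append(best[k] + concat_cost(prev[k], t) + target_cost(i, t))
            pointers.append(k)
        best = new_best
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return total, path
```

With `n` slots and at most `m` candidates per slot, this runs in O(n·m²) time, which is why pruning each slot to a small set of good candidates before the search matters in practice.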
Keywords (Japanese)
Keywords (English)
Report number SP2013-91
Publication date

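Technical committee information
Technical committee SP
Dates 2013/12/12 (1-day event)
Venue (Japanese)
Venue (English)
Theme (Japanese)
Theme (English)
Chair name (Japanese)
Chair name (English)
Vice-chair name (Japanese)
Vice-chair name (English)
Secretary name (Japanese)
Secretary name (English)
Assistant secretary name (Japanese)
Assistant secretary name (English)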

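Paper details
Submitted to committee Speech (SP)
Language of text ENG
Title (Japanese)
Subtitle (Japanese)
Title (English) A UNIFIED TRAJECTORY TILING APPROACH TO HIGH QUALITY SPEECH RENDERING
Subtitle (Japanese)
Keyword (1) (Japanese/English)
1st author name (Japanese/English) / Yao Qian
1st author affiliation (Japanese/English)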
Microsoft Research
Presentation date 2013-12-20
Report number SP2013-91
Volume (vol) vol.113
Issue (no) 366
Page range pp.-
Number of pages 6
Publication date