[Invited Talk] Synthesis, Recognition and Conversion of Various Speech Using Deep Learning and Their Applications

Takashi Nose

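Talk title	2017-07-27 [Invited Talk] Synthesis, Recognition and Conversion of Various Speech Using Deep Learning and Their Applications. Takashi Nose (Tohoku Univ.)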
Abstract (Japanese)	本稿では，ここ数年で急速に加速している深層学習を利用した音声情報処理研究のうち，多様な音声の合成・認識・変換に焦点を当て，筆者らの取り組みも含めそれらの研究例および応用例の一部を紹介する．具体的にはまず多様な話者性，感情表現などを含んだ音声の合成手法およびユーザが柔軟に韻律を制御できるテーラーメイド音声合成について述べる．次に音声合成も含め感情音声を利用した研究において重要な感情音声データベース「JTES」の構築について紹介し，JTESを利用した感情音声の認識について触れる．また，任意話者からの声質変換や歌声合成・顔動画像合成への応用などについても概説するとともに，今後の研究について課題や展望を述べる．
Abstract (English)	This paper focuses on the synthesis, recognition, and conversion of diverse speech within deep-learning-based speech processing, a research area that has accelerated rapidly in recent years, and introduces some research and application examples, including our own work. Specifically, I first describe techniques for synthesizing speech with diverse speaker characteristics and emotional expressions, as well as tailor-made speech synthesis, which lets users flexibly control prosody. Next, I introduce the construction of the emotional speech database "JTES", which is important for research that uses emotional speech, including speech synthesis, and touch on emotional speech recognition using JTES. I also outline voice conversion from arbitrary speakers and applications of deep learning to singing voice synthesis and facial animation generation. Finally, I conclude with remaining issues and prospects for future research.
Keywords (Japanese)	深層学習 / 感情音声合成 / テーラーメイド音声合成 / 感情音声認識 / 声質変換 / 顔動画像生成
Keywords (English)	deep learning / emotional speech synthesis / tailor-made speech synthesis / emotional speech recognition / voice conversion / facial animation generation
Report number	SP2017-16
Publication date	2017-07-20 (SP)

Meeting information
Committee	SP / IPSJ-SLP
Dates	2017/7/27 (two-day meeting)
Venue (Japanese)	秋保リゾートホテルクレセント
Venue (English)	Akiu Resort Hotel Crescent
Theme (Japanese)	認識，理解，対話，一般
Theme (English)	Speech recognition and understanding, dialog system, etc.
Chair (Japanese)	山下洋一(立命館大) / 峯松信明(東大)
Chair (English)	Yoichi Yamashita (Ritsumeikan Univ.) / Nobuaki Minematsu (Univ. of Tokyo)
Vice Chair (Japanese)	森大毅(宇都宮大)
Vice Chair (English)	Hiroki Mori (Utsunomiya Univ.)
Secretary (Japanese)	西田昌史(静岡大) / 坂野秀樹(名城大) / 篠崎隆宏(東工大) / 山岸順一(NII) / 福田隆(IBM)
Secretary (English)	Masafumi Nishida (Shizuoka Univ.) / Hideki Banno (Meijo Univ.) / Takahiro Shinozaki (Tokyo Inst. of Tech.) / Junichi Yamagishi (NII) / Takashi Fukuda (IBM)
Assistant Secretary (Japanese)	橋本佳(名工大) / 小橋川哲(NTT)
Assistant Secretary (English)	Kei Hashimoto (Nagoya Inst. of Tech.) / Satoshi Kobashikawa (NTT)

Paper details
Committee applied to	Technical Committee on Speech / Special Interest Group on Spoken Language Processing
Language of text	JPN
Title (Japanese)	［招待講演］深層学習を利用した多様な音声の合成・認識・変換と応用
Subtitle (Japanese)
Title (English)	[Invited Talk] Synthesis, Recognition and Conversion of Various Speech Using Deep Learning and Their Applications
Subtitle (English)
Keyword (1) (Japanese/English)	深層学習 / deep learning
Keyword (2) (Japanese/English)	感情音声合成 / emotional speech synthesis
Keyword (3) (Japanese/English)	テーラーメイド音声合成 / tailor-made speech synthesis
Keyword (4) (Japanese/English)	感情音声認識 / emotional speech recognition
Keyword (5) (Japanese/English)	声質変換 / voice conversion
Keyword (6) (Japanese/English)	顔動画像生成 / facial animation generation
1st author name (Japanese/English)	能勢隆 / Takashi Nose
1st author affiliation (Japanese/English)	東北大学 (abbrev.: 東北大) / Tohoku University (abbrev.: Tohoku Univ.)
Presentation date	2017-07-27
Report number	SP2017-16
Volume (vol)	vol.117
Issue (no)	SP-160
Page range	pp.3-8 (SP)
Number of pages	6
Publication date	2017-07-20 (SP)