[Invited Talk] Diffusion Probabilistic Models for Generating Sound

Presentation	2023-03-02 [Invited Talk] -- Yuma Koizumi
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	Machine learning tasks that deal with acoustic signals can be broadly classified into "recognizing sounds" and "generating sounds". The latter, sound generation tasks such as text-to-speech (TTS) and speech enhancement (SE), have made remarkable progress in recent years along with the development of deep generative models. This invited talk introduces the sound generation tasks that exist, describes their problem settings, and outlines how diffusion models are used in these tasks.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Sound generation / text-to-speech / neural vocoder / denoising diffusion probabilistic models
Paper #	PRMU2022-87,IBISML2022-94
Date of Issue	2023-02-23 (PRMU, IBISML)

Conference Information
Committee	PRMU / IBISML / IPSJ-CVIM
Conference Date	2023/3/2 (2 days)
Place (in Japanese)	(See Japanese page)
Place (in English)	Future University Hakodate
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair	Seiichi Uchida(Kyushu Univ.) / Masashi Sugiyama(Univ. of Tokyo)
Vice Chair	Takuya Funatomi(NAIST) / Mitsuru Anpai(Denso IT Lab.) / Toshihiro Kamishima(AIST) / Koji Tsuda(Univ. of Tokyo)
Secretary	Takuya Funatomi(CyberAgent) / Mitsuru Anpai(Univ. of Tokyo) / Toshihiro Kamishima(NTT) / Koji Tsuda(Hokkaido Univ.)
Assistant	Nakamasa Inoue(Tokyo Inst. of Tech.) / Yasutomo Kawanishi(Riken) / Yoshinobu Kawahara(Osaka Univ.) / Taiji Suzuki(Tokyo Inst. of Tech.)

Paper Information
Registration To	Technical Committee on Pattern Recognition and Media Understanding / Technical Committee on Information-Based Induction Sciences and Machine Learning / Special Interest Group on Computer Vision and Image Media
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	[Invited Talk] --
Sub Title (in English)
Keyword(1)	Sound generation
Keyword(2)	text-to-speech
Keyword(3)	neural vocoder
Keyword(4)	denoising diffusion probabilistic models
1st Author's Name	Yuma Koizumi
1st Author's Affiliation	Google Research(Google Research)
Date	2023-03-02
Paper #	PRMU2022-87,IBISML2022-94
Volume (vol)	vol.122
Number (no)	PRMU-404,IBISML-405
Page	pp.149-149(PRMU), pp.149-149(IBISML)
#Pages	1
Date of Issue	2023-02-23 (PRMU, IBISML)