Presentation 2023-03-02
[Invited Talk] --
Yuma Koizumi,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Machine learning tasks that deal with acoustic signals can be broadly classified into "recognizing sounds" and "generating sounds". Sound generation tasks in particular, such as text-to-speech (TTS) and speech enhancement (SE), have made remarkable progress in recent years along with the development of deep generative models. This invited talk introduces the kinds of sound generation tasks that exist and their problem settings, and outlines how diffusion models are used in these tasks.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Sound generation / text-to-speech / neural vocoder / denoising diffusion probabilistic models
Paper # PRMU2022-87,IBISML2022-94
Date of Issue 2023-02-23 (PRMU, IBISML)

Conference Information
Committee PRMU / IBISML / IPSJ-CVIM
Conference Date 2023/3/2(2days)
Place (in Japanese) (See Japanese page)
Place (in English) Future University Hakodate
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Seiichi Uchida(Kyushu Univ.) / Masashi Sugiyama(Univ. of Tokyo)
Vice Chair Takuya Funatomi(NAIST) / Mitsuru Anpai(Denso IT Lab.) / Toshihiro Kamishima(AIST) / Koji Tsuda(Univ. of Tokyo)
Secretary Takuya Funatomi(CyberAgent) / Mitsuru Anpai(Univ. of Tokyo) / Toshihiro Kamishima(NTT) / Koji Tsuda(Hokkaido Univ.)
Assistant Nakamasa Inoue(Tokyo Inst. of Tech.) / Yasutomo Kawanishi(Riken) / Yoshinobu Kawahara(Osaka Univ.) / Taiji Suzuki(Tokyo Inst. of Tech.)

Paper Information
Registration To Technical Committee on Pattern Recognition and Media Understanding / Technical Committee on Information-Based Induction Sciences and Machine Learning / Special Interest Group on Computer Vision and Image Media
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) [Invited Talk] --
Sub Title (in English)
Keyword(1) Sound generation
Keyword(2) text-to-speech
Keyword(3) neural vocoder
Keyword(4) denoising diffusion probabilistic models
1st Author's Name Yuma Koizumi
1st Author's Affiliation Google Research(Google Research)
Date 2023-03-02
Paper # PRMU2022-87,IBISML2022-94
Volume (vol) vol.122
Number (no) PRMU-404,IBISML-405
Page pp.149-149(PRMU), pp.149-149(IBISML)
#Pages 1
Date of Issue 2023-02-23 (PRMU, IBISML)