Streamable gesture generators for low-latency gesture generation

Yuna Mitsubayashi; Naoshi Kaneko; Kazuhiko Sumi

Presentation	2023-05-18 Streamable gesture generators for low-latency gesture generation Yuna Mitsubayashi, Naoshi Kaneko, Kazuhiko Sumi
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	Conversational agents such as interactive robots are advancing rapidly, and attention is turning from their dialogue responses alone to their behaviour as well, because gestures accompanying speech are an important element in improving the communicative ability of interactive robots and conversational agents. Deep-learning methods for automatically generating gestures exist, but they focus on generation quality and take both the past and the future context of an utterance as input at once, so the delay in generation has not been considered. Meanwhile, online conferences are now held in the metaverse, and avatars that can perform natural gestures would improve the comprehension of conversations. To address this delay, we propose a method that generates gestures simultaneously with speech input. Speech features are learnt in an RNN-Transducer type framework, and sequences of gesture actions are generated sequentially from the speech input. As a result, the time until the first gesture is generated was reduced by approximately one second compared with the conventional method, although the quality of the generated gestures leaves room for improvement.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Gesture Generation / RNN-Transducer model / Deep learning / Conversational Agents
Paper #	PRMU2023-4
Date of Issue	2023-05-11 (PRMU)

Conference Information
Committee	PRMU / IPSJ-CVIM
Conference Date	2023/5/18(2days)
Place (in Japanese)	(See Japanese page)
Place (in English)
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair	Seiichi Uchida(Kyushu Univ.)
Vice Chair	Takuya Funatomi(NAIST) / Mitsuru Anpai(Denso IT Lab.)
Secretary	Takuya Funatomi(CyberAgent) / Mitsuru Anpai(Univ. of Tokyo)
Assistant	Nakamasa Inoue(Tokyo Inst. of Tech.) / Yasutomo Kawanishi(Riken)

Paper Information
Registration To	Technical Committee on Pattern Recognition and Media Understanding / Special Interest Group on Computer Vision and Image Media
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Streamable gesture generators for low-latency gesture generation
Sub Title (in English)
Keyword(1)	Gesture Generation
Keyword(2)	RNN-Transducer model
Keyword(3)	Deep learning
Keyword(4)	Conversational Agents
1st Author's Name	Yuna Mitsubayashi
1st Author's Affiliation	Aoyama Gakuin University(Aoyama Gakuin Univ.)
2nd Author's Name	Naoshi Kaneko
2nd Author's Affiliation	Aoyama Gakuin University(Aoyama Gakuin Univ.)
3rd Author's Name	Kazuhiko Sumi
3rd Author's Affiliation	Aoyama Gakuin University(Aoyama Gakuin Univ.)
Date	2023-05-18
Paper #	PRMU2023-4
Volume (vol)	vol.123
Number (no)	PRMU-30
Page	pp.16-21(PRMU)
#Pages	6
Date of Issue	2023-05-11 (PRMU)