Presentation 2023-05-18
Streamable gesture generators for low-latency gesture generation
Yuna Mitsubayashi, Naoshi Kaneko, Kazuhiko Sumi
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Conversational agents such as interactive robots are advancing rapidly, and attention is turning from their dialogue responses alone to their behaviour as well, because gestures accompanying speech are an important element of their communicative ability. Existing deep-learning methods for automatic gesture generation focus on output quality and take both the past and the future context of the utterance as input at once, so generation latency has not been considered. Meanwhile, online meetings are now held in the metaverse, where avatars that can perform natural gestures would improve comprehension of conversations. To address this latency problem, we propose a method that generates gestures concurrently with speech input. Speech features are learnt in an RNN-Transducer-style framework, and gesture sequences are generated incrementally from the incoming speech. As a result, the time until the first gesture is generated was reduced by approximately one second compared with the conventional method, although gesture quality leaves room for improvement.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Gesture Generation / RNN-Transducer model / Deep learning / Conversational Agents
Paper # PRMU2023-4
Date of Issue 2023-05-11 (PRMU)

Conference Information
Committee PRMU / IPSJ-CVIM
Conference Date 2023/5/18(2days)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Seiichi Uchida(Kyushu Univ.)
Vice Chair Takuya Funatomi(NAIST) / Mitsuru Anpai(Denso IT Lab.)
Secretary Takuya Funatomi(CyberAgent) / Mitsuru Anpai(Univ. of Tokyo)
Assistant Nakamasa Inoue(Tokyo Inst. of Tech.) / Yasutomo Kawanishi(Riken)

Paper Information
Registration To Technical Committee on Pattern Recognition and Media Understanding / Special Interest Group on Computer Vision and Image Media
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Streamable gesture generators for low-latency gesture generation
Sub Title (in English)
Keyword(1) Gesture Generation
Keyword(2) RNN-Transducer model
Keyword(3) Deep learning
Keyword(4) Conversational Agents
1st Author's Name Yuna Mitsubayashi
1st Author's Affiliation Aoyama Gakuin University(Aoyama Gakuin Univ.)
2nd Author's Name Naoshi Kaneko
2nd Author's Affiliation Aoyama Gakuin University(Aoyama Gakuin Univ.)
3rd Author's Name Kazuhiko Sumi
3rd Author's Affiliation Aoyama Gakuin University(Aoyama Gakuin Univ.)
Date 2023-05-18
Paper # PRMU2023-4
Volume (vol) vol.123
Number (no) PRMU-30
Page pp.16-21(PRMU)
#Pages 6
Date of Issue 2023-05-11 (PRMU)