Presentation 2020-03-06
A Comparison Study of Neural Sign Language Translation Methods with Spatio-Temporal Features
Kodai Watanabe, Wataru Kameyama
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In Neural Sign Language Translation, a model combining a 2DCNN (2-Dimensional Convolutional Neural Network) called AlexNet with a Seq2Seq neural machine translation model has been proposed. In this model, temporal information is extracted by a GRU (Gated Recurrent Unit) from features whose spatial information has already been lost by the 2DCNN. However, since sign language relies on the position, shape and motion of the hands and fingers, a model that can extract temporal information from features that still contain spatial information seems more suitable. Therefore, in this paper, we propose and compare various methods that extract temporal information at the stage of extracting spatial features from each frame of the video. The comparison experiments with the various spatio-temporal feature extractors suggest that, on the dataset used in this experiment, the number of parameters to be optimized and the sign-language-translation performance are inversely proportional. This seems to be why the model using only Optical Flow, which has the fewest trainable parameters, shows the highest translation performance.
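As a minimal illustration (not the authors' code), the baseline pipeline the abstract criticizes — a 2DCNN collapsing each frame's spatial layout into a feature vector, then a GRU extracting temporal information — can be sketched in NumPy. The feature extractor here is a random-projection stand-in for AlexNet, and all dimensions are illustrative assumptions:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def frame_features(frame, d=64):
    # Stand-in for a 2DCNN (e.g. AlexNet) feature extractor:
    # flattens one frame and projects it to a d-dim vector,
    # so the spatial layout is lost at this point.
    rng = np.random.default_rng(0)
    w = rng.standard_normal((frame.size, d)) * 0.01
    return frame.reshape(-1) @ w

def gru_step(h, x, params):
    # One GRU step: update gate z, reset gate r, candidate state n.
    Wz, Uz, Wr, Ur, Wn, Un = params
    z = sigmoid(x @ Wz + h @ Uz)
    r = sigmoid(x @ Wr + h @ Ur)
    n = np.tanh(x @ Wn + (r * h) @ Un)
    return (1 - z) * h + z * n

# Toy video: T frames of H x W grayscale.
T, H, W, d, hdim = 8, 32, 32, 64, 32
video = np.random.default_rng(1).standard_normal((T, H, W))

# Spatial stage: one feature vector per frame.
feats = np.stack([frame_features(f, d) for f in video])   # shape (T, d)

# Temporal stage: GRU over the per-frame features.
rng = np.random.default_rng(2)
params = tuple(rng.standard_normal(s) * 0.1
               for s in [(d, hdim), (hdim, hdim)] * 3)
h = np.zeros(hdim)
for x in feats:
    h = gru_step(h, x, params)
print(h.shape)   # final encoder state, fed to the Seq2Seq decoder
```

The spatio-temporal variants compared in the paper would instead extract temporal information before or while the spatial structure is reduced (e.g. from Optical Flow fields), rather than after it has been flattened as above.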
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Neural Sign Language Translation / Spatio-temporal Features / DNN / Optical Flow
Paper # IMQ2019-68,IE2019-150,MVE2019-89
Date of Issue 2020-02-27 (IMQ, IE, MVE)

Conference Information
Committee IE / IMQ / MVE / CQ
Conference Date 2020/3/5(2days)
Place (in Japanese) (See Japanese page)
Place (in English) Kyushu Institute of Technology
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Hideaki Kimata(NTT) / Toshiya Nakaguchi(Chiba Univ.) / Kenji Mase(Nagoya Univ.) / Hideyuki Shimonishi(NEC)
Vice Chair Kazuya Kodama(NII) / Keita Takahashi(Nagoya Univ.) / Mitsuru Maeda(Canon) / Kenya Uomori(Osaka Univ.) / Masayuki Ihara(NTT) / Jun Okamoto(NTT) / Takefumi Hiraguri(Nippon Inst. of Tech.)
Secretary Kazuya Kodama(NTT) / Keita Takahashi(NHK) / Mitsuru Maeda(Shizuoka Univ.) / Kenya Uomori(Sony Semiconductor Solutions) / Masayuki Ihara(Nagoya Univ.) / Jun Okamoto(NTT) / Takefumi Hiraguri(Nippon Inst. of Tech.)
Assistant Kyohei Unno(KDDI Research) / Norishige Fukushima(Nagoya Inst. of Tech.) / Hiroaki Kudo(Nagoya Univ.) / Masaru Tsuchida(NTT) / Keita Hirai(Chiba Univ.) / Satoshi Nishiguchi(Osaka Inst. of Tech.) / Masanori Yokoyama(NTT) / Shogo Fukushima(Univ. of Tokyo) / Chikara Sasaki(KDDI Research) / Yoshiaki Nishikawa(NEC) / Takuto Kimura(NTT)

Paper Information
Registration To Technical Committee on Image Engineering / Technical Committee on Image Media Quality / Technical Committee on Media Experience and Virtual Environment / Technical Committee on Communication Quality
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A Comparison Study of Neural Sign Language Translation Methods with Spatio-Temporal Features
Sub Title (in English)
Keyword(1) Neural Sign Language Translation
Keyword(2) Spatio-temporal Features
Keyword(3) DNN
Keyword(4) Optical Flow
1st Author's Name Kodai Watanabe
1st Author's Affiliation Waseda University(Waseda Univ.)
2nd Author's Name Wataru Kameyama
2nd Author's Affiliation Waseda University(Waseda Univ.)
Date 2020-03-06
Paper # IMQ2019-68,IE2019-150,MVE2019-89
Volume (vol) vol.119
Number (no) IMQ-454,IE-456,MVE-457
Page pp.273-278(IMQ), pp.273-278(IE), pp.273-278(MVE)
#Pages 6
Date of Issue 2020-02-27 (IMQ, IE, MVE)