Presentation | 2020-03-06 A Comparison Study of Neural Sign Language Translation Methods with Spatio-Temporal Features Kodai Watanabe, Wataru Kameyama |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In neural sign language translation, a model has been proposed that combines AlexNet, a 2DCNN (2-Dimensional Convolutional Neural Network), with Seq2Seq, a neural machine translation model. In this model, a GRU (Gated Recurrent Unit) extracts temporal information from features whose spatial information has already been lost in the 2DCNN. However, since sign language relies on the position, shape, and motion of the hands and fingers, a model that extracts temporal information from features that still contain spatial information seems more suitable. Therefore, in this paper, we propose and compare various methods that extract temporal information at the stage where spatial features are extracted from each video frame. The comparison experiment on these spatio-temporal feature extractors suggests that, on the dataset used in this experiment, the number of parameters to be optimized and the performance of sign language translation are inversely proportional. This seems to explain why the model using only Optical Flow shows the highest translation performance: it has the fewest parameters to train. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Neural Sign Language Translation / Spatio-temporal Features / DNN / Optical Flow |
Paper # | IMQ2019-68,IE2019-150,MVE2019-89 |
Date of Issue | 2020-02-27 (IMQ, IE, MVE) |
Conference Information | |
Committee | IE / IMQ / MVE / CQ |
---|---|
Conference Date | 2020/3/5(2days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Kyushu Institute of Technology |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Hideaki Kimata(NTT) / Toshiya Nakaguchi(Chiba Univ.) / Kenji Mase(Nagoya Univ.) / Hideyuki Shimonishi(NEC) |
Vice Chair | Kazuya Kodama(NII) / Keita Takahashi(Nagoya Univ.) / Mitsuru Maeda(Canon) / Kenya Uomori(Osaka Univ.) / Masayuki Ihara(NTT) / Jun Okamoto(NTT) / Takefumi Hiraguri(Nippon Inst. of Tech.) |
Secretary | Kazuya Kodama(NTT) / Keita Takahashi(NHK) / Mitsuru Maeda(Shizuoka Univ.) / Kenya Uomori(Sony Semiconductor Solutions) / Masayuki Ihara(Nagoya Univ.) / Jun Okamoto(NTT) / Takefumi Hiraguri(Nippon Inst. of Tech.) |
Assistant | Kyohei Unno(KDDI Research) / Norishige Fukushima(Nagoya Inst. of Tech.) / Hiroaki Kudo(Nagoya Univ.) / Masaru Tsuchida(NTT) / Keita Hirai(Chiba Univ.) / Satoshi Nishiguchi(Osaka Inst. of Tech.) / Masanori Yokoyama(NTT) / Shogo Fukushima(Univ. of Tokyo) / Chikara Sasaki(KDDI Research) / Yoshiaki Nishikawa(NEC) / Takuto Kimura(NTT) |
Paper Information | |
Registration To | Technical Committee on Image Engineering / Technical Committee on Image Media Quality / Technical Committee on Media Experience and Virtual Environment / Technical Committee on Communication Quality |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A Comparison Study of Neural Sign Language Translation Methods with Spatio-Temporal Features |
Sub Title (in English) | |
Keyword(1) | Neural Sign Language Translation |
Keyword(2) | Spatio-temporal Features |
Keyword(3) | DNN |
Keyword(4) | Optical Flow |
1st Author's Name | Kodai Watanabe |
1st Author's Affiliation | Waseda University(Waseda Univ.) |
2nd Author's Name | Wataru Kameyama |
2nd Author's Affiliation | Waseda University(Waseda Univ.) |
Date | 2020-03-06 |
Paper # | IMQ2019-68,IE2019-150,MVE2019-89 |
Volume (vol) | vol.119 |
Number (no) | IMQ-454,IE-456,MVE-457 |
Page | pp.273-278(IMQ), pp.273-278(IE), pp.273-278(MVE) |
#Pages | 6 |
Date of Issue | 2020-02-27 (IMQ, IE, MVE) |
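
The abstract relates translation performance to the number of parameters to be optimized. As a rough illustration of how such counts arise, the sketch below uses the standard parameter-count formulas for convolutional and GRU layers. The layer sizes are hypothetical and are not taken from the paper. The 3D convolution is only an assumed example of a spatio-temporal extractor, and the two-channel input is an assumed encoding of optical flow (horizontal and vertical displacement).

```python
def conv2d_params(c_in, c_out, k):
    """Weights plus biases of a 2D convolution with a k x k kernel."""
    return (k * k * c_in + 1) * c_out


def conv3d_params(c_in, c_out, k, k_t):
    """Weights plus biases of a 3D convolution with a k_t x k x k kernel."""
    return (k_t * k * k * c_in + 1) * c_out


def gru_params(input_size, hidden_size):
    """Three gates, each with input weights, recurrent weights, and one bias.

    Some libraries keep two bias vectors per gate, which adds
    3 * hidden_size parameters to this count.
    """
    return 3 * (hidden_size * (input_size + hidden_size) + hidden_size)


# Hypothetical layer sizes, chosen only for illustration.
rgb_2d = conv2d_params(c_in=3, c_out=96, k=11)          # RGB frame input
flow_2d = conv2d_params(c_in=2, c_out=96, k=11)         # assumed 2-channel flow input
rgb_3d = conv3d_params(c_in=3, c_out=96, k=11, k_t=3)   # assumed spatio-temporal kernel
gru = gru_params(input_size=256, hidden_size=256)

print(rgb_2d, flow_2d, rgb_3d, gru)
```

With these assumed sizes, the first layer on a two-channel flow input has fewer parameters than the same layer on RGB input, and extending the kernel along the time axis roughly triples the count. This says nothing about the actual architectures compared in the paper, which are described only in the full text.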