Presentation | 2023-03-02 Binarization of Vision Transformer with Scaling Factors Shun Sato, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada |
---|---|
PDF Download Page | PDF Download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | 1-bit neural network optimization is a technique that achieves a significant increase in computational speed and substantial memory savings by converting the numerical representation of a model to 1 bit. Although binarization has been actively studied for convolutional neural networks (CNNs), few studies have been conducted on the Vision Transformer (ViT), which has attracted attention as a new image classification model. In this report, we investigate the effectiveness of binarization on ViT. The main components of ViT are the multi-layer perceptron (MLP) and multi-head self-attention (MHSA), which appear repeatedly in the basic blocks of ViT. In the experiments, we examined how the performance of ViT changes according to the components to which binarization is applied and the variations of binarization. As a result, we confirmed that binarization with a floating-point scaling factor for the convolutional layer is effective. Experimental results also revealed that performance improves even if binarization is applied only to the MLP or the MHSA component. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Convolutional Neural Network / Vision Transformer / Optimization / Binarization |
Paper # | PRMU2022-83,IBISML2022-90 |
Date of Issue | 2023-02-23 (PRMU, IBISML) |
Conference Information | |
Committee | PRMU / IBISML / IPSJ-CVIM |
---|---|
Conference Date | 2023/3/2(2days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Future University Hakodate |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Seiichi Uchida(Kyushu Univ.) / Masashi Sugiyama(Univ. of Tokyo) |
Vice Chair | Takuya Funatomi(NAIST) / Mitsuru Anpai(Denso IT Lab.) / Toshihiro Kamishima(AIST) / Koji Tsuda(Univ. of Tokyo) |
Secretary | Takuya Funatomi(CyberAgent) / Mitsuru Anpai(Univ. of Tokyo) / Toshihiro Kamishima(NTT) / Koji Tsuda(Hokkaido Univ.) |
Assistant | Nakamasa Inoue(Tokyo Inst. of Tech.) / Yasutomo Kawanishi(Riken) / Yoshinobu Kawahara(Osaka Univ.) / Taiji Suzuki(Tokyo Inst. of Tech.) |
Paper Information | |
Registration To | Technical Committee on Pattern Recognition and Media Understanding / Technical Committee on Information-Based Induction Sciences and Machine Learning / Special Interest Group on Computer Vision and Image Media |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Binarization of Vision Transformer with Scaling Factors |
Sub Title (in English) | |
Keyword(1) | Convolutional Neural Network |
Keyword(2) | Vision Transformer |
Keyword(3) | Optimization |
Keyword(4) | Binarization |
1st Author's Name | Shun Sato |
1st Author's Affiliation | Tokyo University of Science(TUS) |
2nd Author's Name | Shun Sawada |
2nd Author's Affiliation | Tokyo University of Science(TUS) |
3rd Author's Name | Hidefumi Ohmura |
3rd Author's Affiliation | Tokyo University of Science(TUS) |
4th Author's Name | Kouichi Katsurada |
4th Author's Affiliation | Tokyo University of Science(TUS) |
Date | 2023-03-02 |
Paper # | PRMU2022-83,IBISML2022-90 |
Volume (vol) | vol.122 |
Number (no) | PRMU-404,IBISML-405 |
Page | pp.134-139(PRMU), pp.134-139(IBISML) |
#Pages | 6 |
Date of Issue | 2023-02-23 (PRMU, IBISML) |
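The abstract refers to binarization with a floating-point scaling factor. As a rough illustration only, the sketch below shows one common formulation of that idea (the XNOR-Net-style approximation W ≈ α·sign(W) with α = mean(|W|)); this is an assumption for illustration, not necessarily the exact scheme used in the report.

```python
import numpy as np

def binarize_with_scaling(W):
    """Binarize a weight tensor to {-1, +1} with a per-tensor
    floating-point scaling factor: W is approximated as alpha * B,
    where B = sign(W) and alpha = mean(|W|)."""
    alpha = float(np.mean(np.abs(W)))       # scalar scaling factor
    B = np.where(W >= 0, 1.0, -1.0)         # 1-bit weight tensor
    return alpha, B

# Example: a tiny 2x2 weight matrix.
W = np.array([[0.4, -0.2],
              [-0.6, 0.8]])
alpha, B = binarize_with_scaling(W)
# alpha = (0.4 + 0.2 + 0.6 + 0.8) / 4 = 0.5
# B     = [[ 1, -1],
#          [-1,  1]]
```

At inference time, the expensive floating-point multiply-accumulates can then be replaced by sign flips plus one multiplication by α, which is the source of the speed and memory gains the abstract describes.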