Presentation 2023-03-02
Binarization of Vision Transformer with Scaling Factors
Shun Sato, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada
Abstract (in English) 1-bit neural network optimization is a technique that achieves a significant increase in computational speed and substantial memory savings by converting the numerical representation of a model to 1 bit. Although binarization has been actively studied for convolutional neural networks (CNNs), few studies have been conducted on the Vision Transformer (ViT), which has attracted attention as a new image classification model. In this report, we investigate the effectiveness of binarization on ViT. The main components of ViT are the multi-layer perceptron (MLP) and multi-head self-attention (MHSA), which appear repeatedly in the basic blocks of ViT. In our experiments, we examined how the performance of ViT changes according to the components to which binarization is applied and the variant of binarization used. As a result, we confirmed that binarization with a floating-point scaling factor for the convolutional layer is effective. The experimental results also revealed that performance improves even when binarization is applied only to the MLP or the MHSA component.
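As background for the abstract, binarization with a floating-point scaling factor is commonly realized in the XNOR-Net style: weights are quantized to {-1, +1} and rescaled by a single floating-point factor, the mean absolute value of the original weights. The sketch below illustrates this general idea only; it is not the authors' implementation, and the function name and per-tensor scaling are assumptions.

```python
import numpy as np

def binarize_with_scale(W):
    """Binarize a weight tensor to {-1, +1} with a floating-point
    scaling factor alpha = mean(|W|) (XNOR-Net-style; illustrative)."""
    alpha = np.abs(W).mean()          # floating-point scaling factor
    Wb = np.sign(W)                   # {-1, 0, +1}
    Wb[Wb == 0] = 1.0                 # map exact zeros to +1
    return alpha, Wb

# A binarized linear layer approximates W @ x with alpha * (Wb @ x),
# so the expensive multiply-accumulate uses only +/-1 weights.
W = np.array([[0.5, -0.25],
              [-1.0, 0.75]])
x = np.array([1.0, 2.0])
alpha, Wb = binarize_with_scale(W)
approx = alpha * (Wb @ x)
```

Because `Wb` contains only +/-1, the matrix product can be implemented with additions (or XNOR/popcount on packed bits), while the single scalar `alpha` restores the dynamic range of the original weights.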
Keyword(in English) Convolutional Neural Network / Vision Transformer / Optimization / Binarization
Paper # PRMU2022-83,IBISML2022-90
Date of Issue 2023-02-23 (PRMU, IBISML)

Conference Information
Committee PRMU / IBISML / IPSJ-CVIM
Conference Date 2023/3/2(2days)
Place (in English) Future University Hakodate
Topics (in English)
Chair Seiichi Uchida(Kyushu Univ.) / Masashi Sugiyama(Univ. of Tokyo)
Vice Chair Takuya Funatomi(NAIST) / Mitsuru Anpai(Denso IT Lab.) / Toshihiro Kamishima(AIST) / Koji Tsuda(Univ. of Tokyo)
Secretary Takuya Funatomi(CyberAgent) / Mitsuru Anpai(Univ. of Tokyo) / Toshihiro Kamishima(NTT) / Koji Tsuda(Hokkaido Univ.)
Assistant Nakamasa Inoue(Tokyo Inst. of Tech.) / Yasutomo Kawanishi(Riken) / Yoshinobu Kawahara(Osaka Univ.) / Taiji Suzuki(Tokyo Inst. of Tech.)

Paper Information
Registration To Technical Committee on Pattern Recognition and Media Understanding / Technical Committee on Information-Based Induction Sciences and Machine Learning / Special Interest Group on Computer Vision and Image Media
Language JPN
Title (in English) Binarization of Vision Transformer with Scaling Factors
Sub Title (in English)
Keyword(1) Convolutional Neural Network
Keyword(2) Vision Transformer
Keyword(3) Optimization
Keyword(4) Binarization
1st Author's Name Shun Sato
1st Author's Affiliation Tokyo University of Science(TUS)
2nd Author's Name Shun Sawada
2nd Author's Affiliation Tokyo University of Science(TUS)
3rd Author's Name Hidefumi Ohmura
3rd Author's Affiliation Tokyo University of Science(TUS)
4th Author's Name Kouichi Katsurada
4th Author's Affiliation Tokyo University of Science(TUS)
Date 2023-03-02
Paper # PRMU2022-83,IBISML2022-90
Volume (vol) vol.122
Number (no) PRMU-404,IBISML-405
Page pp.134-139 (PRMU), pp.134-139 (IBISML)
#Pages 6
Date of Issue 2023-02-23 (PRMU, IBISML)