Presentation | 2023-03-02 Binarization of Vision Transformer with Scaling Factors Shun Sato, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada |
---|---|
PDF Download Page | PDF Download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | 1-bit neural network optimization is a technique that achieves a significant increase in computational speed and substantial memory savings by converting the numerical representation of a model to 1 bit. Although binarization has been actively studied for convolutional neural networks (CNNs), few studies have been conducted on the Vision Transformer (ViT), which has attracted attention as a new image classification model. In this report, we investigate the effectiveness of binarization on ViT. The main components of ViT are the multi-layer perceptron (MLP) and multi-head self-attention (MHSA), which appear repeatedly in the basic blocks of ViT. In the experiments, we examined how the performance of ViT changes according to the components to which binarization is applied and the variations of binarization. As a result, we confirmed that binarization with a floating-point scaling factor for the convolutional layer is effective. Experimental results also revealed that performance improves even if binarization is applied only to the MLP or the MHSA component. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Convolutional Neural Network / Vision Transformer / Optimization / Binarization |
Paper # | PRMU2022-83,IBISML2022-90 |
Date of Issue | 2023-02-23 (PRMU, IBISML) |
Conference Information | |
Committee | PRMU / IBISML / IPSJ-CVIM |
---|---|
Conference Date | 2023/3/2(2days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Future University Hakodate |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | Seiichi Uchida(Kyushu Univ.) / Masashi Sugiyama(Univ. of Tokyo) |
Vice Chair | Takuya Funatomi(NAIST) / Mitsuru Anpai(Denso IT Lab.) / Toshihiro Kamishima(AIST) / Koji Tsuda(Univ. of Tokyo) |
Secretary | Takuya Funatomi(CyberAgent) / Mitsuru Anpai(Univ. of Tokyo) / Toshihiro Kamishima(NTT) / Koji Tsuda(Hokkaido Univ.) |
Assistant | Nakamasa Inoue(Tokyo Inst. of Tech.) / Yasutomo Kawanishi(Riken) / Yoshinobu Kawahara(Osaka Univ.) / Taiji Suzuki(Tokyo Inst. of Tech.) |
Paper Information | |
Registration To | Technical Committee on Pattern Recognition and Media Understanding / Technical Committee on Information-Based Induction Sciences and Machine Learning / Special Interest Group on Computer Vision and Image Media |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Binarization of Vision Transformer with Scaling Factors |
Sub Title (in English) | |
Keyword(1) | Convolutional Neural Network |
Keyword(2) | Vision Transformer |
Keyword(3) | Optimization |
Keyword(4) | Binarization |
1st Author's Name | Shun Sato |
1st Author's Affiliation | Tokyo University of Science(TUS) |
2nd Author's Name | Shun Sawada |
2nd Author's Affiliation | Tokyo University of Science(TUS) |
3rd Author's Name | Hidefumi Ohmura |
3rd Author's Affiliation | Tokyo University of Science(TUS) |
4th Author's Name | Kouichi Katsurada |
4th Author's Affiliation | Tokyo University of Science(TUS) |
Date | 2023-03-02 |
Paper # | PRMU2022-83,IBISML2022-90 |
Volume (vol) | vol.122 |
Number (no) | PRMU-404,IBISML-405 |
Page | pp.134-139(PRMU), pp.134-139(IBISML) |
#Pages | 6 |
Date of Issue | 2023-02-23 (PRMU, IBISML) |
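The abstract refers to binarization with a floating-point scaling factor. As a rough illustration only, the sketch below shows one common formulation of that idea (the XNOR-Net-style approximation W ≈ α·sign(W) with α = mean(|W|)); this is an assumption for illustration, not necessarily the exact scheme used in the report.

```python
import numpy as np

def binarize_with_scaling(W):
    """Binarize a weight tensor to {-1, +1} with a per-tensor
    floating-point scaling factor: W is approximated as alpha * B,
    where B = sign(W) and alpha = mean(|W|)."""
    alpha = float(np.mean(np.abs(W)))       # scalar scaling factor
    B = np.where(W >= 0, 1.0, -1.0)         # 1-bit weight tensor
    return alpha, B

# Example: a tiny 2x2 weight matrix.
W = np.array([[0.4, -0.2],
              [-0.6, 0.8]])
alpha, B = binarize_with_scaling(W)
# alpha = (0.4 + 0.2 + 0.6 + 0.8) / 4 = 0.5
# B     = [[ 1, -1],
#          [-1,  1]]
```

At inference time, the expensive floating-point multiply-accumulates can then be replaced by sign flips plus one multiplication by α, which is the source of the speed and memory gains the abstract describes.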