Presentation 2016-11-16
Statistical Mechanical Analysis of Fast Online Learning with Weight Normalization
Yuki Yoshida, Ryo Karakida, Masato Okada, Shun-ichi Amari
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Weight normalization (WN), an optimization algorithm for neural networks recently proposed by Salimans & Kingma (2016), factorizes the weight vector of a neural network into a radial length and a direction vector, and updates the factorized parameters by steepest gradient descent. They showed that learning with WN converges faster than conventional steepest descent in several practical tasks, including image recognition and reinforcement learning. However, it remains theoretically unclear why this method works well. In this study, we used a statistical mechanical approach to analyze online learning in single-layer linear and nonlinear perceptrons with WN. By deriving the order parameters of the learning dynamics, we confirmed quantitatively that WN achieves fast convergence by automatically tuning the effective learning rate, irrespective of the nonlinearity of the neural network. This fast convergence is realized when the initial value of the radial length is near the global minimum; therefore, our theory suggests that it is important to choose the initial value of the radial length appropriately when using WN.
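The factorization described in the abstract can be illustrated with a minimal sketch. It assumes a linear perceptron trained online on a squared error, with targets generated by a fixed "teacher" weight vector. That setup, the input scaling, the hyperparameter values, and all function names (`wn_weight`, `wn_online_step`, `run_teacher_student`) are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np

def wn_weight(g, v):
    """Effective weight w = g * v / ||v|| (radial length g, direction v/||v||)."""
    return g * v / np.linalg.norm(v)

def wn_online_step(g, v, x, y, lr):
    """One online gradient step on the squared error 0.5 * (w.x - y)**2."""
    norm_v = np.linalg.norm(v)
    v_hat = v / norm_v
    err = g * (v_hat @ x) - y            # student output minus target
    grad_w = err * x                     # gradient w.r.t. the effective weight w
    grad_g = grad_w @ v_hat              # dL/dg: projection onto the direction
    # dL/dv: component of grad_w orthogonal to v, scaled by g / ||v||
    grad_v = (g / norm_v) * (grad_w - grad_g * v_hat)
    return g - lr * grad_g, v - lr * grad_v

def run_teacher_student(n=100, steps=5000, lr=0.2, g0=1.0, seed=0):
    """Train on examples from a random teacher; return (initial, final) distance."""
    rng = np.random.default_rng(seed)
    teacher = rng.standard_normal(n)
    teacher /= np.linalg.norm(teacher)   # assumed: unit-norm teacher weight
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)               # assumed: unit-norm initial direction
    g = g0                               # initial radial length
    initial = np.linalg.norm(wn_weight(g, v) - teacher)
    for _ in range(steps):
        x = rng.standard_normal(n) / np.sqrt(n)  # assumed: E[||x||^2] = 1
        g, v = wn_online_step(g, v, x, teacher @ x, lr)
    final = np.linalg.norm(wn_weight(g, v) - teacher)
    return initial, final
```

Calling `run_teacher_student()` returns the distance between the student and teacher weights before and after training. In this sketch the update to `v` is orthogonal to `v`, so `||v||` never shrinks, and the step actually taken by the direction scales roughly as `lr * g**2 / ||v||**2`. This is one way to read the abstract's statement about an automatically tuned effective learning rate, and varying `g0` gives a rough feel for the sensitivity to the initial radial length.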
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Neural network / Weight normalization / Online learning / Statistical mechanics
Paper # IBISML2016-60
Date of Issue 2016-11-09 (IBISML)

Conference Information
Committee IBISML
Conference Date 2016/11/16(3days)
Place (in Japanese) (See Japanese page)
Place (in English) Kyoto Univ.
Topics (in Japanese) (See Japanese page)
Topics (in English) Information-Based Induction Science Workshop (IBIS2016)
Chair Kenji Fukumizu(ISM)
Vice Chair Masashi Sugiyama(Univ. of Tokyo) / Hisashi Kashima(Kyoto Univ.)
Secretary Masashi Sugiyama(Univ. of Tokyo) / Hisashi Kashima(Nagoya Inst. of Tech.)
Assistant Toshihiro Kamishima(AIST) / Tomoharu Iwata(NTT)

Paper Information
Registration To Technical Committee on Information-Based Induction Sciences and Machine Learning
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Statistical Mechanical Analysis of Fast Online Learning with Weight Normalization
Sub Title (in English)
Keyword(1) Neural network
Keyword(2) Weight normalization
Keyword(3) Online learning
Keyword(4) Statistical mechanics
1st Author's Name Yuki Yoshida
1st Author's Affiliation The University of Tokyo(UTokyo)
2nd Author's Name Ryo Karakida
2nd Author's Affiliation The University of Tokyo(UTokyo)
3rd Author's Name Masato Okada
3rd Author's Affiliation The University of Tokyo(UTokyo)
4th Author's Name Shun-ichi Amari
4th Author's Affiliation RIKEN(RIKEN)
Date 2016-11-16
Volume (vol) vol.116
Number (no) IBISML-300
Page pp.101-108 (IBISML)
#Pages 8