International Symposium on Nonlinear Theory and its Applications
2017
Session Number:A2L-D
Number:A2L-D-2
Layer Specificity of Acquired Memory Duration in Multilayer LSTM Networks
Kazuki Hatanaka, Jun-Nosuke Teramae, Naoki Wakamiya
pp.162-165
Publication Date:2017/12/4
Online ISSN:2188-5079
DOI:10.34385/proc.29.A2L-D-2
Summary:
The LSTM network is a recurrent neural network that has recently achieved impressive performance on machine learning tasks involving sequential data. The network consists of many LSTM units, each of which can store past inputs in an internal memory variable. When the total number of LSTM units is fixed, the performance of the network generally increases as the number of layers increases. It remains unclear, however, why deeper LSTM networks achieve higher performance. As a first step toward answering this question, we analyze layer-wise differences in the temporal dynamics of the memory variable of each unit. We found that units in deeper layers retain their memory for longer durations, on average, than those in shallower layers. We also found that memory durations are more broadly distributed among units in deeper layers than in shallower layers. These results suggest that units in different layers play different roles in memorization.
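The notion of a unit's memory duration can be illustrated with a small sketch. The code below is an assumption-laden toy, not the authors' procedure: it implements a single NumPy LSTM layer with random weights, perturbs it with one impulse input, and counts how many steps each unit's cell state (the memory variable) stays measurably different from an unperturbed trajectory.

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    # One LSTM step; gate pre-activations stacked as [input, forget, cell, output].
    n = h.shape[0]
    z = W @ x + U @ h + b
    i = 1.0 / (1.0 + np.exp(-z[:n]))        # input gate
    f = 1.0 / (1.0 + np.exp(-z[n:2 * n]))   # forget gate
    g = np.tanh(z[2 * n:3 * n])             # candidate cell update
    o = 1.0 / (1.0 + np.exp(-z[3 * n:]))    # output gate
    c_new = f * c + i * g                   # memory variable update
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def memory_durations(W, U, b, n, T=200, eps=1e-3):
    # Per-unit memory duration: number of steps the cell state of a
    # perturbed trajectory (one impulse input, then zeros) stays at
    # least eps away from an unperturbed (all-zero-input) trajectory.
    h, c = lstm_step(np.ones(1), np.zeros(n), np.zeros(n), W, U, b)
    h0, c0 = lstm_step(np.zeros(1), np.zeros(n), np.zeros(n), W, U, b)
    durations = np.full(n, T)  # units that never decay keep the max value
    for t in range(T):
        h, c = lstm_step(np.zeros(1), h, c, W, U, b)
        h0, c0 = lstm_step(np.zeros(1), h0, c0, W, U, b)
        decayed = (np.abs(c - c0) < eps) & (durations == T)
        durations[decayed] = t
    return durations

# Example with random weights (illustrative only; a trained network
# would be needed to observe the layer-wise effect reported above).
rng = np.random.default_rng(0)
n = 8                                    # number of LSTM units in the layer
W = rng.normal(0.0, 0.5, (4 * n, 1))     # input weights (scalar input)
U = rng.normal(0.0, 0.5, (4 * n, n))     # recurrent weights
b = np.zeros(4 * n)
d = memory_durations(W, U, b, n)
print(d)  # one duration (in steps) per unit
```

Comparing the distribution of `d` across layers of a trained multilayer LSTM is one way to quantify the layer specificity the abstract describes; the impulse-vs-baseline definition of duration used here is a simplifying assumption.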