Feature Extraction by Competitive Learning for Dynamic Q-Network

Taishi Komatsu; Yukari Yamauchi

Presentation	2020-03-06, Feature Extraction by Competitive Learning for Dynamic Q-Network, Taishi Komatsu (Nihon Univ.), Yukari Yamauchi (Nihon Univ.)
Abstract	Deep Q-Network is a reinforcement learning algorithm that extracts features from the state-space input by convolution. It can learn the rules of a complex environment and the priorities of actions without prior knowledge. However, the fully connected layers that follow feature extraction grow large, which makes processing heavy. This study therefore aims to improve processing speed and learning efficiency in two ways: the number of dimensions after feature extraction is increased or decreased dynamically, and the weights of the feature extraction part are updated independently of the gradient descent used for the fully connected layers. We replace the feature extraction part of the conventional method with a Self-Organizing Incremental Neural Network (SOINN) and propose Dynamic Q-Network, which learns while dynamically changing the number of dimensions after feature extraction. In the experiment, the Q-Network is trained on Reversi by playing against itself. The algorithms are compared by win rate against an opponent that plays random moves and by processing time. The proposed method outperformed the conventional method in processing time, learning speed, and learning rate. In particular, processing time was reduced to about 2/3 of that of the conventional method.
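The following Python sketch illustrates the general idea described in the abstract: a competitive-learning layer whose number of prototype nodes, and therefore its output feature dimension, grows and shrinks during training and is updated without backpropagation. It is a simplified illustration, not the authors' implementation and not a full SOINN, which also maintains edges, edge ages, and adaptive per-node thresholds. The class name `GrowingCompetitiveLayer`, the fixed `threshold`, the `min_wins` pruning rule, and the exponential activation in `encode` are all assumptions made for this example.

```python
import math


class GrowingCompetitiveLayer:
    """Simplified, SOINN-inspired competitive layer (hypothetical, not the paper's code)."""

    def __init__(self, threshold):
        self.threshold = threshold  # fixed insertion distance (assumption; SOINN adapts it per node)
        self.nodes = []             # prototype vectors, one per output feature dimension
        self.wins = []              # number of inputs each prototype has won

    @staticmethod
    def _distance(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    def update(self, x):
        """Grow a node for a novel input, otherwise move the winning node toward x."""
        if self.nodes:
            dists = [self._distance(x, w) for w in self.nodes]
            winner = min(range(len(dists)), key=dists.__getitem__)
            if dists[winner] <= self.threshold:
                self.wins[winner] += 1
                rate = 1.0 / self.wins[winner]  # running-mean step size
                self.nodes[winner] = [
                    wi + rate * (xi - wi) for wi, xi in zip(self.nodes[winner], x)
                ]
                return winner
        # No node is close enough: insert the input itself as a new prototype.
        self.nodes.append(list(x))
        self.wins.append(1)
        return len(self.nodes) - 1

    def prune(self, min_wins):
        """Drop rarely winning nodes, which shrinks the feature dimension."""
        kept = [(w, c) for w, c in zip(self.nodes, self.wins) if c >= min_wins]
        self.nodes = [w for w, _ in kept]
        self.wins = [c for _, c in kept]

    def encode(self, x):
        """Return one activation per node; the length tracks the current node count."""
        return [math.exp(-self._distance(x, w)) for w in self.nodes]


# Usage: three nearby samples share one node, the distant sample creates a second.
layer = GrowingCompetitiveLayer(threshold=1.0)
for sample in [[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [0.0, 0.2]]:
    layer.update(sample)
features = layer.encode([0.0, 0.0])  # one value per node
```

In a setup like the one the abstract describes, the variable-length output of `encode` would feed the fully connected Q-value layers, which would have to be resized whenever nodes are added or pruned. The abstract does not say how the paper handles that resizing.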
Keywords	Reinforcement Learning / Convolutional Neural Network / Self-Organizing Incremental Neural Network
Report number	NC2019-106
Issue date	2020-02-26 (NC)

Technical committee information
Committee	NC / MBE
Dates	2020/3/4 (3-day meeting starting on this date)
Venue	University of Electro-Communications
Topics	Neuro Computing, Medical Engineering, etc.
Chair	Hayaru Shouno (UEC) / Taishin Nomura (Osaka Univ.)
Vice chair	Kazuyuki Samejima (Tamagawa Univ.) / Takashi Watanabe (Tohoku Univ.)
Secretary	Junichiro Yoshimoto (NAIST) / Naotoshi Abekawa (NTT) / Keiji Iramina (Kyushu Univ.)
Assistant secretary	Takashi Shinozaki (NICT) / Ken Takiyama (TUAT) / Yasuyuki Suzuki (Osaka Univ.) / Akihiro Karashima (Tohoku Inst. of Tech.)

Paper details
Submitted to	Technical Committee on Neurocomputing / Technical Committee on ME and Bio Cybernetics
Language of paper	JPN
Title (Japanese)	競合学習で特徴抽出を行う動的強化学習ネットワーク
Subtitle (Japanese)
Title (English)	Feature Extraction by Competitive Learning for Dynamic Q-Network
Subtitle (English)
Keyword (1)	Reinforcement Learning
Keyword (2)	Convolutional Neural Network
Keyword (3)	Self-Organizing Incremental Neural Network
First author	小松泰士 / Taishi Komatsu
First author affiliation	Nihon University (abbreviation: NU)
Second author	山内ゆかり / Yukari Yamauchi
Second author affiliation	Nihon University (abbreviation: NU)
Presentation date	2020-03-06
Report number	NC2019-106
Volume (vol)	vol.119
Issue (no)	NC-453
Page range	pp.175-179 (NC)
Number of pages	5
Issue date	2020-02-26 (NC)