Application of Multi-Agent Reinforcement Learning Using MPI-Based Parallel Computation to Chaos Control

Norihisa SATO (佐藤 倫久); Masaharu ADACHI (安達 雅春)

Presentation	2008-03-28 Application of Multi-Agent Reinforcement Learning Using MPI-Based Parallel Computation to Chaos Control, Norihisa SATO, Masaharu ADACHI
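Abstract (translated from Japanese)	Reinforcement learning is a trial-and-error learning algorithm that uses only an evaluation of the actions taken by the learner, without an explicit teacher signal. As a result, the learning time needed before the learner can take desirable actions is very long. Multi-agent reinforcement learning, in which multiple agents learn simultaneously so that learning proceeds efficiently, has therefore been proposed. In this study, we implemented multi-agent reinforcement learning with a shared memory using MPI and applied it to chaos control. The results show that it achieves performance equivalent to single-agent reinforcement learning in a shorter time. We also obtained results suggesting that quantized states for which the same action is selected even when learning is repeated multiple times are likely to correspond to locations that are important for chaos control.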
Abstract (English)	Reinforcement learning is a trial-and-error process and therefore requires a large amount of computation time. If multiple agents can learn simultaneously, the computation time can be reduced. In this paper, we apply multi-agent reinforcement learning to chaos control, implementing it with the Message-Passing Interface (MPI). We compare the control performance of multi-agent reinforcement learning with that of single-agent reinforcement learning. The multi-agent version shows almost the same performance as the single-agent one. Moreover, the results suggest that quantized states for which the learned action is the same even when learning is repeated many times are important locations for chaos control.
Keywords (Japanese)	カオスの制御 (control of chaos) / 強化学習 (reinforcement learning) / MPI
Keywords (English)	control of chaos / reinforcement learning / MPI (Message-Passing Interface)
Report number	NLP2007-169

Technical committee information
Technical committee	NLP
Meeting dates	2008/3/21 (held for 1 day)

Paper details
Registered technical committee	Nonlinear Problems (NLP)
Language of text	JPN
Title (Japanese)	MPIによる並列計算を用いたマルチエージェント強化学習のカオス制御への適用
Title (English)	A Multi-agent Reinforcement Learning with Parallel Computation and its Application to Chaos Control
Keyword (1) (Japanese/English)	カオスの制御 / control of chaos
Keyword (2) (Japanese/English)	強化学習 / reinforcement learning
Keyword (3) (Japanese/English)	MPI / MPI (Message-Passing Interface)
First author (Japanese/English)	佐藤倫久 / Norihisa SATO
First author affiliation (Japanese/English)	東京電機大学工学部 / School of Engineering, Tokyo Denki University
Second author (Japanese/English)	安達雅春 / Masaharu ADACHI
Second author affiliation (Japanese/English)	東京電機大学工学部 / School of Engineering, Tokyo Denki University
Presentation date	2008-03-28
Report number	NLP2007-169
Volume (vol)	vol.107
Issue (no)	561
Page range	pp.-
Number of pages	6