Learning in Two-Player Matrix Games by Policy Gradient Lagging Anchor

Presentation title	2018-03-12 Learning in Two-Player Matrix Games by Policy Gradient Lagging Anchor, Shiyao Ding (Osaka Univ.), Toshimitsu Ushio (Osaka Univ.)
Abstract (Japanese)	We propose a novel multi-agent reinforcement learning (MARL) algorithm called the policy gradient lagging anchor (PGLA) algorithm. We then consider two two-player matrix games as illustrative examples. Simulations show that the behavior of the games under the PGLA algorithm can converge to Nash equilibria in both pure and mixed policies.
Abstract (English)	We propose a novel multi-agent reinforcement learning (MARL) algorithm called the policy gradient lagging anchor (PGLA) algorithm. We then consider two two-player matrix games as illustrative examples. Simulations show that the behavior of the games under the PGLA algorithm can converge to Nash equilibria in both pure and mixed policies.
Keywords (Japanese)	Reinforcement Learning / Policy Gradient / Multi-Agent Systems / Matrix Game
Keywords (English)	Reinforcement Learning / Policy Gradient / Multi-Agent Systems / Matrix Game
Report number	MSS2017-79
Issue date	2018-03-05 (MSS)

Technical committee information
Technical committee	MSS / NLP
Dates	2018/3/12 (3 days, starting on this date)
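Venue (Japanese)	Osaka University, Toyonaka Campus
Venue (English)
Theme (Japanese)	Joint meeting of three technical committees (SICE-DES, IEICE-NLP, and MSS); general topics and Work In Progress (WIP). Note: WIP sessions are for DES and MSS only; see the "Details" link.
Theme (English)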
Chair (Japanese)	名嘉村盛和(琉球大) / 安達雅春(東京電機大)
Chair (English)	Morikazu Nakamura (Univ. of Ryukyus) / Masaharu Adachi (Tokyo Denki Univ.)
Vice chair (Japanese)	髙井重昌(阪大) / 高橋規一(岡山大)
Vice chair (English)	Shigemasa Takai (Osaka Univ.) / Norikazu Takahashi (Okayama Univ.)
Secretary (Japanese)	豊嶋伊知郎(東芝エネルギーシステムズ) / 金澤尚史(阪大) / 坪根正(長岡技科大) / 山内将行(広島工大)
Secretary (English)	Ichiro Toyoshima (Toshiba) / Takahumi Kanazawa (Osaka Univ.) / Tadashi Tsubone (Nagaoka Univ. of Tech.) / Masayuki Yamauchi (Hiroshima Inst. of Tech.)
Assistant secretary (Japanese)	金城秀樹(沖縄大) / 橘俊宏(湘南工科大) / 木村真之(京大)
Assistant secretary (English)	Hideki Kinjo (Okinawa Univ.) / Toshihiro Tachibana (Shonan Inst. of Tech.) / Masayuki Kimura (Kyoto Univ.)

Paper details
Submitted to	Technical Committee on Mathematical Systems Science and its Applications / Technical Committee on Nonlinear Problems
Paper language	ENG
Title (Japanese)
Subtitle (Japanese)
Title (English)	Learning in Two-Player Matrix Games by Policy Gradient Lagging Anchor
Subtitle (Japanese)
Keyword (1) (Japanese/English)	Reinforcement Learning / Reinforcement Learning
Keyword (2) (Japanese/English)	Policy Gradient / Policy Gradient
Keyword (3) (Japanese/English)	Multi-Agent Systems / Multi-Agent Systems
Keyword (4) (Japanese/English)	Matrix Game / Matrix Game
First author name (Japanese/English)	丁世堯 / Shiyao Ding
First author affiliation (Japanese/English)	大阪大学 (abbr.: 阪大) / Osaka University (abbr.: Osaka Univ.)
Second author name (Japanese/English)	潮俊光 / Toshimitsu Ushio
Second author affiliation (Japanese/English)	大阪大学 (abbr.: 阪大) / Osaka University (abbr.: Osaka Univ.)
Presentation date	2018-03-12
Report number	MSS2017-79
Volume (vol)	vol.117
Issue (no)	MSS-506
Page range	pp.11-14 (MSS)
Number of pages	4
Issue date	2018-03-05 (MSS)