Evolution of Reinforcement Learning Agents in an Exclusive Rewarding Environment

Atsuhiro HAYAKAWA; Satoshi MAEKAWA; Shin ISHII

Presentation	2003/3/12 Evolution of Reinforcement Learning Agents in an Exclusive Rewarding Environment. Atsuhiro HAYAKAWA, Satoshi MAEKAWA, Shin ISHII
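Abstract (translated from Japanese)	Reinforcement learning is a learning framework that acquires, through trial and error, an action sequence that maximizes the time-accumulated sum of immediate rewards. Because it requires no explicit teacher, it can be applied to complex problems. In real problems, the overall task is often divided into several sub-tasks, and the task is accomplished when all sub-tasks are achieved in a certain order. For environments in which a task is composed of multiple sub-tasks, this study proposes a method that autonomously divides the state space and organizes the learner into a hierarchy. In a hierarchical reinforcement learner consisting of multiple reinforcement learning modules and a unit that integrates them, the proposed model uses an evolutionary method to autonomously assign sub-problems to the individual reinforcement learning modules and to divide the state space for each sub-problem. Computer simulations with the proposed model gave reasonable results for both the division into sub-tasks and the state division.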
Abstract (English)	Reinforcement learning (RL) is a trial-and-error framework for learning an action sequence that maximizes the cumulative reward over time. Since RL does not need an explicit teacher, it can be applied to various decision-making and planning problems. Many real problems consist of sub-tasks, and the goal is achieved by solving the sub-tasks in the correct order. To solve such problems, we propose a new hierarchical RL method based on state division and an evolutionary computation mechanism. The model is applied to an environment that includes a couple of different sub-tasks. Our computer simulation shows that the proposed method can divide the problem into appropriate sub-tasks and divide the state space into an appropriate hierarchical structure.
Keywords (Japanese)	階層的強化学習 / 進化的計算 / 状態分割 / 自律的行動選択
Keywords (English)	Hierarchical reinforcement learning / Evolutionary computation / State division / Autonomous action selection
Report number	NC2002-232

Technical committee information
Technical committee	NC
Conference dates	2003/3/12 (1 day)

Paper details
Technical committee (submitted to)	Neurocomputing (NC)
Language of text	JPN
Title (Japanese)	排他的報酬環境における強化学習エージェントの進化
Title (English)	Evolution of Reinforcement Learning Agents in an Exclusive Rewarding Environment
Keyword (1) (Japanese/English)	階層的強化学習 / Hierarchical reinforcement learning
Keyword (2) (Japanese/English)	進化的計算 / Evolutionary computation
Keyword (3) (Japanese/English)	状態分割 / State division
Keyword (4) (Japanese/English)	自律的行動選択 / Autonomous action selection
1st author name (Japanese/English)	早川充洋 / Atsuhiro HAYAKAWA
1st author affiliation (Japanese/English)	奈良先端科学技術大学院大学情報科学研究科 / Graduate School of Information Science, Nara Institute of Science and Technology
2nd author name (Japanese/English)	前川聡 / Satoshi MAEKAWA
2nd author affiliation (Japanese/English)	通信総合研究所 / Communications Research Laboratory
3rd author name (Japanese/English)	石井信 / Shin ISHII
3rd author affiliation (Japanese/English)	奈良先端科学技術大学院大学情報科学研究科 / Graduate School of Information Science, Nara Institute of Science and Technology
Presentation date	2003/3/12
Report number	NC2002-232
Volume	vol.102
Issue number	731
Page range	pp.-
Number of pages	6