Presentation | 2001/6/22 Control of exploration and exploitation in reinforcement learning Wako Yoshida, Shin Ishii |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Reinforcement learning (RL) is a trial-and-error learning framework in which an agent learns by interacting with its environment. Model-based RL calculates the value function by modeling the environment, which makes it suitable for complicated environments. In this report, we propose a Bayes approximation method for the environmental model. One of the major issues in RL is the balance between exploration, which searches for better control, and exploitation, which obtains large reward. To control this trade-off, we introduce a control mechanism for the inverse temperature and an exploration bonus term into the action selection. When our learning method is applied to a two-dimensional maze task, experimental results show that the learning agent adapts well to changes in the environment. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Model-based reinforcement learning / Bayes inference / exploration-exploitation problem / inverse temperature / exploration bonus |
Paper # | NC2001-28 |
Date of Issue |
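
The abstract names two ingredients of action selection: an inverse temperature and an exploration bonus term. The following is a minimal sketch of how such a rule might look, not the report's actual method. It assumes a Boltzmann (softmax) policy over action values; the function names, the count-based bonus `bonus_weight / sqrt(n + 1)`, and the fixed `beta` are all hypothetical choices made for illustration. The report's rule for controlling the inverse temperature and its Bayes model of the environment are not given in the abstract and are not reproduced here.

```python
import math
import random


def softmax_policy(q_values, visit_counts, beta, bonus_weight):
    """Return action probabilities p(a) proportional to exp(beta * (Q(a) + bonus(a))).

    beta is the inverse temperature: beta = 0 gives a uniform (purely
    exploratory) policy, and a large beta approaches greedy exploitation.
    """
    # Hypothetical count-based exploration bonus: larger for rarely tried actions.
    scores = [
        q + bonus_weight / math.sqrt(n + 1)
        for q, n in zip(q_values, visit_counts)
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(beta * (s - top)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(probs, rng=random):
    """Sample an action index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

In this sketch, raising `beta` shifts the agent toward exploitation, while a larger `bonus_weight` favors actions that have been tried less often. An agent facing a changing environment would presumably lower `beta` or rely on the bonus when its model becomes unreliable, but how the report does this is not stated in the abstract.
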
Conference Information | |
Committee | NC |
---|---|
Conference Date | 2001/6/22 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Neurocomputing (NC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Control of exploration and exploitation in reinforcement learning |
Sub Title (in English) | |
Keyword(1) | Model-based reinforcement learning |
Keyword(2) | Bayes inference |
Keyword(3) | exploration-exploitation problem |
Keyword(4) | inverse temperature |
Keyword(5) | exploration bonus |
1st Author's Name | Wako Yoshida |
1st Author's Affiliation | Nara Institute of Science and Technology:CREST Doya Project, Japan Science and Technology Corporation |
2nd Author's Name | Shin Ishii |
2nd Author's Affiliation | Nara Institute of Science and Technology:CREST Doya Project, Japan Science and Technology Corporation |
Date | 2001/6/22 |
Paper # | NC2001-28 |
Volume (vol) | vol.101 |
Number (no) | 154 |
Page | pp.- |
#Pages | 8 |
Date of Issue |