Presentation | 2004/3/12. A belief-state reinforcement learning scheme for a multi-agent card game. Hajime FUJITA, Shin ISHII |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In this report, we address the card game "Hearts", an instance of a decision-making problem under partial observability. We present a sampling-based state estimation method and a reinforcement learning (RL) scheme for a multi-agent environment that uses it. Because many cards are often unobservable in this game, we treat RL within the framework of a partially observable Markov decision process (POMDP). Using pessimistic observations, the learning agent focuses on an important region of the large state space, estimates unobservable states by sampling from that subspace, and makes decisions by predicting the behavior of the environment. Simulation results show that our model-based POMDP-RL method with sampling-based state estimation is applicable to this realistic multi-agent problem. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | sampling method / POMDP / estimation of unobservable state variables / multi-agent / reinforcement learning (RL) |
Paper # | NC2003-205 |
Date of Issue |
Conference Information | |
Committee | NC |
---|---|
Conference Date | 2004/3/12 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Neurocomputing (NC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A belief-state reinforcement learning scheme for a multi-agent card game |
Sub Title (in English) | |
Keyword(1) | sampling method |
Keyword(2) | POMDP |
Keyword(3) | estimation of unobservable state variables |
Keyword(4) | multi-agent |
Keyword(5) | reinforcement learning (RL) |
1st Author's Name | Hajime FUJITA |
1st Author's Affiliation | Nara Institute of Science and Technology |
2nd Author's Name | Shin ISHII |
2nd Author's Affiliation | CREST, Japan Science and Technology Agency |
Date | 2004/3/12 |
Paper # | NC2003-205 |
Volume (vol) | vol.103 |
Number (no) | 734 |
Page | pp.- |
#Pages | 6 |
Date of Issue |