Presentation | 2000/5/18 Automatic control of continuous systems based on on-line EM reinforcement learning Yoshimoto Junichiro, Ishii Shin, Sato Masa-aki |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In this report, we propose a new reinforcement learning (RL) method for continuous dynamical systems that uses function approximation and stochastic learning. Our RL method has an architecture similar to the actor-critic model. The critic approximates the Q-function, which is the expected future return for the current state-action pair. The actor approximates a stochastic soft-max policy defined by the Q-function. The soft-max policy is more likely to select an action that has a higher Q-function value. The on-line EM algorithm is used to train both the critic and the actor. We apply this method to two control problems. Computer simulations show that our method acquires fairly good control in both tasks after a few learning trials. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | reinforcement learning / actor-critic model / continuous dynamical system / stochastic model / EM algorithm |
Paper # | AI2000-5 |
Date of Issue |
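
The soft-max policy mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the report concerns continuous systems and trains an actor with the on-line EM algorithm, whereas the sketch below assumes a small, hypothetical set of discrete candidate actions whose Q-values are already known. The function names `softmax_policy` and `sample_action` and the `temperature` parameter are assumptions introduced here for illustration only.

```python
import math
import random


def softmax_policy(q_values, temperature=1.0):
    """Return selection probabilities proportional to exp(Q / temperature).

    q_values: Q-function values for a hypothetical discrete set of
    candidate actions in the current state (an assumption of this sketch).
    """
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(q_values, temperature=1.0, rng=random):
    """Draw an action index at random according to the soft-max probabilities."""
    probs = softmax_policy(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


# Example: the action with the highest Q-value gets the highest probability,
# but every action keeps a nonzero chance of being selected.
probs = softmax_policy([1.0, 2.0, 0.5])
action = sample_action([1.0, 2.0, 0.5])
```

In this sketch a smaller `temperature` concentrates probability on the highest-valued action, while a larger one makes selection closer to uniform.
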
Conference Information | |
Committee | AI |
---|---|
Conference Date | 2000/5/18 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Artificial Intelligence and Knowledge-Based Processing (AI) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Automatic control of continuous systems based on on-line EM reinforcement learning |
Sub Title (in English) | |
Keyword(1) | reinforcement learning |
Keyword(2) | actor-critic model |
Keyword(3) | continuous dynamical system |
Keyword(4) | stochastic model |
Keyword(5) | EM algorithm |
1st Author's Name | Yoshimoto Junichiro |
1st Author's Affiliation | Nara Institute of Science and Technology |
2nd Author's Name | Ishii Shin |
2nd Author's Affiliation | Nara Institute of Science and Technology / CREST, Japan Science and Technology Corporation |
3rd Author's Name | Sato Masa-aki |
3rd Author's Affiliation | ATR International / CREST, Japan Science and Technology Corporation |
Date | 2000/5/18 |
Paper # | AI2000-5 |
Volume (vol) | vol.100 |
Number (no) | 88 |
Page | pp.- |
#Pages | 8 |
Date of Issue |