Presentation | 2012-11-07. Stochastic policy gradient method for a stochastic policy using a Gaussian process regression. Yutaka NAKAMURA, Hiroshi ISHIGURO |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Reinforcement learning (RL) methods that use Gaussian process (GP) regression to approximate the value function have been studied [1]. Because GPs rely on Bayesian inference, the variance of the output can be computed, but the variance of the value estimate offers no direct benefit. In this research, we propose a policy gradient method for a GP-based stochastic policy, in which the output variance serves as the confidence in action selection. We apply the method to a pendulum swing-up control task, and simulation results show that it yields a good controller. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Reinforcement learning / Gaussian process regression / policy gradient method / adaptive control |
Paper # | IBISML2012-52 |
Date of Issue |
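
The abstract describes a GP-based stochastic policy whose output variance acts as the confidence in action selection. The sketch below illustrates only that action-selection idea and is not the authors' implementation: it omits the policy gradient update entirely, and the scalar state, the squared-exponential kernel, the noise level, and the names `rbf`, `solve`, `gp_policy`, and `sample_action` are all assumptions made for illustration.

```python
import math
import random


def rbf(x, y, length=1.0, signal=1.0):
    """Squared-exponential kernel between two scalar states (assumed kernel)."""
    return signal * math.exp(-0.5 * ((x - y) / length) ** 2)


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def gp_policy(state, states, actions, noise=1e-2):
    """Return the GP predictive (mean, variance) of the action at `state`."""
    # Kernel matrix of the stored states, with observation noise on the diagonal.
    k = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(states)] for i, a in enumerate(states)]
    k_star = [rbf(state, s) for s in states]
    alpha = solve(k, actions)  # (K + noise * I)^{-1} y
    beta = solve(k, k_star)    # (K + noise * I)^{-1} k_*
    mean = sum(ks * al for ks, al in zip(k_star, alpha))
    var = rbf(state, state) - sum(ks * be for ks, be in zip(k_star, beta))
    return mean, max(var, 1e-12)  # clamp tiny negative values from round-off


def sample_action(state, states, actions, rng=random):
    """Draw an action from N(mean, variance): low confidence, wide exploration."""
    mean, var = gp_policy(state, states, actions)
    return rng.gauss(mean, math.sqrt(var))


if __name__ == "__main__":
    # Hypothetical state-action pairs, chosen only to exercise the functions.
    demo_states = [-1.0, 0.0, 1.0]
    demo_actions = [0.5, 0.0, -0.5]
    for s in (0.0, 10.0):
        print(s, gp_policy(s, demo_states, demo_actions))
```

Under these assumptions, a state close to the stored data gives a small predictive variance, so sampled actions stay near the mean, while a state far from the data falls back to the prior variance and the policy explores more widely.
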
Conference Information | |
Committee | IBISML |
---|---|
Conference Date | 2012/10/31 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Information-Based Induction Sciences and Machine Learning (IBISML) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Stochastic policy gradient method for a stochastic policy using a Gaussian process regression |
Sub Title (in English) | |
Keyword(1) | Reinforcement learning |
Keyword(2) | Gaussian process regression |
Keyword(3) | policy gradient method |
Keyword(4) | adaptive control |
1st Author's Name | Yutaka NAKAMURA |
1st Author's Affiliation | |
2nd Author's Name | Hiroshi ISHIGURO |
2nd Author's Affiliation | |
Date | 2012-11-07 |
Paper # | IBISML2012-52 |
Volume (vol) | vol.112 |
Number (no) | 279 |
Page | pp.- |
#Pages | 5 |
Date of Issue |