Presentation 2012-11-07
Stochastic policy gradient method for a stochastic policy using a Gaussian process regression
Yutaka NAKAMURA, Hiroshi ISHIGURO
Abstract(in English) Reinforcement learning (RL) methods that use Gaussian process regression (GP) to approximate the value function have been studied [1]. Thanks to Bayesian reasoning with GPs, the variance of the output can be computed, but the variance of the value estimate yields no direct benefit. In this research, we propose a policy gradient method for a GP-based stochastic policy, in which the output variance is used as the confidence in action selection. We apply our method to a pendulum swing-up control task, and simulation results show that a good controller can be obtained by our method.
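The abstract's key idea, using the GP's predictive variance as the confidence in action selection, can be illustrated with a minimal sketch. The following Python code is an illustration under stated assumptions, not the authors' implementation: it builds a GP from states to actions and samples each action from a Gaussian whose spread is the posterior variance, so the policy explores more widely where it is less confident. The RBF kernel, its hyperparameters, and the toy pendulum-like data are assumptions, and the paper's policy gradient update of the GP targets is not reconstructed here.

import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    # Squared-exponential kernel between row-wise state sets A (m, d) and B (n, d).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return signal_var * np.exp(-0.5 * d2 / length_scale ** 2)

class GPStochasticPolicy:
    # GP regression from states to actions; the posterior variance at a
    # query state serves as the confidence in the action estimate.
    def __init__(self, X, y, noise_var=1e-2):
        # X: training states (n, d); y: training actions (n,). Both assumed given.
        self.X = X
        K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
        self.L = np.linalg.cholesky(K)
        # alpha = K^{-1} y via two triangular solves against the Cholesky factor.
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))

    def predict(self, s):
        # Posterior mean and variance of the action at a single state s (d,).
        k = rbf_kernel(s[None, :], self.X)          # (1, n) cross-covariances
        mu = float(k @ self.alpha)
        v = np.linalg.solve(self.L, k.T)            # (n, 1)
        var = float(rbf_kernel(s[None, :], s[None, :]) - v.T @ v)
        return mu, max(var, 1e-12)

    def act(self, s, rng):
        # Stochastic action: Gaussian around the GP mean; low confidence
        # (high posterior variance) widens the exploration.
        mu, var = self.predict(s)
        return rng.normal(mu, np.sqrt(var))

# Toy usage with made-up pendulum-like states (angle, angular velocity);
# the sin() targets are placeholders, not data from the paper.
rng = np.random.default_rng(0)
X = rng.uniform(-np.pi, np.pi, size=(20, 2))
y = np.sin(X[:, 0])
policy = GPStochasticPolicy(X, y)
print(policy.act(np.array([0.5, -0.1]), rng))

Near the training states the variance shrinks and the policy acts almost deterministically; far from them it reverts toward the prior variance and the sampled actions spread out, which is the variance-as-confidence behavior the abstract describes.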
Keyword(in English) Reinforcement learning / Gaussian process regression / policy gradient method / adaptive control
Paper # IBISML2012-52

Conference Information
Committee IBISML
Conference Date 2012/10/31 (1 day)

Paper Information
Registration To Information-Based Induction Sciences and Machine Learning (IBISML)
Language JPN
Title (in English) Stochastic policy gradient method for a stochastic policy using a Gaussian process regression
Keyword(1) Reinforcement learning
Keyword(2) Gaussian process regression
Keyword(3) policy gradient method
Keyword(4) adaptive control
1st Author's Name Yutaka NAKAMURA
2nd Author's Name Hiroshi ISHIGURO
Date 2012-11-07
Paper # IBISML2012-52
Volume (vol) vol.112
Number (no) 279
Page pp.-
#Pages 5