Presentation 2012-11-07
Stochastic policy gradient method for a stochastic policy using a Gaussian process regression
Yutaka NAKAMURA, Hiroshi ISHIGURO
Abstract(in English) Reinforcement learning (RL) methods that use Gaussian process regression (GP) to approximate the value function have been studied [1]. Thanks to Bayesian reasoning with GPs, the variance of the output can be computed, but the variance of the value estimate yields no direct benefit. In this research, we propose a policy gradient method for a GP-based stochastic policy, in which the output variance is used as the confidence in action selection. We apply our method to a pendulum swing-up control task, and simulation results show that a good controller can be obtained by our method.
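The abstract's key idea, using the GP's predictive variance as the confidence in action selection, can be illustrated with a minimal sketch. The following Python code is an illustration under stated assumptions, not the authors' implementation: it builds a GP from states to actions and samples each action from a Gaussian whose spread is the posterior variance, so the policy explores more widely where it is less confident. The RBF kernel, its hyperparameters, and the toy pendulum-like data are assumptions, and the paper's policy gradient update of the GP targets is not reconstructed here.

import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    # Squared-exponential kernel between row-wise state sets A (m, d) and B (n, d).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return signal_var * np.exp(-0.5 * d2 / length_scale ** 2)

class GPStochasticPolicy:
    # GP regression from states to actions; the posterior variance at a
    # query state serves as the confidence in the action estimate.
    def __init__(self, X, y, noise_var=1e-2):
        # X: training states (n, d); y: training actions (n,). Both assumed given.
        self.X = X
        K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
        self.L = np.linalg.cholesky(K)
        # alpha = K^{-1} y via two triangular solves against the Cholesky factor.
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))

    def predict(self, s):
        # Posterior mean and variance of the action at a single state s (d,).
        k = rbf_kernel(s[None, :], self.X)          # (1, n) cross-covariances
        mu = float(k @ self.alpha)
        v = np.linalg.solve(self.L, k.T)            # (n, 1)
        var = float(rbf_kernel(s[None, :], s[None, :]) - v.T @ v)
        return mu, max(var, 1e-12)

    def act(self, s, rng):
        # Stochastic action: Gaussian around the GP mean; low confidence
        # (high posterior variance) widens the exploration.
        mu, var = self.predict(s)
        return rng.normal(mu, np.sqrt(var))

# Toy usage with made-up pendulum-like states (angle, angular velocity);
# the sin() targets are placeholders, not data from the paper.
rng = np.random.default_rng(0)
X = rng.uniform(-np.pi, np.pi, size=(20, 2))
y = np.sin(X[:, 0])
policy = GPStochasticPolicy(X, y)
print(policy.act(np.array([0.5, -0.1]), rng))

Near the training states the variance shrinks and the policy acts almost deterministically; far from them it reverts toward the prior variance and the sampled actions spread out, which is the variance-as-confidence behavior the abstract describes.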
Keyword(in English) Reinforcement learning / Gaussian process regression / policy gradient method / adaptive control
Paper # IBISML2012-52

Conference Information
Committee IBISML
Conference Date 2012/10/31 (1 day)

Paper Information
Registration To Information-Based Induction Sciences and Machine Learning (IBISML)
Language JPN
Title (in English) Stochastic policy gradient method for a stochastic policy using a Gaussian process regression
Keyword(1) Reinforcement learning
Keyword(2) Gaussian process regression
Keyword(3) policy gradient method
Keyword(4) adaptive control
1st Author's Name Yutaka NAKAMURA
2nd Author's Name Hiroshi ISHIGURO
Date 2012-11-07
Paper # IBISML2012-52
Volume (vol) vol.112
Number (no) 279
Page pp.-
#Pages 5