Adaptive Importance Sampling with Automatic Model Selection in Value Function Approximation

Presentation	2007-12-22 Adaptive Importance Sampling with Automatic Model Selection in Value Function Approximation Hirotaka HACHIYA, Takayuki AKIYAMA, Masashi SUGIYAMA
Abstract(in English)	Off-policy reinforcement learning aims to efficiently reuse data samples gathered in the past. A common approach is to use importance sampling to compensate for the bias caused by the difference between the data-collecting policies and the target policy. However, existing off-policy methods often do not explicitly take the variance of value function estimators into account, so their performance tends to be unstable. To cope with this problem, we propose an adaptive importance sampling technique that allows us to actively control the trade-off between bias and variance. We further provide a method for optimally determining the trade-off parameter based on statistical machine learning theory.
Keyword(in English)	Off-policy reinforcement learning / Value function approximation / Importance sampling
Paper #	NC2007-84
Date of Issue
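
The bias-variance trade-off mentioned in the abstract can be illustrated with a small sketch. The record does not state how the adaptive technique is parameterized, so the code below assumes one possible form: each importance weight (target probability divided by behavior probability) is raised to a power nu in [0, 1]. With nu = 0 the weights are all 1 (low variance, biased toward the data-collecting policy); with nu = 1 the estimate is ordinary importance sampling (unbiased, higher variance). The function names, the two-action toy problem, and the parameter name nu are hypothetical and chosen only for illustration; the automatic selection of the trade-off parameter is not shown.

```python
import random


def flattened_weights(target_probs, behavior_probs, nu):
    """Importance weights (target / behavior) raised to the power nu.

    Assumption for illustration: nu in [0, 1] acts as the trade-off
    parameter. nu = 0 ignores the policy mismatch, nu = 1 corrects it fully.
    """
    if not 0.0 <= nu <= 1.0:
        raise ValueError("nu must lie in [0, 1]")
    return [(p / b) ** nu for p, b in zip(target_probs, behavior_probs)]


def weighted_value_estimate(rewards, target_probs, behavior_probs, nu):
    """Average of the rewards, each multiplied by its flattened weight."""
    weights = flattened_weights(target_probs, behavior_probs, nu)
    return sum(w * r for w, r in zip(weights, rewards)) / len(rewards)


def simulate(nu, n_samples=2000, seed=0):
    """Toy two-action problem: data come from a behavior policy,
    but the value of a different target policy is wanted."""
    rng = random.Random(seed)
    behavior = {0: 0.9, 1: 0.1}   # data-collecting policy
    target = {0: 0.1, 1: 0.9}     # policy to be evaluated
    reward = {0: 0.0, 1: 1.0}     # true target value is 0.9
    actions = [0 if rng.random() < behavior[0] else 1 for _ in range(n_samples)]
    return weighted_value_estimate(
        [reward[a] for a in actions],
        [target[a] for a in actions],
        [behavior[a] for a in actions],
        nu,
    )


if __name__ == "__main__":
    for nu in (0.0, 0.5, 1.0):
        print(f"nu={nu:.1f}  estimate={simulate(nu):.3f}")
```

In this toy setting, intermediate values of nu give estimates between the two extremes. Choosing nu from data is the model selection problem the abstract refers to.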

Paper Information
Registration To	Neurocomputing (NC)
Language	ENG
Title (in English)	Adaptive Importance Sampling with Automatic Model Selection in Value Function Approximation
Sub Title (in English)
Keyword(1)	Off-policy reinforcement learning
Keyword(2)	Value function approximation
Keyword(3)	Importance sampling
1st Author's Name	Hirotaka HACHIYA
1st Author's Affiliation	Department of Computer Science, Tokyo Institute of Technology
2nd Author's Name	Takayuki AKIYAMA
2nd Author's Affiliation	Department of Computer Science, Tokyo Institute of Technology
3rd Author's Name	Masashi SUGIYAMA
3rd Author's Affiliation	Department of Computer Science, Tokyo Institute of Technology
Date	2007-12-22
Paper #	NC2007-84
Volume (vol)	vol.107
Number (no)	410
Page	pp.-
#Pages	6