Presentation 2012-11-07
Weighted Likelihood Policy Search
Tsuyoshi UENO, Kohei HAYASHI, Takashi WASHIO, Yoshinobu KAWAHARA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Reinforcement learning (RL) methods based on direct policy search (DPS) have been actively studied as an efficient approach to complex Markov decision processes (MDPs). Although they have brought much progress in practical applications of RL, model selection for the policy remains an open problem in DPS. In this paper, we propose a new DPS method, weighted likelihood policy search (WLPS), in which a policy is learned efficiently through weighted likelihood estimation. WLPS naturally connects DPS to the statistical inference problem, so that various sophisticated techniques from statistics can be applied directly to DPS problems. Following the idea of the information criterion, we then develop a new measure for model comparison in DPS based on the weighted log-likelihood.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Reinforcement learning / direct policy search / asymptotic analysis / model selection
Paper # IBISML2012-57
Date of Issue

Conference Information
Committee IBISML
Conference Date 2012/10/31 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Information-Based Induction Sciences and Machine Learning (IBISML)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Weighted Likelihood Policy Search
Sub Title (in English)
Keyword(1) Reinforcement learning
Keyword(2) direct policy search
Keyword(3) asymptotic analysis
Keyword(4) model selection
1st Author's Name Tsuyoshi UENO
1st Author's Affiliation Minato Discrete Structure Manipulation System Project, Japan Science and Technology Agency
2nd Author's Name Kohei HAYASHI
2nd Author's Affiliation Department of Mathematical Informatics, The University of Tokyo:JSPS
3rd Author's Name Takashi WASHIO
3rd Author's Affiliation The Institute of Scientific and Industrial Research, Osaka University:Minato Discrete Structure Manipulation System Project, Japan Science and Technology Agency
4th Author's Name Yoshinobu KAWAHARA
4th Author's Affiliation The Institute of Scientific and Industrial Research, Osaka University
Date 2012-11-07
Paper # IBISML2012-57
Volume (vol) vol.112
Number (no) 279
Page pp.-
#Pages 6