Presentation | 2012-11-07 Weighted Likelihood Policy Search Tsuyoshi UENO, Kohei HAYASHI, Takashi WASHIO, Yoshinobu KAWAHARA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Reinforcement learning (RL) methods based on direct policy search (DPS) have been actively studied as an efficient approach to complicated Markov decision processes (MDPs). Although they have brought much progress in practical applications of RL, model selection for the policy remains an open problem in DPS. In this paper, we propose a new DPS method, weighted likelihood policy search (WLPS), in which a policy is learned efficiently through weighted likelihood estimation. WLPS naturally connects DPS to the statistical inference problem, so various sophisticated techniques from statistics can be applied to DPS problems directly. Following the idea of the information criterion, we then develop a new measure for model comparison in DPS based on the weighted log-likelihood. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Reinforcement learning / direct policy search / asymptotic analysis / model selection |
Paper # | IBISML2012-57 |
Date of Issue |
Conference Information | |
Committee | IBISML |
---|---|
Conference Date | 2012/10/31 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Information-Based Induction Sciences and Machine Learning (IBISML) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Weighted Likelihood Policy Search |
Sub Title (in English) | |
Keyword(1) | Reinforcement learning |
Keyword(2) | direct policy search |
Keyword(3) | asymptotic analysis |
Keyword(4) | model selection |
1st Author's Name | Tsuyoshi UENO |
1st Author's Affiliation | Minato Discrete Structure Manipulation System Project, Japan Science and Technology Agency |
2nd Author's Name | Kohei HAYASHI |
2nd Author's Affiliation | Department of Mathematical Informatics, The University of Tokyo / JSPS |
3rd Author's Name | Takashi WASHIO |
3rd Author's Affiliation | The Institute of Scientific and Industrial Research, Osaka University / Minato Discrete Structure Manipulation System Project, Japan Science and Technology Agency |
4th Author's Name | Yoshinobu KAWAHARA |
4th Author's Affiliation | The Institute of Scientific and Industrial Research, Osaka University |
Date | 2012-11-07 |
Paper # | IBISML2012-57 |
Volume (vol) | vol.112 |
Number (no) | 279 |
Page | pp.- |
#Pages | 6 |
Date of Issue |