Presentation | 2012-03-12 Kernel Bellman Equations in POMDPs Yu NISHIYAMA, Abdeslam BOULARIAS, Arthur GRETTON, Kenji FUKUMIZU |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | We propose to handle POMDPs in reproducing kernel Hilbert spaces (RKHSs) using recent kernel methods for embedding distributions in an RKHS and the kernel Bayes rule (KBR). We embed the Bellman equations into equations over an RKHS on the set of states and define value functions as functions over the RKHS. We then learn a policy as a mapping from elements of the RKHS to actions. We give empirical representations of the Bellman equations and use them in value iteration algorithms with the kernel Bayes rule. In experiments, we demonstrate kernel value iteration with finite horizons on some POMDP benchmarks and show that the policy converged to the policy computed with the exact model. We also propose QMDP approximations with these kernel methods for setting initial values and for pruning action edges to improve computational efficiency. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | POMDP / RKHS / Distribution Embeddings / Value Iteration / QMDP |
Paper # | IBISML2011-92 |
Date of Issue |
Conference Information | |
Committee | IBISML |
---|---|
Conference Date | 2012/3/5 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Information-Based Induction Sciences and Machine Learning (IBISML) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Kernel Bellman Equations in POMDPs |
Sub Title (in English) | |
Keyword(1) | POMDP |
Keyword(2) | RKHS |
Keyword(3) | Distribution Embeddings |
Keyword(4) | Value Iteration |
Keyword(5) | QMDP |
1st Author's Name | Yu NISHIYAMA |
1st Author's Affiliation | Center for Statistical Machine Learning Research, The Institute of Statistical Mathematics |
2nd Author's Name | Abdeslam BOULARIAS |
2nd Author's Affiliation | Max Planck Institute for Intelligent Systems |
3rd Author's Name | Arthur GRETTON |
3rd Author's Affiliation | Gatsby Unit, UCL / Max Planck Institute for Intelligent Systems |
4th Author's Name | Kenji FUKUMIZU |
4th Author's Affiliation | Center for Statistical Machine Learning Research, The Institute of Statistical Mathematics |
Date | 2012-03-12 |
Paper # | IBISML2011-92 |
Volume (vol) | vol.111 |
Number (no) | 480 |
Page | pp.- |
#Pages | 8 |
Date of Issue |