Presentation | 2011-09-05 Multiarmed Bandit Algorithms Based on Empirical Moments Junya HONDA, Akimichi TAKEMURA, |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In the multiarmed bandit problem a gambler chooses an arm of a slot machine to pull considering a tradeoff between exploration and exploitation. We study the stochastic bandit problem where each arm has a reward distribution supported in a known bounded interval, e.g. [0,1]. For this model, there exists a policy which achieves the theoretical bound asymptotically. However the optimal policy requires a computation of a convex optimization which involves the empirical distribution of each arm. In this paper, we propose a policy which exploits the first d empirical moments for arbitrary d fixed in advance. The asymptotic upper bound of the regret of the policy approaches the theoretical bound as d increases. The proposed policy requires a minimization of KL divergence with moment constraints. We show by the theory of Tchebycheff system that the optimal value is obtained by solving polynomial equations. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | multiarmed bandit problem / Tchebycheff system / moment space / divergence minimization |
Paper # | PRMU2011-60,IBISML2011-19 |
Date of Issue |
Conference Information | |
Committee | IBISML |
---|---|
Conference Date | 2011/8/29(1days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Information-Based Induction Sciences and Machine Learning (IBISML) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Multiarmed Bandit Algorithms Based on Empirical Moments |
Sub Title (in English) | |
Keyword(1) | multiarmed bandit problem |
Keyword(2) | Tchebycheff system |
Keyword(3) | moment space |
Keyword(4) | divergence minimization |
1st Author's Name | Junya HONDA |
1st Author's Affiliation | Graduate School of Frontier Sciences, The University of Tokyo() |
2nd Author's Name | Akimichi TAKEMURA |
2nd Author's Affiliation | Graduate School of Information Science and Technology, The University of Tokyo |
Date | 2011-09-05 |
Paper # | PRMU2011-60,IBISML2011-19 |
Volume (vol) | vol.111 |
Number (no) | 194 |
Page | pp.pp.- |
#Pages | 8 |
Date of Issue |