Presentation 2009-09-14
Parameter acquisition of an evaluation function for games by reinforcement learning
Yasuhiro TAJIMA,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In finite two-person zero-sum perfect-information games, the best move can be found by minimax search on the game tree with an evaluation function. In this paper, we propose a method for acquiring the parameters of an evaluation function by Q-learning. Our method has three variations of the reward given on a state transition: (1) the winning rate of random simulations, (2) the winning rate output by the UCB1 algorithm, and (3) the winning rate output by the UCT algorithm. We evaluate the effectiveness of the method experimentally.
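As an illustration of reward variation (2), the following is a minimal sketch of the standard UCB1 rule used to estimate a winning rate. It is not taken from the paper: the function name `ucb1_win_rate`, the modelling of each candidate move as a Bernoulli arm with a fixed win probability, and the choice to report the most-played arm are all assumptions made for this example.

```python
import math
import random


def ucb1_win_rate(win_probs, n_trials, seed=0):
    """Run UCB1 over simulated Bernoulli arms (hypothetical stand-ins for
    candidate moves) and return (most-played arm, its empirical win rate)."""
    rng = random.Random(seed)
    k = len(win_probs)
    pulls = [0] * k  # number of playouts per arm
    wins = [0] * k   # number of won playouts per arm

    for t in range(n_trials):
        if t < k:
            # Play every arm once so each mean is defined.
            arm = t
        else:
            total = sum(pulls)
            # UCB1 index: empirical mean plus exploration bonus.
            arm = max(
                range(k),
                key=lambda a: wins[a] / pulls[a]
                + math.sqrt(2.0 * math.log(total) / pulls[a]),
            )
        pulls[arm] += 1
        # Assumed playout model: a win with probability win_probs[arm].
        if rng.random() < win_probs[arm]:
            wins[arm] += 1

    best = max(range(k), key=lambda a: pulls[a])
    return best, wins[best] / pulls[best]


if __name__ == "__main__":
    best, rate = ucb1_win_rate([0.3, 0.5, 0.7], n_trials=2000)
    print(best, rate)
```

In a setting like the one the abstract describes, the returned winning rate could serve as the reward passed to a Q-learning update, with real game playouts replacing the simulated Bernoulli draws; how the paper actually combines the two is not stated in this record.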
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Q learning / game tree / evaluation function / k-armed bandit problem
Paper # COMP2009-28
Date of Issue

Conference Information
Committee COMP
Conference Date 2009/9/7 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Theoretical Foundations of Computing (COMP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Parameter acquisition of an evaluation function for games by reinforcement learning
Sub Title (in English)
Keyword(1) Q learning
Keyword(2) game tree
Keyword(3) evaluation function
Keyword(4) k-armed bandit problem
1st Author's Name Yasuhiro TAJIMA
1st Author's Affiliation Okayama Prefectural University Faculty of Information Engineering
Date 2009-09-14
Paper # COMP2009-28
Volume (vol) vol.109
Number (no) 195
Page pp.-
#Pages 6