Parameter acquisition of an evaluation function for games by reinforcement learning

Presentation	2009-09-14 Parameter acquisition of an evaluation function for games by reinforcement learning Yasuhiro TAJIMA
Abstract(in English)	In finite two-person zero-sum perfect-information games, the best move can be found by minimax search on the game tree with an evaluation function. In this paper, we propose a method for acquiring the parameters of an evaluation function by Q-learning. Our method has three variants, which differ in the reward given on a state transition: (1) the winning rate of random simulations, (2) the winning rate of the output of the UCB1 algorithm, and (3) the winning rate of the UCT algorithm. We then evaluate the effectiveness of our method experimentally.
Keyword(in English)	Q learning / game tree / evaluation function / k-armed bandit problem
Paper #	COMP2009-28

Paper Information
Registration To	Theoretical Foundations of Computing (COMP)
Language	JPN
Title (in English)	Parameter acquisition of an evaluation function for games by reinforcement learning
Keyword(1)	Q learning
Keyword(2)	game tree
Keyword(3)	evaluation function
Keyword(4)	k-armed bandit problem
1st Author's Name	Yasuhiro TAJIMA
1st Author's Affiliation	Okayama Prefectural University, Faculty of Information Engineering
Date	2009-09-14
Paper #	COMP2009-28
Volume (vol)	vol.109
Number (no)	195
#Pages	6
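
The abstract's first reward variant (the winning rate of random simulations used as the reward in Q-learning) can be illustrated with a minimal sketch. It is not the paper's implementation: the toy Nim game, the hand-made `features`, the random opponent, and every hyperparameter below are assumptions chosen only to keep the example small.

```python
import random

MAX_TAKE = 3  # assumed toy game: take 1..3 stones; taking the last stone wins


def playout_win_rate(stones, to_move, player, n_sims, rng):
    """Share of uniformly random playouts won by `player` from this position."""
    wins = 0
    for _ in range(n_sims):
        # `last` tracks who moved most recently; that player wins when s hits 0.
        s, mover, last = stones, to_move, 1 - to_move
        while s > 0:
            s -= rng.randint(1, min(MAX_TAKE, s))
            last, mover = mover, 1 - mover
        wins += (last == player)
    return wins / n_sims


def features(stones_left):
    """Hypothetical features of the position handed to the opponent."""
    multiple = 1.0 if stones_left % (MAX_TAKE + 1) == 0 else 0.0
    return [1.0, multiple, stones_left / 10.0]


def legal_moves(stones):
    return range(1, min(MAX_TAKE, stones) + 1)


def q_value(w, stones, take):
    """Linear evaluation of the position reached by taking `take` stones."""
    return sum(wi * fi for wi, fi in zip(w, features(stones - take)))


def train(episodes=200, start=10, alpha=0.05, gamma=0.5, eps=0.2,
          n_sims=20, seed=0):
    rng = random.Random(seed)
    w = [0.0, 0.0, 0.0]  # parameters of the evaluation function
    for _ in range(episodes):
        stones = start
        while stones > 0:
            moves = list(legal_moves(stones))
            if rng.random() < eps:  # epsilon-greedy exploration
                take = rng.choice(moves)
            else:
                take = max(moves, key=lambda a: q_value(w, stones, a))
            after = stones - take
            # Reward for this transition: simulated win rate of the learner
            # (player 0) with the opponent (player 1) to move.
            r = playout_win_rate(after, 1, 0, n_sims, rng)
            # Assumed opponent model: a uniformly random reply.
            nxt = after - rng.randint(1, min(MAX_TAKE, after)) if after > 0 else 0
            best_next = max((q_value(w, nxt, a) for a in legal_moves(nxt)),
                            default=0.0)
            td_error = r + gamma * best_next - q_value(w, stones, take)
            for i, f in enumerate(features(after)):
                w[i] += alpha * td_error * f  # Q-learning step on the parameters
            stones = nxt
    return w
```

Calling `train()` returns the learned weight vector. Swapping `playout_win_rate` for a UCB1-based or UCT-based estimate would correspond to the other two reward variants named in the abstract, which this sketch does not cover.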