The Exploitation Reinforcement Learning Method on POMDPs (Special Issue on the "Semantic Web" and General Topics)

Presentation	2004-07-29 The Exploitation Reinforcement Learning Method on POMDPs Wataru UEMURA, Atsushi UENO, Shoji TATSUMI
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	This paper proposes Episode Profit Sharing (EPS), which can estimate the received rewards on partially observable Markov decision processes (POMDPs). EPS evaluates all rules in an episode equally and distributes to those rules values corresponding to the length of the episode. We show that EPS can suppress the reinforcement of detour rules. Experiments show that EPS performs well on both MDPs and POMDPs.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Reinforcement Learning / Profit Sharing / POMDPs / Perceptual Aliasing
Paper #	AI2004-12
Date of Issue

Paper Information
Registration To	Artificial Intelligence and Knowledge-Based Processing (AI)
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	The Exploitation Reinforcement Learning Method on POMDPs
Sub Title (in English)
Keyword(1)	Reinforcement Learning
Keyword(2)	Profit Sharing
Keyword(3)	POMDPs
Keyword(4)	Perceptual Aliasing
1st Author's Name	Wataru UEMURA
1st Author's Affiliation	Faculty of Engineering, Osaka City University
2nd Author's Name	Atsushi UENO
2nd Author's Affiliation	Faculty of Engineering, Osaka City University
3rd Author's Name	Shoji TATSUMI
3rd Author's Affiliation	Faculty of Engineering, Osaka City University
Date	2004-07-29
Paper #	AI2004-12
Volume (vol)	vol.104
Number (no)	233
Page	pp.-
#Pages	5
Date of Issue