Presentation 2004-07-29
The Exploitation Reinforcement Learning Method on POMDPs
Wataru UEMURA, Atsushi UENO, Shoji TATSUMI,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper proposes Episode Profit Sharing (EPS), a method that can estimate the rewards received on partially observable Markov decision processes (POMDPs). EPS evaluates all rules in an episode equally, distributing to each rule a value that corresponds to the length of the episode. We show that EPS can suppress the reinforcement of detour rules. Experiments show that EPS performs well on both MDPs and POMDPs.
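The credit assignment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes each rule in an episode receives reward / (episode length), and the names `eps_update`, `weights`, and the observation/action labels are hypothetical.

```python
from collections import defaultdict


def eps_update(weights, episode, reward):
    """Credit every (observation, action) rule in an episode equally.

    Assumption (not stated in the abstract): each rule occurrence gets
    reward / len(episode), so longer episodes earn less credit per rule.
    """
    if not episode:
        return weights
    share = reward / len(episode)
    for rule in episode:
        weights[rule] += share
    return weights


weights = defaultdict(float)

# A direct two-step episode and a four-step episode containing a detour.
short_episode = [("o1", "right"), ("o2", "right")]
long_episode = [("o1", "left"), ("o0", "right"), ("o1", "right"), ("o2", "right")]

eps_update(weights, short_episode, reward=1.0)
eps_update(weights, long_episode, reward=1.0)

for rule, value in sorted(weights.items()):
    print(rule, value)
```

Under this assumption, the detour rule ("o1", "left") appears only in the longer episode and so accumulates less weight than the rules on the direct path, which is one way to read the claim that EPS suppresses the reinforcement of detour rules.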
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Reinforcement Learning / Profit Sharing / POMDPs / Perceptual Aliasing
Paper # AI2004-12
Date of Issue

Conference Information
Committee AI
Conference Date 2004/7/22 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Artificial Intelligence and Knowledge-Based Processing (AI)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) The Exploitation Reinforcement Learning Method on POMDPs
Sub Title (in English)
Keyword(1) Reinforcement Learning
Keyword(2) Profit Sharing
Keyword(3) POMDPs
Keyword(4) Perceptual Aliasing
1st Author's Name Wataru UEMURA
1st Author's Affiliation Faculty of Engineering, Osaka City University
2nd Author's Name Atsushi UENO
2nd Author's Affiliation Faculty of Engineering, Osaka City University
3rd Author's Name Shoji TATSUMI
3rd Author's Affiliation Faculty of Engineering, Osaka City University
Date 2004-07-29
Paper # AI2004-12
Volume (vol) vol.104
Number (no) 233
Page pp.-
#Pages 5
Date of Issue