Presentation 2004/6/18
A reinforcement learning for a policy involving value-directed internal state
Yutaka NAKAMURA, Shin ISHII
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Many studies on partially observable Markov decision processes employ a "belief state" that represents the state of the environment in order to estimate the value function. However, obtaining the value function is often intractable because the space of belief states is very large. Recently, policy gradient methods that involve value learning have been developed and shown to be efficient. In this report, we propose a natural policy gradient method for a policy that involves an internal state. Computer simulations show that our reinforcement learning method acquires a good controller for a linear dynamical system with unobservable variables.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Reinforcement learning / policy gradient method / natural policy gradient method / partially observable Markov decision process / least squares temporal difference learning
Paper # NC2004-33
Date of Issue

Conference Information
Committee NC
Conference Date 2004/6/18 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Neurocomputing (NC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A reinforcement learning for a policy involving value-directed internal state
Sub Title (in English)
Keyword(1) Reinforcement learning
Keyword(2) policy gradient method
Keyword(3) natural policy gradient method
Keyword(4) partially observable Markov decision process
Keyword(5) least squares temporal difference learning
1st Author's Name Yutaka NAKAMURA
1st Author's Affiliation Nara Institute of Science and Technology, Theoretical Life-Science Laboratory:CREST, Japan Science and Technology Agency
2nd Author's Name Shin ISHII
2nd Author's Affiliation Nara Institute of Science and Technology, Theoretical Life-Science Laboratory:CREST, Japan Science and Technology Agency
Date 2004/6/18
Paper # NC2004-33
Volume (vol) vol.104
Number (no) 140
Page pp.-
#Pages 6
Date of Issue