Presentation 2005/3/7
Stochastic Policy Representation Using a Multidimensional Normal Distribution for Actor-critic Methods
Satoshi ABE, Atsushi UENO, Masatsugu KIDODE,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Actor-critic methods, one family of reinforcement learning methods, are easy to apply to such problems and have produced many results. Generally, a normal distribution is used as the probability distribution from which the agent selects actions. The agent updates the mean and standard deviation through the policy parameters so that it selects appropriate actions while interacting with the environment. Conventional methods use normal distributions under the assumption that the output dimensions are independent. In problems such as manipulator trajectory planning and robot walking control, however, every output must cooperate with the others. Conventional methods cannot take this correlation into account, so it takes a long time to acquire a high-performance policy that selects actions cooperatively. In this paper, we aim to speed up learning and improve performance by adopting a multivariate normal distribution with a variance-covariance matrix as the probability distribution for action selection. We conduct experiments on manipulator trajectory planning to demonstrate the effectiveness of this method.
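The contrast drawn in the abstract, between sampling each action dimension independently and sampling from a multivariate normal distribution with a full covariance matrix, can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional action, the mean, the covariance values, and the function names sample_independent and sample_correlated are all hypothetical, and the learning updates of the actor and critic are omitted.

```python
import numpy as np

def sample_independent(mean, std, rng, n):
    """Conventional policy: each action dimension has its own normal distribution."""
    return mean + std * rng.standard_normal((n, mean.size))

def sample_correlated(mean, chol, rng, n):
    """Multivariate normal policy with covariance chol @ chol.T (chol is lower-triangular)."""
    return mean + rng.standard_normal((n, mean.size)) @ chol.T

rng = np.random.default_rng(0)

# Hypothetical two-joint action: the values below are illustrative assumptions only.
mean = np.array([0.0, 0.0])
std = np.array([1.0, 1.0])
cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])
chol = np.linalg.cholesky(cov)  # lower-triangular factor of the covariance

a_ind = sample_independent(mean, std, rng, 20000)
a_cor = sample_correlated(mean, chol, rng, 20000)

# Sample correlation between the two action dimensions under each policy.
corr_ind = np.corrcoef(a_ind.T)[0, 1]
corr_cor = np.corrcoef(a_cor.T)[0, 1]
print("independent policy correlation:", corr_ind)
print("multivariate policy correlation:", corr_cor)
```

Parameterizing the covariance through a lower-triangular factor is one common way to keep the matrix valid while it is being adjusted; whether the paper uses this parameterization is not stated in the abstract.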
Keyword(in Japanese) (See Japanese page)
Keyword(in English) reinforcement learning / actor-critic methods / multidimensional normal distribution / manipulator
Paper # AI2004-72
Date of Issue

Conference Information
Committee AI
Conference Date 2005/3/7 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Artificial Intelligence and Knowledge-Based Processing (AI)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Stochastic Policy Representation Using a Multidimensional Normal Distribution for Actor-critic Methods
Sub Title (in English)
Keyword(1) reinforcement learning
Keyword(2) actor-critic methods
Keyword(3) multidimensional normal distribution
Keyword(4) manipulator
1st Author's Name Satoshi ABE
1st Author's Affiliation Nara Institute of Science and Technology
2nd Author's Name Atsushi UENO
2nd Author's Affiliation Osaka City University
3rd Author's Name Masatsugu KIDODE
3rd Author's Affiliation Nara Institute of Science and Technology
Date 2005/3/7
Paper # AI2004-72
Volume (vol) vol.104
Number (no) 726
Page pp.-
#Pages 6
Date of Issue