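Policy Representation Using a Multidimensional Normal Distribution with Covariance for Actor-critic Methods (General (Planning and Decision Making), "Intelligence in Social Systems" and General)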

Satoshi ABE; Atsushi UENO; Masatsugu KIDODE

Presentation	2005/3/7 Stochastic Policy Representation Using a Multidimensional Normal Distribution for Actor-critic Methods Satoshi ABE, Atsushi UENO, Masatsugu KIDODE
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	Actor-critic methods, one family of reinforcement learning methods, are easy to apply and have produced many successful results. Generally, a normal distribution is used as the probability distribution from which the agent selects its action. While interacting with the environment, the agent updates the mean and standard deviation through the policy parameters so that it selects appropriate actions. Conventional methods use normal distributions under the assumption that the output dimensions are independent. In problems such as manipulator trajectory planning and robot walking control, however, the outputs must be coordinated with one another. Because conventional methods cannot take the correlation between outputs into account, it takes a long time to acquire a high-performance policy that selects coordinated actions. In this paper, we aim to speed up learning and improve performance by adopting a multivariate normal distribution with a variance-covariance matrix as the probability distribution for action selection. We conduct experiments on manipulator trajectory planning to demonstrate the effectiveness of this method.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	reinforcement learning / actor-critic methods / multidimensional normal distribution / manipulator
Paper #	AI2004-72
Date of Issue
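
The idea summarized in the abstract, drawing all action dimensions jointly from a normal distribution with a full variance-covariance matrix instead of from independent per-dimension normals, can be illustrated with a minimal sketch. This is not the authors' implementation: representing the covariance by its Cholesky factor is an assumption made here for convenience, and the function names (`cholesky`, `sample_action`, `log_prob`) and the example values are hypothetical.

```python
import math
import random


def cholesky(sigma):
    """Return lower-triangular L with L L^T == sigma.

    sigma is assumed to be a symmetric positive-definite matrix
    given as a list of lists.
    """
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def sample_action(mean, L, rng):
    """Draw a = mean + L z with z ~ N(0, I), so that Cov[a] = L L^T."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [
        m + sum(L[i][k] * z[k] for k in range(i + 1))
        for i, m in enumerate(mean)
    ]


def log_prob(action, mean, L):
    """Log-density of N(mean, L L^T) evaluated at action."""
    n = len(mean)
    d = [a - m for a, m in zip(action, mean)]
    # Solve L y = d by forward substitution; then d^T Sigma^{-1} d = y^T y.
    y = []
    for i in range(n):
        y.append((d[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * (sum(v * v for v in y) + log_det + n * math.log(2.0 * math.pi))


if __name__ == "__main__":
    rng = random.Random(0)
    mean = [0.0, 0.0]                  # hypothetical two-output policy mean
    sigma = [[1.0, 0.8], [0.8, 1.0]]   # hypothetical covariance, positively correlated outputs
    L = cholesky(sigma)
    action = sample_action(mean, L, rng)
    print("sampled action:", action)
    print("log-probability:", log_prob(action, mean, L))
```

Setting the off-diagonal entries of `sigma` to zero recovers the conventional independent-output case, so the sketch also shows what the extra covariance terms add. In an actor-critic learner, the mean and the covariance entries would be outputs of the policy parameters, and `log_prob` is the quantity whose gradient a policy update would use.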

Conference Information
Committee	AI
Conference Date	2005/3/7 (1 day)
Place (in Japanese)	(See Japanese page)
Place (in English)
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To	Artificial Intelligence and Knowledge-Based Processing (AI)
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Stochastic Policy Representation Using a Multidimensional Normal Distribution for Actor-critic Methods
Sub Title (in English)
Keyword(1)	reinforcement learning
Keyword(2)	actor-critic methods
Keyword(3)	multidimensional normal distribution
Keyword(4)	manipulator
1st Author's Name	Satoshi ABE
1st Author's Affiliation	Nara Institute of Science and Technology
2nd Author's Name	Atsushi UENO
2nd Author's Affiliation	Osaka City University
3rd Author's Name	Masatsugu KIDODE
3rd Author's Affiliation	Nara Institute of Science and Technology
Date	2005/3/7
Paper #	AI2004-72
Volume (vol)	vol.104
Number (no)	726
Page	pp.-
#Pages	6
Date of Issue