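A Modification Algorithm of Function Approximator for the Reinforcement Learning with Reusing Mechanism of Avoidance Actions : Proposal and its Application to Motion Learning of Multi-Link Robot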

Presentation	2007-12-22 A Modification Algorithm of Function Approximator for the Reinforcement Learning with Reusing Mechanism of Avoidance Actions : Proposal and its Application to Motion Learning of Multi-Link Robot Akihiko YAMAGUCHI, Norikazu SUGIMOTO, Mitsuo KAWATO
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	Applying a learning method such as reinforcement learning to the motions of multi-link robots incurs a large cost, such as damage from falling down. To overcome this problem, we previously proposed a reusing mechanism for reinforcement learning in which avoidance actions, such as those that prevent falling down, are learned separately from primary actions and then reused when learning new tasks [1]. We also developed a method for applying this mechanism to learning whole-body motions of a 4-link robot whose base is not fixed to the ground. In this paper, we propose a new method that modifies the basis functions of the function approximator for the action value function in order to improve the separation performance, and we demonstrate that the method works effectively in learning whole-body motions of a multi-link robot. Furthermore, we investigate whether the learning cost, namely damage from falling down while learning whole-body motions, is reduced by reusing avoidance actions.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	motion learning / reinforcement learning / reusing / avoidance actions / jumping / serve
Paper #	NC2007-86
Date of Issue

Paper Information
Registration To	Neurocomputing (NC)
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	A Modification Algorithm of Function Approximator for the Reinforcement Learning with Reusing Mechanism of Avoidance Actions : Proposal and its Application to Motion Learning of Multi-Link Robot
Sub Title (in English)
Keyword(1)	motion learning
Keyword(2)	reinforcement learning
Keyword(3)	reusing
Keyword(4)	avoidance actions
Keyword(5)	jumping
Keyword(6)	serve
1st Author's Name	Akihiko YAMAGUCHI
1st Author's Affiliation	Nara Institute of Science and Technology:ATR Computational Neuroscience Laboratories
2nd Author's Name	Norikazu SUGIMOTO
2nd Author's Affiliation	ATR Computational Neuroscience Laboratories
3rd Author's Name	Mitsuo KAWATO
3rd Author's Affiliation	ATR Computational Neuroscience Laboratories:Nara Institute of Science and Technology
Date	2007-12-22
Paper #	NC2007-86
Volume (vol)	vol.107
Number (no)	410
Page	pp.-
#Pages	6