Learning for a Quasi-Passive Biped Walking Robot Using Sample-Reuse Reinforcement Learning

Tsuyoshi UENO; Yutaka NAKAMURA; Takashi TAKUMA; Tomohiro SHIBATA; Koh HOSODA; Shin ISHII

Presentation	2007-03-14 Learning of a robust controller for a biped robot based on a sample-reuse reinforcement learning method Tsuyoshi UENO, Yutaka NAKAMURA, Takashi TAKUMA, Tomohiro SHIBATA, Koh HOSODA, Shin ISHII
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	Recently, many researchers in humanoid robotics have become interested in Quasi-Passive Dynamic Walking (Quasi-PDW), which is similar to human walking. It is desirable that the control parameters in Quasi-PDW be adjusted automatically, because robots often suffer from changes in their physical parameters and in the surrounding environment. Reinforcement learning (RL) can be a key technology for this adaptability, and simulation studies have shown that RL can realize Quasi-PDW. To apply existing RL methods to the control of real robots, however, learning must be accelerated further; otherwise, the robot would break down before acquiring an appropriate controller. For this purpose, this study employs an off-policy natural actor-critic (off-NAC) method, which is able to reuse samples that have already been obtained. This study also proposes a method for adapting the learning rate that works with the off-NAC method. Simulations as well as experiments on a real robot demonstrate that the proposed method realizes fast and stable learning of Quasi-PDW for an unstable biped robot.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Reinforcement Learning / Quasi-Passive Dynamic Walk / Adaptive Control
Paper #	NC2006-151
Date of Issue

Conference Information
Committee	NC
Conference Date	2007/3/7 (1 day)
Place (in Japanese)	(See Japanese page)
Place (in English)
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To	Neurocomputing (NC)
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Learning of a robust controller for a biped robot based on a sample-reuse reinforcement learning method
Sub Title (in English)
Keyword(1)	Reinforcement Learning
Keyword(2)	Quasi-Passive Dynamic Walk
Keyword(3)	Adaptive Control
1st Author's Name	Tsuyoshi UENO
1st Author's Affiliation	Department of Information Science, Nara Institute of Science and Technology
2nd Author's Name	Yutaka NAKAMURA
2nd Author's Affiliation	Department of Engineering, Osaka University
3rd Author's Name	Takashi TAKUMA
3rd Author's Affiliation	Department of Engineering, Osaka University
4th Author's Name	Tomohiro SHIBATA
4th Author's Affiliation	Department of Information Science, Nara Institute of Science and Technology
5th Author's Name	Koh HOSODA
5th Author's Affiliation	Department of Engineering, Osaka University
6th Author's Name	Shin ISHII
6th Author's Affiliation	Department of Information Science, Nara Institute of Science and Technology
Date	2007-03-14
Paper #	NC2006-151
Volume (vol)	vol.106
Number (no)	588
Page	pp.-
#Pages	6
Date of Issue