Presentation 2007-03-14
Learning of a robust controller for a biped robot based on a sample-reuse reinforcement learning method
Tsuyoshi UENO, Yutaka NAKAMURA, Takashi TAKUMA, Tomohiro SHIBATA, Koh Hosoda, Shin ISHI
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Recently, many researchers in humanoid robotics have become interested in Quasi-Passive Dynamic Walking (Quasi-PDW), which resembles human walking. It is desirable for the control parameters in Quasi-PDW to be adjusted automatically, because robots often undergo changes in their physical parameters and in the surrounding environment. Reinforcement learning (RL) can be a key technology for this adaptability, and simulation studies have shown that RL can realize Quasi-PDW. To apply existing RL methods to real robots, however, learning must be accelerated further; otherwise, the robot would break down before acquiring an appropriate controller. For this purpose, this study employs an off-policy natural actor-critic (off-NAC) method, which can reuse samples that have already been obtained. This study also proposes a method for adapting the learning rate that works with the off-NAC method. Simulations as well as real-robot experiments demonstrate that the proposed method achieves fast and stable learning of Quasi-PDW for an unstable biped robot.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Reinforcement Learning / Quasi-Passive Dynamic Walk / Adaptive Control
Paper # NC2006-151
Date of Issue

Conference Information
Committee NC
Conference Date 2007/3/7 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Neurocomputing (NC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Learning of a robust controller for a biped robot based on a sample-reuse reinforcement learning method
Sub Title (in English)
Keyword(1) Reinforcement Learning
Keyword(2) Quasi-Passive Dynamic Walk
Keyword(3) Adaptive Control
1st Author's Name Tsuyoshi UENO
1st Author's Affiliation Department of Information Science, Nara Institute of Science and Technology
2nd Author's Name Yutaka NAKAMURA
2nd Author's Affiliation Department of Engineering, Osaka University
3rd Author's Name Takashi TAKUMA
3rd Author's Affiliation Department of Engineering, Osaka University
4th Author's Name Tomohiro SHIBATA
4th Author's Affiliation Department of Information Science, Nara Institute of Science and Technology
5th Author's Name Koh Hosoda
5th Author's Affiliation Department of Engineering, Osaka University
6th Author's Name Shin ISHI
6th Author's Affiliation Department of Information Science, Nara Institute of Science and Technology
Date 2007-03-14
Paper # NC2006-151
Volume (vol) vol.106
Number (no) 588
Page pp.-
#Pages 6
Date of Issue