Presentation 2004/3/12
Reinforcement learning based on a policy gradient method for biped locomotion
Takeshi MORI, Yutaka NAKAMURA, Shin ISHII
Abstract (in English) Recently, an actor-critic method has been proposed that uses a lower-dimensional projection of the value function, based on a policy gradient method. In this actor-critic method, approximating the value function is relatively easy because the dimension of the projection space is lower than that of the state and action spaces. This makes the method easier to apply to real problems such as robot control. In our previous study, we presented a CPG-actor-critic model, a reinforcement learning model based on biological concepts, and applied it to an automatic control problem of a biped robot. In this report, we apply the policy-gradient-based actor-critic method to the CPG-actor-critic model and show that our method achieves robust control of the biped robot.
Keyword (in English) Reinforcement learning / Policy gradient method / Actor-critic method / Biped locomotion / Central pattern generator
Paper # NC2003-206
Date of Issue

Conference Information
Committee NC
Conference Date 2004/3/12 (1 day)
Place (in English)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Neurocomputing (NC)
Language JPN
Title (in English) Reinforcement learning based on a policy gradient method for biped locomotion
Sub Title (in English)
Keyword(1) Reinforcement learning
Keyword(2) Policy gradient method
Keyword(3) Actor-critic method
Keyword(4) Biped locomotion
Keyword(5) Central pattern generator
1st Author's Name Takeshi MORI
1st Author's Affiliation Graduate School of Information Science, Nara Institute of Science and Technology
2nd Author's Name Yutaka NAKAMURA
2nd Author's Affiliation Graduate School of Information Science, Nara Institute of Science and Technology
3rd Author's Name Shin ISHII
3rd Author's Affiliation Graduate School of Information Science, Nara Institute of Science and Technology
Date 2004/3/12
Paper # NC2003-206
Volume (vol) vol.103
Number (no) 734
Page pp.-
#Pages 6
Date of Issue