Presentation 2017-03-13
Estimation of the change of agent's behavior strategy using state-action history
Shihori Uchida, Shigeyuki Oba, Shin Ishii
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Reinforcement learning (RL) is a model of the learning process by which animals and intelligent agents obtain an optimal behavioral policy through interactions with unknown environments. Inverse reinforcement learning (IRL) is the inverse problem, in which characteristics of the RL agent, such as its reward function, are estimated from the history of the agent's behaviors. In an uncertain environment, the RL agent needs to balance following the currently good behavioral policy (exploitation) against following an exploratory policy that resolves the uncertainty of the environment (exploration). Existing IRL methods are not appropriate for identifying the RL agent's characteristics when it takes a mixed strategy, switching between exploitation and exploration depending on its situation. In this study, we propose a new IRL method that dissociates different behavioral policies sharing a common reward function. Our computer simulation showed that our method successfully identifies not only the timing of the policy change but also the other RL parameters, such as behavioral randomness and the common reward function, from the agent's behaviors alone.
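Illustrative sketch (not the authors' method): the abstract does not give the algorithm, so the Python below only illustrates the general idea of locating a policy change from a state-action history. It assumes, purely for illustration, that both policies are softmax policies over one shared table of action values (standing in for the common reward function), that they differ only in an inverse temperature (standing in for behavioral randomness), and that these quantities are already known. The names `softmax_probs`, `log_likelihood`, `estimate_change_point`, `q_table`, `beta_explore`, and `beta_exploit` are hypothetical. The change point is taken as the switch time that maximizes the likelihood of the observed history.

```python
import math

def softmax_probs(q_values, beta):
    """Action probabilities of a softmax policy with inverse temperature beta."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]

def log_likelihood(history, q_table, beta):
    """Log-likelihood of (state, action) pairs under a single softmax policy."""
    return sum(math.log(softmax_probs(q_table[s], beta)[a]) for s, a in history)

def estimate_change_point(history, q_table, beta_explore, beta_exploit):
    """Return the switch time t that maximizes the likelihood of
    'exploratory policy for steps [0, t), exploitative policy for steps [t, T)'.
    Assumption: exactly one switch, and both betas are known in advance."""
    best_t, best_ll = 0, -math.inf
    for t in range(len(history) + 1):
        ll = (log_likelihood(history[:t], q_table, beta_explore)
              + log_likelihood(history[t:], q_table, beta_exploit))
        if ll > best_ll:
            best_t, best_ll = t, ll
    return best_t

# Hypothetical toy data: one state, two actions, action 0 has the higher value.
q_table = [[1.0, 0.0]]
# Three early choices of the low-value action, then five of the high-value one.
history = [(0, 1)] * 3 + [(0, 0)] * 5
change_point = estimate_change_point(history, q_table,
                                     beta_explore=0.0, beta_exploit=5.0)
print(change_point)  # 3
```

A full IRL treatment would estimate the reward function and the randomness parameters jointly with the change point; fixing them here keeps the sketch short.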
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Reinforcement learning / Inverse reinforcement learning / Behavior strategy
Paper # NC2016-65
Date of Issue 2017-03-06 (NC)

Conference Information
Committee MBE / NC
Conference Date 2017/3/13 (2 days)
Place (in Japanese) (See Japanese page)
Place (in English) Kikai-Shinko-Kaikan Bldg.
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Yutaka Fukuoka(Kogakuin Univ.) / Shigeo Sato(Tohoku Univ.)
Vice Chair Kazuki Nakajima(Univ. of Toyama) / Masafumi Hagiwara(Keio Univ.)
Secretary Kazuki Nakajima(Kogakuin Univ.) / Masafumi Hagiwara(Toyama Pref. Univ.)
Assistant Ryota Horie(Shibaura Inst. of Tech.) / Kim Juhyon(Univ. of Toyama) / Hisanao Akima(Tohoku Univ.) / Yoshihisa Shinozawa(Keio Univ.)

Paper Information
Registration To Technical Committee on ME and Bio Cybernetics / Technical Committee on Neurocomputing
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Estimation of the change of agent's behavior strategy using state-action history
Sub Title (in English)
Keyword(1) Reinforcement learning
Keyword(2) Inverse reinforcement learning
Keyword(3) Behavior strategy
1st Author's Name Shihori Uchida
1st Author's Affiliation Kyoto University(Kyoto Univ.)
2nd Author's Name Shigeyuki Oba
2nd Author's Affiliation Kyoto University(Kyoto Univ.)
3rd Author's Name Shin Ishii
3rd Author's Affiliation Kyoto University(Kyoto Univ.)
Date 2017-03-13
Paper # NC2016-65
Volume (vol) vol.116
Number (no) NC-521
Page pp.7-12(NC)
#Pages 6
Date of Issue 2017-03-06 (NC)