Relationship between Computational Performance and Task Difficulty of Reinforcement Learning Methods Using Reward Machines

Presentation	2022-03-29 Relationship between Computational Performance and Task Difficulty of Reinforcement Learning Methods Using Reward Machines Ryuji Watanabe, Gouhei Tanaka
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	In reinforcement learning, tasks whose reward is not determined immediately require the learner to take the history of past state transitions into account. A Reward Machine is a method that divides a task into parts and learns the reward function for each part. Reinforcement learning methods using Reward Machines have been shown to learn faster than conventional methods such as Q-learning and to guarantee convergence to the optimal solution. In this report, we conduct numerical experiments on several tasks in a grid-like environment, varying the number of symbols required to acquire a reward, the structure of the reward function, and the environment settings, and evaluate the changes in the per-episode rate of reward acquisition. We also discuss the effect of task difficulty on computational performance based on the experimental results.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Reinforcement learning / Non-Markov decision process / Reward Machines
Paper #	MSS2021-70,NLP2021-141
Date of Issue	2022-03-21 (MSS, NLP)

Conference Information
Committee	MSS / NLP
Conference Date	2022/3/28 (2 days)
Place (in Japanese)	(See Japanese page)
Place (in English)	Online
Topics (in Japanese)	(See Japanese page)
Topics (in English)	MSS, NLP, Work In Progress (MSS only), etc.
Chair	Atsuo Ozaki(Osaka Inst. of Tech.) / Takuji Kosaka(Chukyo Univ.)
Vice Chair	Shingo Yamaguchi(Yamaguchi Univ.) / Akio Tsuneda(Kumamoto Univ.)
Secretary	Shingo Yamaguchi(Hokkaido Univ.) / Akio Tsuneda(NEC)
Assistant	Masato Shirai(Shimane Univ.) / Hideyuki Kato(Oita Univ.) / Yuichi Yokoi(Nagasaki Univ.)

Paper Information
Registration To	Technical Committee on Mathematical Systems Science and its Applications / Technical Committee on Nonlinear Problems
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Relationship between Computational Performance and Task Difficulty of Reinforcement Learning Methods Using Reward Machines
Sub Title (in English)	*
Keyword(1)	Reinforcement learning
Keyword(2)	Non-Markov decision process
Keyword(3)	Reward Machines
1st Author's Name	Ryuji Watanabe
1st Author's Affiliation	The University of Tokyo(The Univ. of Tokyo)
2nd Author's Name	Gouhei Tanaka
2nd Author's Affiliation	The University of Tokyo(The Univ. of Tokyo)
Date	2022-03-29
Paper #	MSS2021-70,NLP2021-141
Volume (vol)	vol.121
Number (no)	MSS-443,NLP-444
Page	pp.77-82(MSS), pp.77-82(NLP)
#Pages	6
Date of Issue	2022-03-21 (MSS, NLP)