Paper Abstract and Keywords |
Presentation |
2022-03-29 10:05
Relationship between Computational Performance and Task Difficulty of Reinforcement Learning Methods Using Reward Machines Ryuji Watanabe, Gouhei Tanaka (The Univ. of Tokyo) MSS2021-70 NLP2021-141 |
Abstract |
(in Japanese) |
(See Japanese page) |
(in English) |
In reinforcement learning, tasks whose reward is not determined immediately require the learner to take the history of past state transitions into account. Reward machines address this by dividing a task into parts and learning the reward function for each part. Reinforcement learning methods using reward machines have been shown to learn faster than conventional methods such as Q-learning and to guarantee convergence to the optimal solution.
In this report, we conduct numerical experiments on several tasks in a grid-like environment, varying the number of symbols required to acquire a reward, the structure of the reward function, and the environment settings, and we evaluate how the rate of reward acquisition changes from episode to episode. Based on the experimental results, we also discuss the effect of task difficulty on computational performance. |
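As a rough illustration of the idea of dividing a task into parts, the sketch below assumes the common formulation of a reward machine as a finite-state machine driven by high-level symbols. The class name, the state names, and the two-symbol task are hypothetical and are not taken from the report.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Finite-state machine whose transitions emit rewards (illustrative only)."""
    initial: str
    terminal: set
    # (machine state, observed symbol) -> (next machine state, reward)
    delta: dict = field(default_factory=dict)

    def step(self, u, symbol):
        # Pairs not listed in delta keep the machine state and give zero reward.
        return self.delta.get((u, symbol), (u, 0.0))


# Hypothetical task: observe symbol "a" first, then symbol "b".
rm = RewardMachine(
    initial="u0",
    terminal={"u_done"},
    delta={("u0", "a"): ("u1", 0.0), ("u1", "b"): ("u_done", 1.0)},
)


def run(machine, symbols):
    """Feed a sequence of observed symbols; return (final state, total reward)."""
    u, total = machine.initial, 0.0
    for s in symbols:
        if u in machine.terminal:
            break
        u, r = machine.step(u, s)
        total += r
    return u, total
```

Here the machine state records which part of the task has already been completed, so the same symbol "b" gives no reward before "a" has been seen and a reward afterwards. A learner could keep one value table per machine state, although that is a design choice assumed here rather than something the abstract specifies.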
Keyword |
(in Japanese) |
(See Japanese page) |
(in English) |
Reinforcement learning / Non-Markov decision process / Reward Machines |
Reference Info. |
IEICE Tech. Rep., vol. 121, no. 444, NLP2021-141, pp. 77-82, March 2022. |
Paper # |
NLP2021-141 |
Date of Issue |
2022-03-21 (MSS, NLP) |
ISSN |
Online edition: ISSN 2432-6380 |
Copyright and reproduction |
All rights are reserved and no part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Notwithstanding, instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. (License No.: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |
Conference Information |
Committee |
MSS NLP |
Conference Date |
2022-03-28 - 2022-03-29 |
Place (in Japanese) |
(See Japanese page) |
Place (in English) |
Online |
Topics (in Japanese) |
(See Japanese page) |
Topics (in English) |
MSS, NLP, Work In Progress (MSS only), etc. |
Paper Information |
Registration To |
NLP |
Conference Code |
2022-03-MSS-NLP |
Language |
Japanese |
Title (in Japanese) |
(See Japanese page) |
Sub Title (in Japanese) |
(See Japanese page) |
Title (in English) |
Relationship between Computational Performance and Task Difficulty of Reinforcement Learning Methods Using Reward Machines |
Sub Title (in English) |
* |
Keyword(1) |
Reinforcement learning |
Keyword(2) |
Non-Markov decision process |
Keyword(3) |
Reward Machines |
1st Author's Name |
Ryuji Watanabe |
1st Author's Affiliation |
The University of Tokyo (The Univ. of Tokyo) |
2nd Author's Name |
Gouhei Tanaka |
2nd Author's Affiliation |
The University of Tokyo (The Univ. of Tokyo) |
Speaker |
Author-1 |
Date Time |
2022-03-29 10:05:00 |
Presentation Time |
25 minutes |
Registration for |
NLP |
Paper # |
MSS2021-70, NLP2021-141 |
Volume (vol) |
vol.121 |
Number (no) |
no.443(MSS), no.444(NLP) |
Page |
pp.77-82 |
#Pages |
6 |
|