Presentation 2022-06-09
Improvement of Learning Performance by Using a Symmetric Constraint Condition in PPO
Naoki Iwaya, Hidehiro Nakano
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Deep Reinforcement Learning (DRL) is a class of algorithms that learn the optimal action from experience. PPO KL Penalty, one DRL method, suppresses large policy updates by means of a KL constraint, which helps prevent incorrect recognition and can shorten learning time. However, PPO KL Penalty is unstable because the KL divergence is asymmetric. This research aims to apply a symmetric constraint to increase learning stability and efficiency.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Deep Reinforcement Learning / Policy gradient method / PPO
Paper # NLP2022-3,CCS2022-3
Date of Issue 2022-06-02 (NLP, CCS)
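
The asymmetry of the KL divergence mentioned in the abstract can be illustrated with a small numerical sketch. The abstract does not say which symmetric constraint the authors use, so the symmetrized form below (the sum of the two KL directions, often called the Jeffreys divergence) is only an assumed stand-in, and the two example distributions and all function names are hypothetical.

```python
import math


def kl(p, q):
    """KL(p || q) for two discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Sum of both KL directions (Jeffreys divergence).

    Assumption: one possible symmetric measure, not necessarily the
    constraint used in the paper.
    """
    return kl(p, q) + kl(q, p)


# Hypothetical action probabilities before and after a policy update.
old_policy = [0.9, 0.1]
new_policy = [0.5, 0.5]

forward = kl(old_policy, new_policy)   # KL(old || new)
reverse = kl(new_policy, old_policy)   # KL(new || old)

print("KL(old || new):", forward)
print("KL(new || old):", reverse)
print("symmetric     :", symmetric_kl(old_policy, new_policy))
```

The two KL directions give different values for the same pair of policies, while the symmetrized measure is unchanged when its arguments are swapped.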

Conference Information
Committee CCS / NLP
Conference Date 2022/6/9(2days)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Megumi Akai(Hokkaido Univ.) / Akio Tsuneda(Kumamoto Univ.)
Vice Chair Masaki Aida(TMU) / Hidehiro Nakano(Tokyo City Univ.) / Hiroyuki Torikai(Hosei Univ.)
Secretary Masaki Aida(TDK) / Hidehiro Nakano(Shibaura Insti. of Tech.) / Hiroyuki Torikai(Sojo Univ.)
Assistant Tomoyuki Sasaki(Shonan Instit. of Tech.) / Hiroyasu Ando(Tsukuba Univ.) / Miki Kobayashi(Rissho Univ.) / Hiroyuki YASUDA(The Univ. of Tokyo) / Yuichi Yokoi(Nagasaki Univ.) / Yoshikazu Yamanaka(Utsunomiya Univ.)

Paper Information
Registration To Technical Committee on Complex Communication Sciences / Technical Committee on Nonlinear Problems
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Improvement of Learning Performance by Using a Symmetric Constraint Condition in PPO
Sub Title (in English)
Keyword(1) Deep Reinforcement Learning
Keyword(2) Policy gradient method
Keyword(3) PPO
1st Author's Name Naoki Iwaya
1st Author's Affiliation Tokyo City University(Tokyo City Univ.)
2nd Author's Name Hidehiro Nakano
2nd Author's Affiliation Tokyo City University(Tokyo City Univ.)
Date 2022-06-09
Paper # NLP2022-3,CCS2022-3
Volume (vol) vol.122
Number (no) NLP-65,CCS-66
Page pp.13-16(NLP), pp.13-16(CCS)
#Pages 4
Date of Issue 2022-06-02 (NLP, CCS)