Presentation 2024-01-18
Deep Reinforcement Learning Using LMM's Studying Papers and Intrinsic Rewards
Sota Nagano, Satoshi Yamane
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Research combining deep reinforcement learning with large language models (LLMs) has achieved high scores even in open-world games with complex tasks. However, LLMs cannot process images that show the appearance of the game, so the state of the environment must be described in natural language. We therefore propose LMMPaIR, a deep reinforcement learning method based on a large multimodal model (LMM) that can handle both images and language. We use LMMs to extract information useful for successful gameplay from the pictures and captions in a paper about the environment, and generate intrinsic rewards from this information. We are currently running experiments that combine LMMPaIR with the reinforcement learning algorithm PPO in the Crafter environment.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Machine Learning / Deep Reinforcement Learning / Large Language Model / Large Multimodal Model
Paper # MSS2023-64,SS2023-43
Date of Issue 2024-01-10 (MSS, SS)

Conference Information
Committee SS / MSS
Conference Date 2024/1/17 (2 days)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair Kozo Okano(Shinshu Univ.) / Shingo Yamaguchi(Yamaguchi Univ.)
Vice Chair Yoshiki Higo(Osaka Univ.) / Toshiyuki Miyamoto(Osaka Inst. of Tech.)
Secretary Yoshiki Higo(Shinshu Univ.) / Toshiyuki Miyamoto(Tokyo Inst. of Tech.)
Assistant Shinsuke Matsumoto(Osaka Univ.) / Masato Shirai(Shimane Univ.)

Paper Information
Registration To Technical Committee on Software Science / Technical Committee on Mathematical Systems Science and its Applications
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Deep Reinforcement Learning Using LMM's Studying Papers and Intrinsic Rewards
Sub Title (in English)
Keyword(1) Machine Learning
Keyword(2) Deep Reinforcement Learning
Keyword(3) Large Language Model
Keyword(4) Large Multimodal Model
1st Author's Name Sota Nagano
1st Author's Affiliation Kanazawa University(Kanazawa Univ.)
2nd Author's Name Satoshi Yamane
2nd Author's Affiliation Kanazawa University(Kanazawa Univ.)
Date 2024-01-18
Paper # MSS2023-64,SS2023-43
Volume (vol) vol.123
Number (no) MSS-334,SS-335
Page pp.70-75(MSS), pp.70-75(SS)
#Pages 6
Date of Issue 2024-01-10 (MSS, SS)