Deep Reinforcement Learning Using LMM's Studying Papers and Intrinsic Rewards

Presentation	2024-01-18 Deep Reinforcement Learning Using LMM's Studying Papers and Intrinsic Rewards Sota Nagano, Satoshi Yamane
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	Research combining deep reinforcement learning with large language models (LLMs) has achieved high scores even in open-world games with complex tasks. However, LLMs cannot process images showing the appearance of the game, so the state of the environment must be described in natural language. We therefore propose LMMPaIR, a deep reinforcement learning method based on a large multimodal model (LMM) that can handle both images and language. The LMM extracts information useful for successful gameplay from the images and captions in a paper about the environment, and intrinsic rewards are generated from this information. We are currently evaluating LMMPaIR in combination with the reinforcement learning algorithm PPO in the Crafter environment.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Machine Learning / Deep Reinforcement Learning / Large Language Model / Large Multimodal Model
Paper #	MSS2023-64,SS2023-43
Date of Issue	2024-01-10 (MSS, SS)

Conference Information
Committee	SS / MSS
Conference Date	2024/1/17 (2 days)
Place (in Japanese)	(See Japanese page)
Place (in English)
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair	Kozo Okano(Shinshu Univ.) / Shingo Yamaguchi(Yamaguchi Univ.)
Vice Chair	Yoshiki Higo(Osaka Univ.) / Toshiyuki Miyamoto(Osaka Inst. of Tech.)
Secretary	Yoshiki Higo(Shinshu Univ.) / Toshiyuki Miyamoto(Tokyo Inst. of Tech.)
Assistant	Shinsuke Matsumoto(Osaka Univ.) / Masato Shirai(Shimane Univ.)

Paper Information
Registration To	Technical Committee on Software Science / Technical Committee on Mathematical Systems Science and its Applications
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Deep Reinforcement Learning Using LMM's Studying Papers and Intrinsic Rewards
Sub Title (in English)
Keyword(1)	Machine Learning
Keyword(2)	Deep Reinforcement Learning
Keyword(3)	Large Language Model
Keyword(4)	Large Multimodal Model
1st Author's Name	Sota Nagano
1st Author's Affiliation	Kanazawa University(Kanazawa Univ.)
2nd Author's Name	Satoshi Yamane
2nd Author's Affiliation	Kanazawa University(Kanazawa Univ.)
Date	2024-01-18
Paper #	MSS2023-64,SS2023-43
Volume (vol)	vol.123
Number (no)	MSS-334,SS-335
Page	pp.70-75(MSS), pp.70-75(SS)
#Pages	6
Date of Issue	2024-01-10 (MSS, SS)