On the Typical Sequence in Reinforcement Learning (NC General Session (3)) (Recognition and Learning, Imitation Learning)

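Presentation title	2003/10/16 On the Typical Sequence in Reinforcement Learning (NC General Session (3)) (Recognition and Learning, Imitation Learning) Kazunori IWATA, Kazushi IKEDA, Hideaki SAKAI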
Abstract (Japanese)	本論では強化学習の経験系列において成り立つ漸近等分割性を示す.これは時間ステップ数が十分に大きければ,経験系列の典型集合がほぼ確率1で出現すること,典型集合中の経験系列がほぼ同じ確率で出現すること,典型集合の大きさが条件付きエントロピーの指数関数で与えられることを表している.この性質は強化学習の学習過程を解析するために大変役に立つ.
Abstract (English)	In this paper, we show that the asymptotic equipartition property holds for the empirical sequences in reinforcement learning. The property states that, if the number of time steps is sufficiently large, the typical set of empirical sequences has probability nearly one, all elements of the typical set are nearly equiprobable, and the size of the typical set is given by an exponential function of the conditional entropy. This property is very useful for analyzing the learning process in reinforcement learning.
Keywords (Japanese)	強化学習 / タイプ理論 / 典型系列 / 漸近等分割性
Keywords (English)	Reinforcement Learning / Type Theory / Typical Sequence / Asymptotic Equipartition Property
Report number	PRMU2003-123, NC2003-54
Date of issue

Paper Information Details
Technical committee	Neurocomputing (NC)
Paper language	JPN
Title (Japanese)	強化学習における典型系列について(NC一般セッション(3))(認識と学習,模倣学習)
Subtitle (Japanese)
Title (English)	On the Typical Sequence in Reinforcement Learning
Subtitle (English)
Keyword (1) (Japanese/English)	強化学習 / Reinforcement Learning
Keyword (2) (Japanese/English)	タイプ理論 / Type Theory
Keyword (3) (Japanese/English)	典型系列 / Typical Sequence
Keyword (4) (Japanese/English)	漸近等分割性 / Asymptotic Equipartition Property
First author name (Japanese/English)	岩田一貴 / Kazunori IWATA
First author affiliation (Japanese/English)	京都大学大学院情報学研究科システム科学専攻 / Department of Systems Science, Graduate School of Informatics, Kyoto University
Second author name (Japanese/English)	池田和司 / Kazushi IKEDA
Second author affiliation (Japanese/English)	京都大学大学院情報学研究科システム科学専攻 / Department of Systems Science, Graduate School of Informatics, Kyoto University
Third author name (Japanese/English)	酒井英昭 / Hideaki SAKAI
Third author affiliation (Japanese/English)	京都大学大学院情報学研究科システム科学専攻 / Department of Systems Science, Graduate School of Informatics, Kyoto University
Presentation date	2003/10/16
Report number	PRMU2003-123, NC2003-54
Volume (vol)	vol.103
Issue (no)	391
Page range	pp.-
Number of pages	6
Date of issue