On the Typical Sequence in Reinforcement Learning (NC General Session (3)) (Recognition and Learning, Imitation Learning)

Presentation	2003/10/16 On the Typical Sequence in Reinforcement Learning Kazunori IWATA, Kazushi IKEDA, Hideaki SAKAI
Abstract(in English)	In this paper, we show that the asymptotic equipartition property holds for empirical sequences in reinforcement learning. The property states that, if the number of time steps is sufficiently large, the typical set of empirical sequences has probability close to one, all elements of the typical set are nearly equiprobable, and the size of the typical set is approximately an exponential function of the conditional entropy. This property is useful for analyzing the reinforcement learning process.
Keyword(in English)	Reinforcement Learning / Type Theory / Typical Sequence / Asymptotic Equipartition Property
Paper #	PRMU2003-123,NC2003-54
Date of Issue
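
The asymptotic equipartition property described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' construction: it assumes a hypothetical two-state Markov chain (the matrix `P` and all function names are illustrative and do not come from the paper) in place of the empirical sequences of states, actions, and rewards that the paper studies. It checks that the per-step negative log-probability of a long sampled sequence is close to the conditional entropy, which is the sense in which typical sequences are nearly equiprobable.

```python
import math
import random

# Hypothetical 2-state Markov chain; the values are illustrative assumptions,
# not taken from the paper.
P = [[0.9, 0.1],
     [0.4, 0.6]]


def stationary(P):
    """Stationary distribution of a 2-state chain in closed form."""
    p01, p10 = P[0][1], P[1][0]
    return [p10 / (p01 + p10), p01 / (p01 + p10)]


def conditional_entropy(P):
    """H = -sum_i pi_i sum_j P_ij log P_ij, in nats."""
    pi = stationary(P)
    return -sum(
        pi[i] * P[i][j] * math.log(P[i][j])
        for i in range(2)
        for j in range(2)
        if P[i][j] > 0
    )


def sample_log_prob_rate(P, n, rng):
    """Sample n transitions starting from state 0 and return
    -(1/n) log p(x_1, ..., x_n | x_0) for the sampled sequence."""
    state = 0
    log_p = 0.0
    for _ in range(n):
        nxt = 0 if rng.random() < P[state][0] else 1
        log_p += math.log(P[state][nxt])
        state = nxt
    return -log_p / n


rng = random.Random(0)  # fixed seed so the run is reproducible
H = conditional_entropy(P)
rates = [sample_log_prob_rate(P, 20000, rng) for _ in range(5)]

print("conditional entropy (nats):", H)
for r in rates:
    print("-(1/n) log p of a sampled sequence:", r)
```

Under these assumptions, each sampled sequence has probability roughly exp(-nH), so about exp(nH) such sequences account for almost all of the probability mass. This matches the three statements in the abstract for this simplified setting only.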

Paper Information
Registration To	Neurocomputing (NC)
Language	JPN
Title (in English)	On the Typical Sequence in Reinforcement Learning
Sub Title (in English)
Keyword(1)	Reinforcement Learning
Keyword(2)	Type Theory
Keyword(3)	Typical Sequence
Keyword(4)	Asymptotic Equipartition Property
1st Author's Name	Kazunori IWATA
1st Author's Affiliation	Department of Systems Science, Graduate School of Informatics, Kyoto University
2nd Author's Name	Kazushi IKEDA
2nd Author's Affiliation	Department of Systems Science, Graduate School of Informatics, Kyoto University
3rd Author's Name	Hideaki SAKAI
3rd Author's Affiliation	Department of Systems Science, Graduate School of Informatics, Kyoto University
Date	2003/10/16
Volume (vol)	vol.103
Number (no)	391
Page	pp.-
#Pages	6