Presentation 2003/10/16
On the Typical Sequence in Reinforcement Learning
Kazunori IWATA, Kazushi IKEDA, Hideaki SAKAI
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In this paper, we show that the asymptotic equipartition property holds for the empirical sequences in reinforcement learning. The property states that, if the number of time steps is sufficiently large, the typical set of empirical sequences has probability close to one, all elements of the typical set are nearly equiprobable, and the size of the typical set is an exponential function of the conditional entropy. This property is useful for analyzing the reinforcement learning process.
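The three claims in the abstract can be illustrated numerically on a much simpler source. The sketch below is not the paper's setting: it assumes an i.i.d. Bernoulli(p) source and uses the plain Shannon entropy in bits, whereas the paper treats empirical sequences in reinforcement learning and the conditional entropy. The function names `entropy` and `typical_set_stats` and the parameters `p`, `n`, and `eps` are hypothetical, chosen only for this illustration.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a Bernoulli(p) source (assumed toy source)."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def typical_set_stats(p, n, eps):
    """Return (probability, size) of the eps-typical set of length-n sequences.

    A sequence is counted as typical when its per-symbol log-probability
    (in bits) is within eps of the source entropy.
    """
    h = entropy(p)
    prob = 0.0
    size = 0
    for k in range(n + 1):  # k = number of ones in the sequence
        # All sequences with k ones share the same probability.
        rate = -(k * math.log2(p) + (n - k) * math.log2(1 - p)) / n
        if abs(rate - h) <= eps:
            count = math.comb(n, k)  # number of sequences with k ones
            size += count
            # Accumulate in log space to avoid overflow for large n.
            log_mass = math.log(count) + k * math.log(p) + (n - k) * math.log(1 - p)
            prob += math.exp(log_mass)
    return prob, size

if __name__ == "__main__":
    for n in (10, 100, 1000):
        prob, size = typical_set_stats(0.3, n, 0.1)
        print(n, prob, math.log2(size) / n, entropy(0.3))
```

Under these assumptions, the probability of the typical set approaches one as n grows, while the per-symbol log-size of the set stays within eps of the entropy, which is the i.i.d. analogue of the statement in the abstract.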
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Reinforcement Learning / Type Theory / Typical Sequence / Asymptotic Equipartition Property
Paper # PRMU2003-123,NC2003-54
Date of Issue

Conference Information
Committee NC
Conference Date 2003/10/16 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Neurocomputing (NC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) On the Typical Sequence in Reinforcement Learning
Sub Title (in English)
Keyword(1) Reinforcement Learning
Keyword(2) Type Theory
Keyword(3) Typical Sequence
Keyword(4) Asymptotic Equipartition Property
1st Author's Name Kazunori IWATA
1st Author's Affiliation Department of Systems Science, Graduate School of Informatics, Kyoto University
2nd Author's Name Kazushi IKEDA
2nd Author's Affiliation Department of Systems Science, Graduate School of Informatics, Kyoto University
3rd Author's Name Hideaki SAKAI
3rd Author's Affiliation Department of Systems Science, Graduate School of Informatics, Kyoto University
Date 2003/10/16
Paper # PRMU2003-123,NC2003-54
Volume (vol) vol.103
Number (no) 391
Page pp.-
#Pages 6
Date of Issue