Presentation 2003/7/22
Lempel-Ziv Coding for Measuring Complexity in Reinforcement Learning
Kazunori IWATA, Kazushi IKEDA, Hideaki SAKAI
Abstract(in Japanese) (See Japanese page)
Abstract(in English) We describe Markov decision processes as a representation of Markov information sources so that reinforcement learning processes can be treated uniformly. We then give an information-theoretic analysis of how the domain size and the complexity affect the learning process, and offer a guideline for recognizing the probabilistic structure of Markov decision processes as early as possible. Experimental results confirm that the early stages of the learning process are characterized mainly by the domain size, and that as the number of steps increases the learning process depends heavily on the stochastic complexity of the Markov decision process.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Reinforcement Learning / Markov Decision Process / Lempel-Ziv Coding / Domain Size / Stochastic Complexity
Paper # NC2003-43
Date of Issue

Conference Information
Committee NC
Conference Date 2003/7/22 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Neurocomputing (NC)
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Lempel-Ziv Coding for Measuring Complexity in Reinforcement Learning
Sub Title (in English)
Keyword(1) Reinforcement Learning
Keyword(2) Markov Decision Process
Keyword(3) Lempel-Ziv Coding
Keyword(4) Domain Size
Keyword(5) Stochastic Complexity
1st Author's Name Kazunori IWATA
1st Author's Affiliation Department of System Science, Graduate School of Informatics, Kyoto University
2nd Author's Name Kazushi IKEDA
2nd Author's Affiliation Department of System Science, Graduate School of Informatics, Kyoto University
3rd Author's Name Hideaki SAKAI
3rd Author's Affiliation Department of System Science, Graduate School of Informatics, Kyoto University
Date 2003/7/22
Paper # NC2003-43
Volume (vol) vol.103
Number (no) 228
Page pp.-
#Pages 6
Date of Issue