IEICE Technical Committee Submission System
Conference Paper's Information

Paper Abstract and Keywords
Presentation 2007-03-14 15:30
A Study of Policy-Gradient Methods in Non-Markov Decision Processes -- Application to a Curling Game --
Harukazu Igarashi (Shibaura Inst. Tech.), Seiji Ishihara (Kinki Univ.), Masaomi Kimura (Shibaura Inst. Tech.)
Abstract (in Japanese) (See Japanese page) 
(in English) There are two approaches to reinforcement learning: value-based methods and policy-gradient methods. Baird and Moore proposed the VAPS algorithm to unify these two approaches. In a previous paper, we gave a simple proof that the learning rule of the VAPS algorithm can be extended to non-Markov decision processes, and we clarified statistical properties of the correlation between characteristic eligibility functions. In this paper, we investigate an inverse problem in a curling game and show that it can be formulated as a learning problem in non-Markov decision processes.
Keyword (in Japanese) (See Japanese page) 
(in English) Reinforcement learning / Non-Markov decision process / Policy-gradient method
Reference Info. IEICE Tech. Rep., vol. 106, no. 588, NC2006-148, pp. 179-184, March 2007.
Paper # NC2006-148 
Date of Issue 2007-03-07 (NC) 
ISSN Print edition: ISSN 0913-5685

Conference Information
Committee NC  
Conference Date 2007-03-14 - 2007-03-16 
Place (in Japanese) (See Japanese page) 
Place (in English) Tamagawa University 
Topics (in Japanese) (See Japanese page) 
Topics (in English) General 
Paper Information
Registration To NC 
Conference Code 2007-03-NC 
Language Japanese 
Title (in Japanese) (See Japanese page) 
Sub Title (in Japanese) (See Japanese page) 
Title (in English) A Study of Policy-Gradient Methods in Non-Markov Decision Processes 
Sub Title (in English) Application to a Curling Game 
Keyword(1) Reinforcement learning  
Keyword(2) Non-Markov decision process  
Keyword(3) Policy-gradient method  
1st Author's Name Harukazu Igarashi  
1st Author's Affiliation Shibaura Institute of Technology (Shibaura Inst. Tech.)
2nd Author's Name Seiji Ishihara  
2nd Author's Affiliation Kinki University (Kinki Univ.)
3rd Author's Name Masaomi Kimura  
3rd Author's Affiliation Shibaura Institute of Technology (Shibaura Inst. Tech.)
Speaker Author-1 
Date Time 2007-03-14 15:30:00 
Presentation Time 20 minutes 
Registration for NC 
Volume (vol) vol.106 
Number (no) no.588 
Page pp.179-184 


The Institute of Electronics, Information and Communication Engineers (IEICE), Japan