最終段ミニマックスアルゴリズム

瀧本 英二 /

講演名	2003/11/18 最終段ミニマックスアルゴリズム瀧本英二 /,
PDFダウンロードページ	PDFダウンロードページへ
抄録(和)	指数分布族のオンライン型の分布パラメータ推定問題について考える.各時刻t=1, 2, ...., Tにおいて,学習者は分布パラメータθ_tを予測し,次いで環境から事例x_1を受け取る.このとき学習者は,予測した分布の下でのx_1の負の対数尤度-lnp(x_t\|θ_t)の損失を被る.学習者の性能は,学習者の損失の総和とオフラインの意味での最適な分布パラメータが被る損失の総和との差,すなわちリグレットによって評価する.本稿では,現在の時刻が最終ラウンドであると仮定したときの最適なパラメータ(ミニマックス解)を予測するアルゴリズムを提案する.このアルゴリズムを,最終段ミニマックスアルゴリズムと呼ぶ.1次元指数分布族に対し,最終段ミニマックスアルゴリズムの与える予測パラメータを明示的に示し,さらに,そのリグレットがO(lnT)となることを示す.ここに,Tは総ラウンド数である.特に,ベルヌーイ分布推定問題については,最終段ミニマックスアルゴリズムは,標準的なKrichevsky-Trofimov推定量よりも若干優れていることを示す.
抄録(英)	We consider on-line denxity estimation with a parameterized density from an exponential family. In each trial t the learner predicts a parameter θ_t. Then it receives an instance x_t chosen by the adversary and incurs loss-lnp(x_t\|θ_t) which is the negative log-likelihood af x_t w. r. t. the predicetd density of the learner. The performance of the learner is measured by the regret defined as the total loss of the learner minus the total loss of the best parameter chosen off-line. We develop an algorithm called the Last-step Minimax Algorithm that predicts with the minimax optimal parameter assuming that the current trial is the last one. For one-dimensional exponential families, we give an explicit form of the prediction of the Last-step Minimax Algorithm and show that its regret is O(lnT), where T is the number of trials. In particular, for Bernoulli density estimation the Last-step Minimax Algorithm is slightly better than the standard Krichevsky-Trofimov probability estimator.
キーワード(和)	オンライン分布推定 / オンライン予測 / 指数分布族 / リグレット / ミニマックス戦略
キーワード(英)	On-line density estimation / On-line prediction / Exponential family / Regret / Minimax strategy
資料番号	COMP2003-59
発行日

研究会情報
研究会	COMP
開催期間	2003/11/18(から1日開催)
開催地（和）
開催地（英）
テーマ（和）
テーマ（英）
委員長氏名（和）
委員長氏名（英）
副委員長氏名（和）
副委員長氏名（英）
幹事氏名（和）
幹事氏名（英）
幹事補佐氏名（和）
幹事補佐氏名（英）

講演論文情報詳細
申込み研究会	Theoretical Foundations of Computing (COMP)
本文の言語	ENG
タイトル（和）	最終段ミニマックスアルゴリズム
サブタイトル（和）
タイトル（英）	The Last-Step Minimax Algorithm
サブタイトル（和）
キーワード(1)（和/英）	オンライン分布推定 / On-line density estimation
キーワード(2)（和/英）	オンライン予測 / On-line prediction
キーワード(3)（和/英）	指数分布族 / Exponential family
キーワード(4)（和/英）	リグレット / Regret
キーワード(5)（和/英）	ミニマックス戦略 / Minimax strategy
第 1 著者氏名（和/英）	瀧本英二 / / Eiji TAKIMOTO
第 1 著者所属（和/英）	東北大学大学院情報科学研究科 / GSIS, Tohoku University
発表年月日	2003/11/18
資料番号	COMP2003-59
巻番号（vol）	vol.103
号番号（no）	468
ページ範囲	pp.-
ページ数	6
発行日