Presentation | On the Error Probability of Model Selection for Classification, Joe Suzuki (1997/7/18) |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | We consider model selection based on information criteria for classification. In classification, a class y is guessed from an attribute x based on the true conditional probability P(y|x), where x∈X, y∈Y, and X and Y are infinite and finite sets, respectively. In model selection, given examples, we select the model that minimizes an information criterion. The information criteria we address in this paper take the form of the empirical entropy plus a compensation term (k(g)/2)d(n), where k(g) is the number of independent parameters in a model g, d(n) is the function of n that characterizes the information criterion, and n is the number of examples. We derive, for arbitrary d(・), the asymptotically exact error probability of model selection. Although it was known for autoregressive processes that d(n)=log log n is the minimum function of n such that model selection satisfies strong consistency, whether the same holds for classification had been an open problem. We solve this problem in the affirmative. Additionally, for the d(・) that satisfy weak consistency, we derive the expected Kullback-Leibler divergence between the true conditional probability P(y|x) and the conditional probability P̂(y|x) estimated by the model selection and a parameter estimator. The derived value is k(g^*)/(2n), where g^* is the true model, and the value accumulated over n time instances is (k(g^*)/2)log n+O(1), which implies the optimality of a predictive coding scheme based on the model selection. |
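The criterion described in the abstract (empirical entropy plus a compensation term (k(g)/2)d(n)) can be sketched in Python as follows. This is an illustrative sketch, not the paper's implementation: here a model g is a subset of binary attributes used to predict y, k(g) counts the free conditional-probability parameters, and d(n)=log n is the MDL-type choice of penalty; the data-generating setup is invented for the example.

```python
import math
import random
from collections import Counter
from itertools import combinations

def empirical_cond_entropy(samples, attrs):
    """Empirical conditional entropy H_n(y | x restricted to attrs), in nats."""
    n = len(samples)
    joint = Counter((tuple(x[i] for i in attrs), y) for x, y in samples)
    marg = Counter(tuple(x[i] for i in attrs) for x, _ in samples)
    return -sum((c / n) * math.log(c / marg[ctx]) for (ctx, _y), c in joint.items())

def select_model(samples, num_attrs, y_card, d):
    """Pick the attribute subset g minimizing n*H_n + (k(g)/2)*d(n)."""
    n = len(samples)
    best_score, best_attrs = None, None
    for r in range(num_attrs + 1):
        for attrs in combinations(range(num_attrs), r):
            # k(g): (|Y|-1) free parameters per context; binary attributes assumed
            k = (y_card - 1) * (2 ** len(attrs))
            score = n * empirical_cond_entropy(samples, attrs) + (k / 2) * d(n)
            if best_score is None or score < best_score:
                best_score, best_attrs = score, attrs
    return best_attrs

# Illustrative data: y depends only on attribute 0, flipped with probability 0.1.
random.seed(1)
samples = []
for _ in range(2000):
    x = tuple(random.randint(0, 1) for _ in range(3))
    y = x[0] ^ (random.random() < 0.1)
    samples.append((x, y))

chosen = select_model(samples, num_attrs=3, y_card=2, d=math.log)  # d(n) = log n
```

With this choice of d(・), the attribute that actually governs y should be recovered for moderate n; substituting a slower-growing d(・), such as one proportional to log log n, gives the lighter Hannan-and-Quinn-style penalty discussed in the abstract.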
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | model selection / error probability / weak/strong consistency / Kullback-Leibler divergence / minimum description length principle / Hannan and Quinn's procedure / unseparated/separated models / Kolmogorov's law of the iterated logarithm |
Paper # | IT97-22 |
Date of Issue |
Conference Information | |
Committee | IT |
---|---|
Conference Date | 1997/7/18 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Information Theory (IT) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | On the Error Probability of Model Selection for Classification |
Sub Title (in English) | |
Keyword(1) | model selection |
Keyword(2) | error probability |
Keyword(3) | weak/strong consistency |
Keyword(4) | Kullback-Leibler divergence |
Keyword(5) | minimum description length principle |
Keyword(6) | Hannan and Quinn's procedure |
Keyword(7) | unseparated/separated models |
Keyword(8) | Kolmogorov's law of the iterated logarithm |
1st Author's Name | Joe Suzuki |
1st Author's Affiliation | Department of Mathematics, Osaka University |
Date | 1997/7/18 |
Paper # | IT97-22 |
Volume (vol) | vol.97 |
Number (no) | 180 |
Page | pp.- |
#Pages | 6 |
Date of Issue |