Presentation | On the Error Probability of Model Selection for Classification, Joe Suzuki (1997/7/18) |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | We consider model selection based on information criteria for classification. In classification, a class y is guessed from an attribute x based on the true conditional probability P(y|x), where x∈X, y∈Y, and X and Y are infinite and finite sets, respectively. In model selection, given examples, we select the model that minimizes an information criterion. The information criteria we address in this paper take the form of the empirical entropy plus a compensation term (k(g)/2)d(n), where k(g) is the number of independent parameters in a model g, d(n) is the function of n that characterizes the information criterion, and n is the number of examples. We derive, for arbitrary d(・), the asymptotically exact error probability of model selection. Although it was known for autoregressive processes that d(n)=log log n is the minimum function of n such that model selection satisfies strong consistency, whether the same holds for classification had been an open problem. We solve this problem in the affirmative. Additionally, for the d(・) that satisfy weak consistency, we derive the expected Kullback-Leibler divergence between the true conditional probability P(y|x) and the conditional probability P̂(y|x) estimated by the model selection and a parameter estimator. The derived value is k(g^*)/(2n), where g^* is the true model, and the value accumulated over n time instances is (k(g^*)/2)log n+O(1), which implies the optimality of a predictive coding scheme based on the model selection. |
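The criterion described in the abstract (empirical entropy plus a compensation term (k(g)/2)d(n)) can be sketched in Python as follows. This is an illustrative sketch, not the paper's implementation: here a model g is a subset of binary attributes used to predict y, k(g) counts the free conditional-probability parameters, and d(n)=log n is the MDL-type choice of penalty; the data-generating setup is invented for the example.

```python
import math
import random
from collections import Counter
from itertools import combinations

def empirical_cond_entropy(samples, attrs):
    """Empirical conditional entropy H_n(y | x restricted to attrs), in nats."""
    n = len(samples)
    joint = Counter((tuple(x[i] for i in attrs), y) for x, y in samples)
    marg = Counter(tuple(x[i] for i in attrs) for x, _ in samples)
    return -sum((c / n) * math.log(c / marg[ctx]) for (ctx, _y), c in joint.items())

def select_model(samples, num_attrs, y_card, d):
    """Pick the attribute subset g minimizing n*H_n + (k(g)/2)*d(n)."""
    n = len(samples)
    best_score, best_attrs = None, None
    for r in range(num_attrs + 1):
        for attrs in combinations(range(num_attrs), r):
            # k(g): (|Y|-1) free parameters per context; binary attributes assumed
            k = (y_card - 1) * (2 ** len(attrs))
            score = n * empirical_cond_entropy(samples, attrs) + (k / 2) * d(n)
            if best_score is None or score < best_score:
                best_score, best_attrs = score, attrs
    return best_attrs

# Illustrative data: y depends only on attribute 0, flipped with probability 0.1.
random.seed(1)
samples = []
for _ in range(2000):
    x = tuple(random.randint(0, 1) for _ in range(3))
    y = x[0] ^ (random.random() < 0.1)
    samples.append((x, y))

chosen = select_model(samples, num_attrs=3, y_card=2, d=math.log)  # d(n) = log n
```

With this choice of d(・), the attribute that actually governs y should be recovered for moderate n; substituting a slower-growing d(・), such as one proportional to log log n, gives the lighter Hannan-and-Quinn-style penalty discussed in the abstract.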
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | model selection / error probability / weak/strong consistency / Kullback-Leibler divergence / minimum description length principle / Hannan and Quinn's procedure / unseparated/separated models / Kolmogorov's law of the iterated logarithm |
Paper # | IT97-22 |
Date of Issue |
Conference Information | |
Committee | IT |
---|---|
Conference Date | 1997/7/18 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Information Theory (IT) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | On the Error Probability of Model Selection for Classification |
Sub Title (in English) | |
Keyword(1) | model selection |
Keyword(2) | error probability |
Keyword(3) | weak/strong consistency |
Keyword(4) | Kullback-Leibler divergence |
Keyword(5) | minimum description length principle |
Keyword(6) | Hannan and Quinn's procedure |
Keyword(7) | unseparated/separated models |
Keyword(8) | Kolmogorov's law of the iterated logarithm |
1st Author's Name | Joe Suzuki |
1st Author's Affiliation | Department of Mathematics, Osaka University |
Date | 1997/7/18 |
Paper # | IT97-22 |
Volume (vol) | vol.97 |
Number (no) | 180 |
Page | pp.- |
#Pages | 6 |
Date of Issue |