Presentation 2005/3/23
Generalization Error Estimation When Training and Test Input Points Follow Different Probability Distributions
Masashi SUGIYAMA, Klaus Robert MULLER
Abstract(in Japanese) (See Japanese page)
Abstract(in English) A common assumption in supervised learning is that the training and test input points follow the same probability distribution. However, this assumption is not fulfilled, e.g., in interpolation, extrapolation, or active learning scenarios. The violation of this assumption, known as covariate shift, causes a heavy bias in standard generalization error estimation schemes such as cross-validation, which in turn results in poor model selection. In this paper, we therefore propose an alternative estimator of the generalization error. Under covariate shift, the proposed estimator is exactly unbiased with finite samples if the learning target function is in the model at hand, and it is asymptotically unbiased in general. We experimentally show that model selection with the proposed generalization error estimator compares favorably to cross-validation in extrapolation.
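The effect described in the abstract can be illustrated with a minimal sketch. It does not implement the estimator proposed in the paper; it only shows how an error measured on the training input distribution can understate the error on a shifted test input distribution. The target function (`np.sinc`), the two Gaussian input distributions, the noise level, and the helper names `fit_linear` and `mean_squared_error` are all assumptions chosen for illustration.

```python
import numpy as np


def fit_linear(x, y):
    """Ordinary least squares for the linear model y ~ a + b * x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def mean_squared_error(coef, x, target):
    """Mean squared difference between the fitted line and the target values."""
    prediction = coef[0] + coef[1] * x
    return float(np.mean((prediction - target) ** 2))


rng = np.random.default_rng(0)
f = np.sinc  # assumed target function: sin(pi x) / (pi x)

# Training inputs are drawn around x = 1 (assumed distribution).
x_train = rng.normal(1.0, 0.5, size=200)
y_train = f(x_train) + rng.normal(0.0, 0.25, size=200)
coef = fit_linear(x_train, y_train)

# Fresh inputs from the training distribution versus a shifted
# test distribution around x = 2 (extrapolation).
x_same = rng.normal(1.0, 0.5, size=10_000)
x_shift = rng.normal(2.0, 0.25, size=10_000)

# Errors are measured against the noise-free target function.
err_same = mean_squared_error(coef, x_same, f(x_same))
err_shift = mean_squared_error(coef, x_shift, f(x_shift))

print(f"error on training input distribution: {err_same:.3f}")
print(f"error on shifted test distribution:   {err_shift:.3f}")
```

Because the linear model cannot represent the assumed target, the fit adapts to the region where training inputs are dense, so an error estimate based only on that region says little about the error in the extrapolation region.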
Keyword(in Japanese) (See Japanese page)
Keyword(in English) linear regression / generalization error / model selection / covariate shift / sample selection bias / interpolation / extrapolation / active learning
Paper # NC2004-215
Date of Issue

Conference Information
Committee NC
Conference Date 2005/3/23 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Neurocomputing (NC)
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Generalization Error Estimation When Training and Test Input Points Follow Different Probability Distributions
Sub Title (in English)
Keyword(1) linear regression
Keyword(2) generalization error
Keyword(3) model selection
Keyword(4) covariate shift
Keyword(5) sample selection bias
Keyword(6) interpolation
Keyword(7) extrapolation
Keyword(8) active learning
1st Author's Name Masashi SUGIYAMA
1st Author's Affiliation Department of Computer Science, Tokyo Institute of Technology
2nd Author's Name Klaus Robert MULLER
2nd Author's Affiliation Fraunhofer FIRST.IDA / Department of Computer Science, University of Potsdam
Date 2005/3/23
Paper # NC2004-215
Volume (vol) vol.104
Number (no) 760
Page pp.-
#Pages 6
Date of Issue