Presentation 2014-11-18
Learning from Positive and Unlabeled Data 2 : Computationally Efficient Estimation of Class Priors
Marthinus Christoffel DU PLESSIS, Gang NIU, Masashi SUGIYAMA
Abstract (in Japanese) (See Japanese page)
Abstract (in English) We consider the problem of estimating the class prior in an unlabeled dataset. Under the assumption that an additional labeled dataset is available, the class prior can be estimated by fitting a mixture of class-wise data distributions to the unlabeled data distribution. However, in practice, such an additional labeled dataset is often not available. In this paper, we show that, with additional samples coming only from the positive class, the class prior of the unlabeled dataset can be estimated correctly. Our key idea is to use properly penalized divergences for model fitting to cancel the error caused by the absence of negative samples. We further show that the use of the penalized L_1-distance gives a computationally efficient algorithm with an analytic solution, and establish its uniform deviation bound and estimation error bound. Finally, we experimentally demonstrate the usefulness of the proposed method.
Keyword (in Japanese) (See Japanese page)
Keyword (in English) Learning from positive and unlabeled data / class-prior estimation / divergence matching
Paper # IBISML2014-66
Date of Issue

Conference Information
Committee IBISML
Conference Date 2014/11/10 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Information-Based Induction Sciences and Machine Learning (IBISML)
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Learning from Positive and Unlabeled Data 2 : Computationally Efficient Estimation of Class Priors
Sub Title (in English)
Keyword(1) Learning from positive and unlabeled data
Keyword(2) class-prior estimation
Keyword(3) divergence matching
1st Author's Name Marthinus Christoffel DU PLESSIS
1st Author's Affiliation Department of Complexity Science and Engineering, University of Tokyo
2nd Author's Name Gang NIU
2nd Author's Affiliation Baidu Inc.
3rd Author's Name Masashi SUGIYAMA
3rd Author's Affiliation Department of Complexity Science and Engineering, University of Tokyo
Date 2014-11-18
Paper # IBISML2014-66
Volume (vol) vol.114
Number (no) 306
Page pp.-
#Pages 7
Date of Issue