Presentation 2017-03-20
Selection of Near-Boundary Data for Semi-Supervised Learning
Ryohei Tanaka, Xiao Ding, Soichiro Ono, Akio Furuhata
Abstract(in English) Semi-supervised learning (SSL) is a technique that uses unlabeled data in addition to labeled data to improve learning accuracy. The most fundamental and widely applicable form of SSL is self-training, which produces additional labeled data from unlabeled data using the predictions of a classifier trained on the existing labeled data. Conventionally, additional labels are assigned only to unlabeled data whose prediction confidence exceeds a threshold fixed in advance by the designer. On the other hand, training data near the decision boundary are known to play a crucial role in classification performance. In this paper, we bring this knowledge into self-training by selectively labeling low-confidence unlabeled data near the decision boundary. We also propose a novel method using optimal region accumulation, which automatically optimizes the labeling threshold to accumulate data near the boundary.
Keyword(in English) Semi-supervised learning / self-training / subspace method / handwritten digits recognition
Paper # BioX2016-33,PRMU2016-196
Date of Issue 2017-03-13 (BioX, PRMU)
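
The self-training loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the paper uses a subspace method and optimal region accumulation, neither of which is reproduced here. The nearest-centroid classifier, the margin-based confidence score, the fixed cut-offs 0.8 and 0.3, the toy 2-D points, and all function names are assumptions chosen only to contrast conventional high-confidence selection with near-boundary selection.

```python
import math

def centroids(labeled):
    """Mean point per class, computed from ((x, y), label) pairs."""
    sums, counts = {}, {}
    for (x, y), label in labeled:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {c: (sums[c][0] / counts[c], sums[c][1] / counts[c]) for c in sums}

def predict(cents, point):
    """Return (label, confidence) for a point; requires at least two classes.

    Confidence is a hypothetical margin score in [0, 1]: 0 when the point is
    equidistant from the two nearest centroids (on the decision boundary),
    1 when it coincides with the nearest centroid.
    """
    dists = sorted((math.dist(point, c), label) for label, c in cents.items())
    (d1, label), (d2, _) = dists[0], dists[1]
    conf = (d2 - d1) / (d2 + d1) if d2 + d1 > 0 else 0.0
    return label, conf

def self_train(labeled, unlabeled, select, rounds=5):
    """Generic self-training loop.

    `select(conf)` decides which predictions are turned into new labels.
    Returns the grown labeled set and the points that stayed unlabeled.
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        cents = centroids(labeled)  # retrain on the current labeled set
        chosen = []
        for p in pool:
            label, conf = predict(cents, p)
            if select(conf):
                chosen.append((p, label))
        if not chosen:
            break  # nothing left that satisfies the selection rule
        labeled.extend(chosen)
        taken = {p for p, _ in chosen}
        pool = [p for p in pool if p not in taken]
    return labeled, pool

# Toy data (assumed): one labeled point per class, four unlabeled points.
seed = [((0.0, 0.0), 0), ((4.0, 0.0), 1)]
unlabeled = [(0.2, 0.1), (3.9, -0.1), (1.8, 0.0), (2.3, 0.0)]

# Conventional rule: label only high-confidence predictions.
conf_labeled, conf_pool = self_train(seed, unlabeled, lambda c: c >= 0.8)

# Near-boundary rule: label only low-confidence predictions.
near_labeled, near_pool = self_train(seed, unlabeled, lambda c: c <= 0.3)
```

With these toy points, the conventional rule labels the two points that sit close to the seed examples and leaves the two mid-range points unlabeled, while the near-boundary rule does the opposite. The fixed cut-offs stand in for the threshold that the paper proposes to optimize automatically.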

Conference Information
Committee PRMU / BioX
Conference Date 2017/3/20(2days)
Chair Eisaku Maeda(NTT) / Masakatsu Nishigaki(Shizuoka Univ.)
Vice Chair Seiichi Uchida(Kyushu Univ.) / Hironobu Fujiyoshi(Chubu Univ.) / Akira Otsuka(AIST) / Hiroshi Takano(Toyama Pref. Univ.)
Secretary Seiichi Uchida(Kyoto Univ.) / Hironobu Fujiyoshi(NTT) / Akira Otsuka(NEC) / Hiroshi Takano(AIST)
Assistant Masaki Oonishi(AIST) / Takuya Funatomi(NAIST) / Masatsugu Ichino(Univ. of Electro-Comm.) / Naoyuki Takada(Secom) / Takahiro Aoki(Fujitsu Labs.)

Paper Information
Registration To Technical Committee on Pattern Recognition and Media Understanding / Technical Committee on Biometrics
Language JPN
Title (in English) Selection of Near-Boundary Data for Semi-Supervised Learning
Keyword(1) Semi-supervised learning
Keyword(2) self-training
Keyword(3) subspace method
Keyword(4) handwritten digits recognition
1st Author's Name Ryohei Tanaka
1st Author's Affiliation Toshiba Corporation(Toshiba)
2nd Author's Name Xiao Ding
2nd Author's Affiliation Toshiba Corporation(Toshiba)
3rd Author's Name Soichiro Ono
3rd Author's Affiliation Toshiba Corporation(Toshiba)
4th Author's Name Akio Furuhata
4th Author's Affiliation Toshiba Corporation(Toshiba)
Date 2017-03-20
Paper # BioX2016-33,PRMU2016-196
Volume (vol) vol.116
Number (no) BioX-527,PRMU-528
Page pp.1-6(BioX), pp.1-6(PRMU)
#Pages 6
Date of Issue 2017-03-13 (BioX, PRMU)