Presentation | 2004/10/12 Text Classification Based on Ensemble Learning of Document Component Models Akinori FUJINO, Naonori UEDA, Kazumi SAITO |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | For multiclass text classification, we propose a new method that considers document components including the title, abstract, main content, references, and links. First, a naive Bayes classifier is designed for each document component, with smoothing parameters optimally trained by a leave-one-out cross-validation scheme to improve generalization performance. Then, based on the maximum entropy principle, a unified classifier is constructed by effectively combining these component classifiers. Through text classification experiments using three sets of real data, we have confirmed the usefulness of the proposed method. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | text classification / ensemble learning / naive Bayes model / maximum entropy principle |
Paper # | NC2004-80 |
Date of Issue |
Conference Information | |
Committee | NC |
---|---|
Conference Date | 2004/10/12 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Neurocomputing (NC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Text Classification Based on Ensemble Learning of Document Component Models |
Sub Title (in English) | |
Keyword(1) | text classification |
Keyword(2) | ensemble learning |
Keyword(3) | naive Bayes model |
Keyword(4) | maximum entropy principle |
1st Author's Name | Akinori FUJINO |
1st Author's Affiliation | NTT Communication Science Laboratories, NTT Corporation |
2nd Author's Name | Naonori UEDA |
2nd Author's Affiliation | NTT Communication Science Laboratories, NTT Corporation |
3rd Author's Name | Kazumi SAITO |
3rd Author's Affiliation | NTT Communication Science Laboratories, NTT Corporation |
Date | 2004/10/12 |
Paper # | NC2004-80 |
Volume (vol) | vol.104 |
Number (no) | 349 |
Page | pp.- |
#Pages | 6 |
Date of Issue |
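
The two-stage idea described in the abstract (one naive Bayes classifier per document component, then a combined classifier) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy data, the fixed smoothing value `alpha`, and the fixed combination weights are all assumptions made for illustration. In the paper, the smoothing parameters are trained by leave-one-out cross-validation and the combination is derived from the maximum entropy principle; here both are replaced by hand-set constants, and the combination is a plain weighted sum of per-component log-posteriors.

```python
import math
from collections import Counter

def train_nb(docs, labels, alpha=1.0):
    """Multinomial naive Bayes for one component.
    alpha is a fixed smoothing constant (assumption; not tuned by cross-validation)."""
    classes = sorted(set(labels))
    vocab = {w for d in docs for w in d}
    counts = {c: Counter() for c in classes}
    for words, y in zip(docs, labels):
        counts[y].update(words)
    class_freq = Counter(labels)
    model = {}
    for c in classes:
        total = sum(counts[c].values()) + alpha * len(vocab)
        model[c] = {
            "log_prior": math.log(class_freq[c] / len(labels)),
            "log_word": {w: math.log((counts[c][w] + alpha) / total) for w in vocab},
        }
    return model

def log_posterior(model, words):
    """Normalized log P(class | words) for one component; unseen words are skipped."""
    scores = {
        c: m["log_prior"] + sum(m["log_word"][w] for w in words if w in m["log_word"])
        for c, m in model.items()
    }
    top = max(scores.values())
    log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {c: s - log_z for c, s in scores.items()}

def train_components(examples):
    """Train one naive Bayes model per component name (e.g. 'title', 'body')."""
    labels = [y for _, y in examples]
    names = examples[0][0].keys()
    return {n: train_nb([doc[n] for doc, _ in examples], labels) for n in names}

def predict(models, weights, doc):
    """Combine components with a weighted sum of log-posteriors.
    The weights are hand-set here (assumption), not learned."""
    combined = Counter()
    for name, model in models.items():
        for c, lp in log_posterior(model, doc.get(name, [])).items():
            combined[c] += weights[name] * lp
    return max(combined, key=combined.get)

# Toy data (hypothetical): each document has a 'title' and a 'body' component.
examples = [
    ({"title": ["neural", "network"], "body": ["learning", "network", "weights"]}, "ml"),
    ({"title": ["stock", "market"], "body": ["price", "market", "trading"]}, "finance"),
]
models = train_components(examples)
weights = {"title": 1.0, "body": 1.0}
query = {"title": ["network"], "body": ["learning", "weights"]}
print(predict(models, weights, query))
```

Keeping one model per component lets each component have its own vocabulary and smoothing, and the weights control how much each component contributes to the final decision.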