Presentation 2005/7/15
E-mail Filtering Focusing on E-mail Field and Feature Selection : Naive Bayes Filtering with From Field and Jeffreys Perks
Noriaki KAWAMAE
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In this paper, we discuss methods of email filtering, that is, assigning emails to the appropriate folders based on the user's classification strategy. Our contribution is to focus on the e-mail fields, with a comparative study of email filtering in the framework of text filtering based on Naive Bayes. Although most previous work treats email in the same way as conventional text filtering, email has different characteristics from other kinds of text such as news and web pages. We assume two characteristics of email: (1) the small number of terms it contains, and (2) its structure of fields. Considering these characteristics, we propose methods for handling the email structure, estimating term probabilities, and selecting features by term weighting, so that Naive Bayes text filtering is optimized for email filtering. An experiment on a large mail archive, the ENRON corpus, shows that the methods using the Jeffreys-Perks law improved prediction accuracy by about 10% for a variety of known feature selection methods.
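The estimation step named in the abstract can be illustrated with a minimal sketch. It assumes the usual reading of the Jeffreys-Perks law as adding one half to every term count, and it assumes that the From field is turned into a single prefixed token. The function names, the `from:` prefix, and the toy messages are hypothetical and are not taken from the paper; its actual field handling and feature selection are not reproduced here.

```python
import math
from collections import Counter, defaultdict

LAMBDA = 0.5  # Jeffreys-Perks law: add one half to every term count


def tokenize(email):
    """Return the From field as one prefixed token, followed by the body terms.

    The "from:" prefix is an illustrative assumption that keeps the sender
    distinct from identical strings appearing in the body.
    """
    return ["from:" + email["from"].lower()] + email["body"].lower().split()


def train(labeled_emails):
    """Count terms per folder from a list of (email, folder) pairs."""
    term_counts = defaultdict(Counter)
    folder_counts = Counter()
    vocab = set()
    for email, folder in labeled_emails:
        folder_counts[folder] += 1
        for term in tokenize(email):
            term_counts[folder][term] += 1
            vocab.add(term)
    return term_counts, folder_counts, vocab


def term_prob(term, folder, term_counts, vocab):
    """P(term | folder) = (count + 0.5) / (folder total + 0.5 * |vocab|)."""
    total = sum(term_counts[folder].values())
    return (term_counts[folder][term] + LAMBDA) / (total + LAMBDA * len(vocab))


def classify(email, model):
    """Return the folder with the highest Naive Bayes log score."""
    term_counts, folder_counts, vocab = model
    n_emails = sum(folder_counts.values())
    best_folder, best_score = None, -math.inf
    for folder, count in folder_counts.items():
        score = math.log(count / n_emails)  # log prior of the folder
        for term in tokenize(email):
            if term in vocab:  # terms never seen in training are skipped
                score += math.log(term_prob(term, folder, term_counts, vocab))
        if score > best_score:
            best_folder, best_score = folder, score
    return best_folder


# Toy data, invented for illustration only.
training = [
    ({"from": "alice@example.com", "body": "meeting agenda for monday"}, "work"),
    ({"from": "alice@example.com", "body": "budget report attached"}, "work"),
    ({"from": "bob@example.com", "body": "dinner on saturday"}, "personal"),
]
model = train(training)
predicted = classify({"from": "bob@example.com", "body": "saturday plans"}, model)
```

Because short messages contain few terms, the added 0.5 keeps a term that is absent from a folder from driving that folder's probability to zero, while distorting the observed counts less than add-one smoothing would.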
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Text Filtering / E-mail Filtering / Naive Bayes / Jeffreys Perks / ENRON CORPUS
Paper # NLC2005-5
Date of Issue

Conference Information
Committee NLC
Conference Date 2005/7/15 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) E-mail Filtering Focusing on E-mail Field and Feature Selection : Naive Bayes Filtering with From Field and Jeffreys Perks
Sub Title (in English)
Keyword(1) Text Filtering
Keyword(2) E-mail Filtering
Keyword(3) Naive Bayes
Keyword(4) Jeffreys Perks
Keyword(5) ENRON CORPUS
1st Author's Name Noriaki KAWAMAE
1st Author's Affiliation NTT Information Sharing Platform Laboratories
Date 2005/7/15
Paper # NLC2005-5
Volume (vol) vol.105
Number (no) 203
Page pp.-
#Pages 7
Date of Issue