Presentation 2008-09-21
Efficient Spam Post Detection by Compression-based Measure Using Suffix Trees
Takashi UEMURA, Daisuke IKEDA, Hiroki ARIMURA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In this paper, we propose a content-based spam detection algorithm for blog spam and bulletin board spam. Given a document set D, our algorithm constructs a probabilistic model using suffix trees and detects the spam documents in D. Experimental results show that the algorithm performs well at detecting word salad spam, which is believed to be difficult to detect automatically.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) spam detection / suffix trees / probability estimation
Paper # DE2008-37
Date of Issue

Conference Information
Committee DE
Conference Date 2008/9/14 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Data Engineering (DE)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Efficient Spam Post Detection by Compression-based Measure Using Suffix Trees
Sub Title (in English)
Keyword(1) spam detection
Keyword(2) suffix trees
Keyword(3) probability estimation
1st Author's Name Takashi UEMURA
1st Author's Affiliation Graduate School of Information Science and Technology, Hokkaido University
2nd Author's Name Daisuke IKEDA
2nd Author's Affiliation Department of Informatics, Graduate School of Information Science and Electrical Engineering, Kyushu University
3rd Author's Name Hiroki ARIMURA
3rd Author's Affiliation Graduate School of Information Science and Technology, Hokkaido University
Date 2008-09-21
Paper # DE2008-37
Volume (vol) vol.108
Number (no) 211
Page pp.-
#Pages 2
Date of Issue