Presentation | 2008-09-21 Efficient Spam Post Detection by Compression-based Measure Using Suffix Trees. Takashi UEMURA, Daisuke IKEDA, Hiroki ARIMURA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In this paper, we propose a content-based spam detection algorithm for blog spam and bulletin-board spam. Given a document set D, our algorithm constructs a probabilistic model using suffix trees and detects the spam documents in D. Experimental results show that our algorithm performs well at detecting word-salad spam, which is believed to be difficult to detect automatically. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | spam detection / suffix trees / probability estimation |
Paper # | DE2008-37 |
Date of Issue |
Conference Information | |
Committee | DE |
---|---|
Conference Date | 2008/9/14 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Data Engineering (DE) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Efficient Spam Post Detection by Compression-based Measure Using Suffix Trees |
Sub Title (in English) | |
Keyword(1) | spam detection |
Keyword(2) | suffix trees |
Keyword(3) | probability estimation |
1st Author's Name | Takashi UEMURA |
1st Author's Affiliation | Graduate School of Information Science and Technology, Hokkaido University |
2nd Author's Name | Daisuke IKEDA |
2nd Author's Affiliation | Department of Informatics, Graduate School of Information Science and Electrical Engineering, Kyushu University |
3rd Author's Name | Hiroki ARIMURA |
3rd Author's Affiliation | Graduate School of Information Science and Technology, Hokkaido University |
Date | 2008-09-21 |
Paper # | DE2008-37 |
Volume (vol) | vol.108 |
Number (no) | 211 |
Page | pp.- |
#Pages | 2 |
Date of Issue |