Presentation 2003/7/11
Improvement of the Scalability for Web Log Mining with LCS
Seiji TODA, Haruo YOKOTA,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Nowadays, information distribution via websites is one of the most important issues. Therefore, website administrators are required to understand the access trends of their websites properly. We investigate a method proposed in our previous work for mining access logs with LCS (Longest Common Subsequences) to extract frequent access sequences. However, the method still has a scalability problem: its execution time increases for large websites. In this paper, we propose an approach of filtering sequences with a hash function to improve the performance of analyzing large websites. We evaluate the efficiency of the approach using access logs of a real website and artificial data.
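As a rough illustration of the idea in the abstract, the sketch below computes the LCS of two access sequences and skips the computation when a cheap hash-based check shows the result cannot be long enough. This is not the authors' algorithm: the names `lcs`, `shared_page_bound`, `common_path`, the `min_len` threshold, and the multiset-count filter are all assumptions made for this example, since the abstract does not describe the actual hash function.

```python
from collections import Counter


def lcs(a, b):
    """Return one longest common subsequence of two page-ID sequences."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])
    # Walk back through the table to recover the subsequence itself.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def shared_page_bound(a, b):
    """Upper bound on the LCS length, from hashed per-page counts (assumed filter)."""
    return sum((Counter(a) & Counter(b)).values())


def common_path(a, b, min_len=2):
    """Return the LCS of a and b, or [] if it is shorter than min_len."""
    # Cheap filter: the LCS can never be longer than the shared page count.
    if shared_page_bound(a, b) < min_len:
        return []
    path = lcs(a, b)
    return path if len(path) >= min_len else []


session_1 = ["/", "/news", "/products", "/contact"]
session_2 = ["/", "/products", "/faq", "/contact"]
print(common_path(session_1, session_2))
```

The filter costs time linear in the sequence lengths, while the LCS table costs time proportional to their product, so discarding hopeless pairs early is where any saving would come from in this sketch.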
Keyword(in Japanese) (See Japanese page)
Keyword(in English) web / data mining / longest common subsequences / access sequence extraction / hash / filtering
Paper # DE2003-90
Date of Issue

Conference Information
Committee DE
Conference Date 2003/7/11 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Data Engineering (DE)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Improvement of the Scalability for Web Log Mining with LCS
Sub Title (in English)
Keyword(1) web
Keyword(2) data mining
Keyword(3) longest common subsequences
Keyword(4) access sequence extraction
Keyword(5) hash
Keyword(6) filtering
1st Author's Name Seiji TODA
1st Author's Affiliation Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology
2nd Author's Name Haruo YOKOTA
2nd Author's Affiliation Global Scientific Information & Computing Center, Tokyo Institute of Technology:Department of Computer Science, Graduate School of Information Science and Engineering, Tokyo Institute of Technology
Date 2003/7/11
Paper # DE2003-90
Volume (vol) vol.103
Number (no) 192
Page pp.pp.-
#Pages 6
Date of Issue