Presentation 2002/10/10
Online Algorithms for Mining Semi-structured Data Stream with Sliding Window and Forgetting Factor
Tatsuya ASAI, Hiroki ARIMURA, Kenji ABE, Shinji KAWASOE, Setsuo ARIKAWA,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In this paper, we study an online data mining problem for streams of semi-structured data such as XML data. Modeling semi-structured data and patterns as labeled ordered trees, we present an online algorithm, StreamT, that receives fragments of unseen, possibly infinite semi-structured data in document order through a data stream and can return the current set of frequent patterns immediately on request at any time. A crucial part of our algorithm is the incremental maintenance of the occurrences of possibly frequent patterns using a tree sweeping technique. We also give modifications of the algorithm for other online mining models, and present theoretical and empirical analyses to evaluate its performance.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) semi-structured data mining / data stream / frequent pattern discovery / online mining algorithm
Paper # DC2002-88
Date of Issue

Conference Information
Committee DE
Conference Date 2002/10/10 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Data Engineering (DE)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Online Algorithms for Mining Semi-structured Data Stream with Sliding Window and Forgetting Factor
Sub Title (in English)
Keyword(1) semi-structured data mining
Keyword(2) data stream
Keyword(3) frequent pattern discovery
Keyword(4) online mining algorithm
1st Author's Name Tatsuya ASAI
1st Author's Affiliation Department of Informatics, Kyushu University
2nd Author's Name Hiroki ARIMURA
2nd Author's Affiliation Department of Informatics, Kyushu University / PRESTO, JST
3rd Author's Name Kenji ABE
3rd Author's Affiliation Department of Informatics, Kyushu University
4th Author's Name Shinji KAWASOE
4th Author's Affiliation Department of Informatics, Kyushu University
5th Author's Name Setsuo ARIKAWA
5th Author's Affiliation Department of Informatics, Kyushu University
Date 2002/10/10
Paper # DC2002-88
Volume (vol) vol.102
Number (no) 375
Page pp.-
#Pages 6
Date of Issue