Presentation title 2011-08-02
Wikipedia version tree reconstruction by clustering revisions through keywords
Abstract (Japanese)
Abstract (English) With the widespread diffusion of user-generated content, documents that have past versions are growing rapidly, especially wiki content and office documents. Wikipedia, for example, is the world's largest collaboratively edited source of encyclopedic knowledge. Anybody can edit an article using a wiki markup language that offers a simplified alternative to HTML. For each article, Wikipedia provides a way to export the edit history as an XML file with timestamps, which is essential for evaluating the trustworthiness and provenance of the article. The problem is that even with an edit history, it is still hard to know how an article has evolved: a tree structure is embedded in the linear structure of the timestamps. To overcome this problem, we propose a method that reconstructs a version tree by clustering versions through keywords. A version tree can explain how a document has evolved through collaborative editing and can illuminate dependencies among documents. In this paper, we present an experimental evaluation on a number of edit histories from Wikipedia to validate the proposed method.
Keywords (Japanese)
Keywords (English) version tree / Wikipedia / keyword / clustering
Report number DE2011-32
Date of issue

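Conference information
Committee DE
Conference dates 2011/7/26 (1 day)
Venue (Japanese)
Venue (English)
Theme (Japanese)
Theme (English)
Chair (Japanese)
Chair (English)
Vice chair (Japanese)
Vice chair (English)
Secretary (Japanese)
Secretary (English)
Assistant secretary (Japanese)
Assistant secretary (English)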

Paper information
Registered committee Data Engineering (DE)
Language of text ENG
Title (Japanese)
Subtitle (Japanese)
Title (English) Wikipedia version tree reconstruction by clustering revisions through keywords
Subtitle (English)
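Keyword (1) (Japanese/English) / version tree
First author name (Japanese/English) / Zhe Cao
First author affiliation (Japanese/English) Graduate School of Information, Production and Systems, Waseda University
Presentation date 2011-08-02
Report number DE2011-32
Volume (vol) vol.111
Issue (no) 173
Page range pp.-
Number of pages 6
Date of issue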