Presentation 2001/3/12
Selecting Index Terms with a Off-Line Processing for Case-Based Transformation of HTML Documents
Shinji Suzuki, Koji Iwanuma, Masayuki Umehara
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Recently, we proposed a case-based method for mechanically transforming a series of HTML documents into XML documents. Although the case-based method uses both syntactic structural features and semantic term occurrences appearing in HTML documents, it pays more attention to syntactic features than to semantic ones. In this paper, we investigate the importance of the semantic features of term occurrences. First, we study how to select important index terms from the target HTML documents and how to integrate tag information that denotes the meaning intended for human readers. Second, we use a thesaurus to handle synonyms. We experimentally evaluate the proposed methods on several HTML pages gathered from actual Web sites.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) HTML / conversion from HTML into XML / selecting index word / a case-based transformation / weighting / thesaurus
Paper # AI2000-70,KBSE78
Date of Issue

Conference Information
Committee AI
Conference Date 2001/3/12 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Artificial Intelligence and Knowledge-Based Processing (AI)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Selecting Index Terms with a Off-Line Processing for Case-Based Transformation of HTML Documents
Sub Title (in English)
Keyword(1) HTML
Keyword(2) conversion from HTML into XML
Keyword(3) selecting index word
Keyword(4) a case-based transformation
Keyword(5) weighting
Keyword(6) thesaurus
1st Author's Name Shinji Suzuki
1st Author's Affiliation Dept. of Computer Science and Media Engineering Yamanashi University
2nd Author's Name Koji Iwanuma
2nd Author's Affiliation Dept. of Computer Science and Media Engineering Yamanashi University
3rd Author's Name Masayuki Umehara
3rd Author's Affiliation Dept. of Computer Science and Media Engineering Yamanashi University
Date 2001/3/12
Paper # AI2000-70,KBSE78
Volume (vol) vol.100
Number (no) 709
Page pp.-
#Pages 4
Date of Issue