Presentation | 2001/7/11 An Indexing Method for Table Structures of HTML Format Masami SHISHIBORI, Yoshihiro IWAGUCHI, Minsoo JUNG, Jun-ichi AOE |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | HTML documents on the WWW frequently include table structures, which carry very useful information such as the meanings of and relations among the words in the table. In this paper, we propose a method for constructing an index that preserves the relations in HTML table structures. The method represents the position of each item in the table structure as a compact bit stream. Moreover, because the odd bits of this bit stream represent the row relation of each item and the even bits represent the column relation, the positional relations of items in the table can be compared easily and quickly. In an experiment using 200 HTML table structures collected by hand from the WWW, the method generated an index 87% smaller and compared positional relations 5.4 times faster than an indexing method that stores the row and column coordinates of each item. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Table structure analysis / Indexing / HTML document / Internet searching engine / Information extraction |
Paper # | DE2001-54 |
Date of Issue |
Conference Information | |
Committee | DE |
---|---|
Conference Date | 2001/7/11 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Data Engineering (DE) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | An Indexing Method for Table Structures of HTML Format |
Sub Title (in English) | |
Keyword(1) | Table structure analysis |
Keyword(2) | Indexing |
Keyword(3) | HTML document |
Keyword(4) | Internet searching engine |
Keyword(5) | Information extraction |
1st Author's Name | Masami SHISHIBORI |
1st Author's Affiliation | Dpt. of Information Science & Intelligent Systems, Faculty of Engineering, Tokushima University |
2nd Author's Name | Yoshihiro IWAGUCHI |
2nd Author's Affiliation | Dpt. of Information Science & Intelligent Systems, Faculty of Engineering, Tokushima University |
3rd Author's Name | Minsoo JUNG |
3rd Author's Affiliation | Dpt. of Information Science & Intelligent Systems, Faculty of Engineering, Tokushima University |
4th Author's Name | Jun-ichi AOE |
4th Author's Affiliation | Dpt. of Information Science & Intelligent Systems, Faculty of Engineering, Tokushima University |
Date | 2001/7/11 |
Paper # | DE2001-54 |
Volume (vol) | vol.101 |
Number (no) | 192 |
Page | pp.- |
#Pages | 8 |
Date of Issue |
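
The English abstract says that each item's position is stored as a bit stream whose odd bits carry the row relation and whose even bits carry the column relation. The record does not give the actual encoding, so the following is only a minimal sketch of one possible reading: plain bit interleaving of a row index and a column index. The function names, the 16-bit width, and the convention that bit 0 is the least significant bit are all assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical illustration only: interleave row and column indices so that
# odd bit positions hold the row and even bit positions hold the column.
# This is NOT the paper's encoding, which is not described in this record.

BITS = 16  # assumed width per coordinate

# Masks selecting the odd (row) and even (column) bit positions.
ROW_MASK = int("10" * BITS, 2)
COL_MASK = int("01" * BITS, 2)


def interleave(row: int, col: int) -> int:
    """Pack (row, col) into one integer: row in odd bits, col in even bits."""
    code = 0
    for i in range(BITS):
        code |= ((row >> i) & 1) << (2 * i + 1)  # row bit -> odd position
        code |= ((col >> i) & 1) << (2 * i)      # col bit -> even position
    return code


def same_row(a: int, b: int) -> bool:
    """Two codes share a row when their odd bits are identical."""
    return (a ^ b) & ROW_MASK == 0


def same_col(a: int, b: int) -> bool:
    """Two codes share a column when their even bits are identical."""
    return (a ^ b) & COL_MASK == 0


if __name__ == "__main__":
    header = interleave(0, 3)  # a cell in row 0, column 3
    value = interleave(5, 3)   # a cell in row 5, column 3
    label = interleave(5, 0)   # a cell in row 5, column 0
    print(same_col(header, value), same_row(value, label))
```

Under these assumptions, a row or column comparison is a single XOR and mask on one integer, rather than a comparison of two separately stored coordinates.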