Presentation 2010-11-19
Implementing a Web Page Segmentation Method Based on a Role of a Content
Hiroyuki SANO, Tatsuya DOI, Shun SHIRAMATSU, Tadachika OZONO, Toramatsu SHINTANI
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Our web page segmentation method divides a web page into Smallest-Blocks and then assembles some of the Smallest-Blocks into Content-Blocks. Although Smallest-Blocks can play many roles, we focused on the title of Web contents. We adopted 9 parameters for each Smallest-Block in decision tree learning and attempted to extract Title-Blocks from web pages. The experimental results show that the system can extract Title-Blocks with 95% recall.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Web Mining / Web Page Segmentation / Web Page Layout
Paper # AI2010-41
Date of Issue

Conference Information
Committee AI
Conference Date 2010/11/12 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Artificial Intelligence and Knowledge-Based Processing (AI)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Implementing a Web Page Segmentation Method Based on a Role of a Content
Sub Title (in English)
Keyword(1) Web Mining
Keyword(2) Web Page Segmentation
Keyword(3) Web Page Layout
1st Author's Name Hiroyuki SANO
1st Author's Affiliation Dept. of Computer Science and Engineering, Graduate School of Engineering, Nagoya Institute of Technology
2nd Author's Name Tatsuya DOI
2nd Author's Affiliation Dept. of Computer Science and Engineering, Graduate School of Engineering, Nagoya Institute of Technology
3rd Author's Name Shun SHIRAMATSU
3rd Author's Affiliation Dept. of Computer Science and Engineering, Graduate School of Engineering, Nagoya Institute of Technology
4th Author's Name Tadachika OZONO
4th Author's Affiliation Dept. of Computer Science and Engineering, Graduate School of Engineering, Nagoya Institute of Technology
5th Author's Name Toramatsu SHINTANI
5th Author's Affiliation Dept. of Computer Science and Engineering, Graduate School of Engineering, Nagoya Institute of Technology
Date 2010-11-19
Paper # AI2010-41
Volume (vol) vol.110
Number (no) 301
Page pp.-
#Pages 6
Date of Issue