Presentation | 2013/7/15 XML Documents Searching Combining Structure and Keywords Similarities APICHAYA AUVATTANASOMBAT, YOUSUKE WATANABE, HARUO YOKOTA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In recent years, XML has become an emerging standard and is widely used in many applications. For example, office documents, which are increasingly popular, are stored as archives of multiple XML parts. The structure and content of XML files play different roles depending on the kind of document. Therefore, similarity search over XML files should be based on both structure and content. In previous work, LAX+ was proposed as an algorithm for computing a similarity value from the structure and contents of the XML files in office documents. However, because LAX+ uses exact matching between corresponding leaves, similar words in leaf nodes are treated as different. To solve this problem, we propose combining LAX+ with keyword similarity in leaf nodes. We use the docx, xlsx, and pptx file formats as the experimental data set. The evaluation shows that our approach can improve precision and recall. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | XML Similarity / OOXML / Keyword Similarity / Document Search |
Paper # | Vol.2013-DBS-157 No.14,Vol.2013-IFAT-111 No.14 |
Date of Issue |
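
The limitation described in the abstract (exact matching treats similar leaf text as different) can be illustrated with a minimal sketch. This is not the LAX+ algorithm or the authors' method: the function names, the token-level Jaccard measure, and the best-match averaging are all assumptions chosen only to show how a graded leaf similarity differs from exact matching.

```python
import xml.etree.ElementTree as ET


def leaf_texts(xml_string):
    """Return the stripped text of every leaf element (no child elements)."""
    root = ET.fromstring(xml_string)
    return [
        el.text.strip()
        for el in root.iter()
        if len(el) == 0 and el.text and el.text.strip()
    ]


def exact_match(a, b):
    """Exact leaf comparison: 1.0 only when the two strings are identical."""
    return 1.0 if a == b else 0.0


def keyword_similarity(a, b):
    """Hypothetical keyword measure: Jaccard overlap of lowercase tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def leaf_score(leaves_a, leaves_b, sim):
    """Average, over leaves of A, of the best similarity to any leaf of B."""
    if not leaves_a or not leaves_b:
        return 0.0
    best = [max(sim(x, y) for y in leaves_b) for x in leaves_a]
    return sum(best) / len(best)


# Two toy documents whose titles differ by one word (illustrative data only).
doc_a = "<doc><title>XML similarity search</title><body><p>office documents</p></body></doc>"
doc_b = "<doc><title>XML similarity</title><body><p>office documents</p></body></doc>"

leaves_a, leaves_b = leaf_texts(doc_a), leaf_texts(doc_b)
exact = leaf_score(leaves_a, leaves_b, exact_match)
soft = leaf_score(leaves_a, leaves_b, keyword_similarity)
print(f"exact matching: {exact:.3f}, keyword similarity: {soft:.3f}")
```

With exact matching the near-identical titles contribute nothing to the score, while the token-based measure gives them partial credit. A real system would combine such a leaf score with a structural comparison of the element paths, which this sketch omits.
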
Conference Information | |
Committee | DE |
---|---|
Conference Date | 2013/7/15 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Data Engineering (DE) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | XML Documents Searching Combining Structure and Keywords Similarities |
Sub Title (in English) | |
Keyword(1) | XML Similarity |
Keyword(2) | OOXML |
Keyword(3) | Keyword Similarity |
Keyword(4) | Document Search |
1st Author's Name | APICHAYA AUVATTANASOMBAT |
1st Author's Affiliation | Tokyo Institute of Technology / Chulalongkorn University |
2nd Author's Name | YOUSUKE WATANABE |
2nd Author's Affiliation | Tokyo Institute of Technology |
3rd Author's Name | HARUO YOKOTA |
3rd Author's Affiliation | Tokyo Institute of Technology |
Date | 2013/7/15 |
Paper # | Vol.2013-DBS-157 No.14,Vol.2013-IFAT-111 No.14 |
Volume (vol) | vol.113 |
Number (no) | 150 |
Page | pp.- |
#Pages | 6 |
Date of Issue |