Presentation | 2004/9/3 Information Extraction from Various Document Formats Based on PDL Analysis Takashi Hirano, Taizo Kameshiro, Yasuhiro Okada, Fumio Yoda |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | We propose a document analysis method that extracts text information from files in various document formats. The method generates a PDL (Page Description Language) data file by running a dummy printing process on the document file. During PDL data analysis, text is extracted directly from the PDL data, and character recognition is applied to the images it contains. This allows text to be extracted without loss from various kinds of document files, such as electronic documents, image documents, and CAD data. We present the design of the method and discuss experimental results. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | |
Paper # | PRMU2004-66 |
Date of Issue |
Conference Information | |
Committee | PRMU |
---|---|
Conference Date | 2004/9/3 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Pattern Recognition and Media Understanding (PRMU) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Information Extraction from Various Document Formats Based on PDL Analysis |
Sub Title (in English) | |
Keyword(1) | |
1st Author's Name | Takashi Hirano |
1st Author's Affiliation | Mitsubishi Electric Corporation, Information Technology R&D Center |
2nd Author's Name | Taizo Kameshiro |
2nd Author's Affiliation | Mitsubishi Electric Corporation, Information Technology R&D Center |
3rd Author's Name | Yasuhiro Okada |
3rd Author's Affiliation | Mitsubishi Electric Corporation, Information Technology R&D Center |
4th Author's Name | Fumio Yoda |
4th Author's Affiliation | Mitsubishi Electric Corporation, Information Technology R&D Center |
Date | 2004/9/3 |
Paper # | PRMU2004-66 |
Volume (vol) | vol.104 |
Number (no) | 290 |
Page | pp.- |
#Pages | 6 |
Date of Issue |