Presentation | 2002/3/7 A Study on Document Retrieval System for Large-Scale Database Based on OCR and Character Shape Information Taizo KAMESHIRO, Yoshinori YAMAGISHI, Takashi HIRANO, Yasuhiro OKADA, Fumio YODA |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Building a large database of electronic documents from paper documents still poses a major problem. To make a document image searchable, general electronic filing systems must convert the document into text using OCR. However, such systems cannot retrieve documents whose text was not converted to the correct character codes. We previously proposed a document retrieval method that reduces false drops and false alarms by using a "shape-feature" technique that describes the outline of each character's shape. In this study, we applied the method to a large-scale database using parallel processing and confirmed its effectiveness. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Document Image / Character Recognition / Full-Text Search / Shape Feature / Parallel Processing / Scalability |
Paper # | NLC2001-96 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2002/3/7 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A Study on Document Retrieval System for Large-Scale Database Based on OCR and Character Shape Information |
Sub Title (in English) | |
Keyword(1) | Document Image |
Keyword(2) | Character Recognition |
Keyword(3) | Full-Text Search |
Keyword(4) | Shape Feature |
Keyword(5) | Parallel Processing |
Keyword(6) | Scalability |
1st Author's Name | Taizo KAMESHIRO |
1st Author's Affiliation | Information Technology R&D Center, Mitsubishi Electric Corp. |
2nd Author's Name | Yoshinori YAMAGISHI |
2nd Author's Affiliation | Information & Communication Systems Development Center, Mitsubishi Electric Corp. |
3rd Author's Name | Takashi HIRANO |
3rd Author's Affiliation | Information Technology R&D Center, Mitsubishi Electric Corp. |
4th Author's Name | Yasuhiro OKADA |
4th Author's Affiliation | Information Technology R&D Center, Mitsubishi Electric Corp. |
5th Author's Name | Fumio YODA |
5th Author's Affiliation | Information Technology R&D Center, Mitsubishi Electric Corp. |
Date | 2002/3/7 |
Paper # | NLC2001-96 |
Volume (vol) | vol.101 |
Number (no) | 711 |
Page | pp.- |
#Pages | 8 |
Date of Issue |