Presentation 2002/3/7
A Study on Document Retrieval System for Large-Scale Database Based on OCR and Character Shape Information
Taizo KAMESHIRO, Yoshinori YAMAGISHI, Takashi HIRANO, Yasuhiro OKADA, Fumio YODA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Building a large database of electronic documents from paper documents still poses a serious problem. To search such a database for a document image, general electronic filing systems must first convert the document into text using OCR. However, such a system cannot retrieve documents whose OCR output does not contain the correct character codes. We previously proposed a document retrieval method that reduces false drops and false alarms by using a "shape-feature" technique that describes the outline of each character's shape. We have applied this method to a large-scale database using parallel processing and confirmed its effectiveness.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Document Image / Character Recognition / Full-Text Search / Shape Feature / Parallel Processing / scalability
Paper # NLC2001-96
Date of Issue

Conference Information
Committee NLC
Conference Date 2002/3/7 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A Study on Document Retrieval System for Large-Scale Database Based on OCR and Character Shape Information
Sub Title (in English)
Keyword(1) Document Image
Keyword(2) Character Recognition
Keyword(3) Full-Text Search
Keyword(4) Shape Feature
Keyword(5) Parallel Processing
Keyword(6) scalability
1st Author's Name Taizo KAMESHIRO
1st Author's Affiliation Information Technology R&D Center, Mitsubishi Electric Corp.
2nd Author's Name Yoshinori YAMAGISHI
2nd Author's Affiliation Information & Communication Systems Development Center, Mitsubishi Electric Corp.
3rd Author's Name Takashi HIRANO
3rd Author's Affiliation Information Technology R&D Center, Mitsubishi Electric Corp.
4th Author's Name Yasuhiro OKADA
4th Author's Affiliation Information Technology R&D Center, Mitsubishi Electric Corp.
5th Author's Name Fumio YODA
5th Author's Affiliation Information Technology R&D Center, Mitsubishi Electric Corp.
Date 2002/3/7
Paper # NLC2001-96
Volume (vol) vol.101
Number (no) 711
Page pp.-
#Pages 8