Presentation | 2004/2/12 Development of Document Retrieval System Tolerant of Segmentation Errors of Document Images Takeshi NAGASAKI, Katsumi MARUKAWA |
---|---|
PDF Download Page | PDF download Page Link |
Abstract (in Japanese) | (See Japanese page) |
Abstract (in English) | This paper describes a new document retrieval method that is tolerant of OCR segmentation errors on document images. OCR-based document retrieval systems suffer from both segmentation and recognition errors. The proposed method overcomes these problems in two phases. First, the OCR engine outputs multiple hypotheses of character segmentation and recognition. Second, the retrieval engine extracts keywords from the hypotheses using lexicon-driven DP matching. We have applied this method to handwritten and printed document images and demonstrated its effectiveness in reducing false drops and false alarms in retrieval. |
Keyword (in Japanese) | (See Japanese page) |
Keyword (in English) | Segmentation Error / OCR / Document Retrieval / Lexicon Driven Dynamic Programming |
Paper # | TL2003-29,PRMU2003-215 |
Date of Issue | |
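
The two phases named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lattice format (edges between cut points carrying character candidates with costs), the cost values, the per-character threshold, and the names `match_keyword` and `spot_keywords` are all assumptions made for illustration. The first phase is represented by a hand-written lattice in which the span between cut points 0 and 2 can be read either as "c" + "l" or as a single "d"; the second phase is a dynamic program that finds the cheapest path spelling each lexicon word.

```python
import math

# Hypothetical OCR output: (start_cut, end_cut, {character: recognition cost}).
# Cut points 0..5; the edge (0, 2) is an alternative segmentation of "cl" as "d".
EDGES = [
    (0, 1, {"c": 0.2, "e": 0.9}),
    (1, 2, {"l": 0.3, "1": 0.4}),
    (0, 2, {"d": 0.25}),
    (2, 3, {"o": 0.1}),
    (3, 4, {"u": 0.2}),
    (4, 5, {"d": 0.1}),
]
NUM_CUTS = 6


def match_keyword(edges, num_cuts, keyword):
    """Return (cost, start_cut, end_cut) of the cheapest path spelling keyword, or None."""
    unreachable = (math.inf, -1)
    # layer[p] = (cost, start_cut) of the best partial match ending at cut point p.
    # A match may start at any cut point, so every entry begins at cost 0.
    layer = [(0.0, p) for p in range(num_cuts)]
    for ch in keyword:
        nxt = [unreachable] * num_cuts
        for start, end, candidates in edges:
            if ch in candidates and layer[start][0] < math.inf:
                extended = (layer[start][0] + candidates[ch], layer[start][1])
                if extended < nxt[end]:
                    nxt[end] = extended
        layer = nxt
    end = min(range(num_cuts), key=lambda p: layer[p])
    cost, start = layer[end]
    return None if cost == math.inf else (cost, start, end)


def spot_keywords(edges, num_cuts, lexicon, max_cost_per_char=0.3):
    """Return {word: (cost, start_cut, end_cut)} for lexicon words found in the lattice."""
    hits = {}
    for word in lexicon:
        if not word:
            continue  # an empty word would match trivially; skip it
        match = match_keyword(edges, num_cuts, word)
        if match is not None and match[0] / len(word) <= max_cost_per_char:
            hits[word] = match
    return hits


hits = spot_keywords(EDGES, NUM_CUTS, ["cloud", "loud", "cat"])
```

Because the lattice keeps both segmentations, "cloud" is still found even if the single-best OCR reading would have been "doud"; a word such as "cat" has no path and is rejected.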
Conference Information | |
Committee | PRMU |
---|---|
Conference Date | 2004/2/12 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant | |
Paper Information | |
Registration To | Pattern Recognition and Media Understanding (PRMU) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Development of Document Retrieval System Tolerant of Segmentation Errors of Document Images |
Sub Title (in English) | |
Keyword(1) | Segmentation Error |
Keyword(2) | OCR |
Keyword(3) | Document Retrieval |
Keyword(4) | Lexicon Driven Dynamic Programming |
1st Author's Name | Takeshi NAGASAKI |
1st Author's Affiliation | Hitachi, Ltd., Central Research Laboratory |
2nd Author's Name | Katsumi MARUKAWA |
2nd Author's Affiliation | Hitachi, Ltd., Central Research Laboratory |
Date | 2004/2/12 |
Paper # | TL2003-29,PRMU2003-215 |
Volume (vol) | vol.103 |
Number (no) | 658 |
Page | pp.- |
#Pages | 6 |
Date of Issue | |