Presentation | 2004/11/12 Artistic Line Extraction from Indian Documents Umapada Pal, Partha Pratim Roy, N. Tripathy, Hiroyuki Hase |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Some printed artistic documents contain text lines that are not parallel to each other within a single page. These lines may have different orientations or may be curved. For Optical Character Recognition (OCR) of such documents, the lines must be extracted properly, but their multi-oriented and curved nature makes this very difficult. In this paper, we propose a scheme based on the water reservoir principle to extract individual text lines from printed Indian artistic documents. In the proposed scheme, we first compute the mode of each component (portrait, landscape, reverse portrait, or reverse landscape) by analyzing the area of the reservoirs obtained in that component. Next, based on the mode and on water reservoir features such as the number of reservoirs, the height of the reservoirs, and the overlapping portion of two reservoirs, the components are grouped into an isolated class or a touching class. Then, depending on the reservoir base-area and the loops of a component, candidate envelope points are detected. Each touching component is classified as straight or curved according to its candidate envelope points. Based on this type, two boundary points are computed for each touching component. Finally, candidate regions (neighborhoods) of the boundary points of each component are detected, and individual text lines are segmented by analyzing these regions. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Text line extraction / Artistic document analysis / Multi-oriented document recognition / Indian document analysis |
Paper # | PRMU2004-116,HIP2004-56 |
Date of Issue |
Conference Information | |
Committee | PRMU |
---|---|
Conference Date | 2004/11/12 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Pattern Recognition and Media Understanding (PRMU) |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Artistic Line Extraction from Indian Documents |
Sub Title (in English) | |
Keyword(1) | Text line extraction |
Keyword(2) | Artistic document analysis |
Keyword(3) | Multi-oriented document recognition |
Keyword(4) | Indian document analysis |
1st Author's Name | Umapada Pal |
1st Author's Affiliation | Computer Vision and Pattern Recognition Unit, Indian Statistical Institute |
2nd Author's Name | Partha Pratim Roy |
2nd Author's Affiliation | Computer Vision and Pattern Recognition Unit, Indian Statistical Institute |
3rd Author's Name | N. Tripathy |
3rd Author's Affiliation | Computer Vision and Pattern Recognition Unit, Indian Statistical Institute |
4th Author's Name | Hiroyuki Hase |
4th Author's Affiliation | Computer Vision and Pattern Recognition Unit, Indian Statistical Institute |
Date | 2004/11/12 |
Paper # | PRMU2004-116,HIP2004-56 |
Volume (vol) | vol.104 |
Number (no) | 448 |
Page | pp.- |
#Pages | 6 |
Date of Issue |
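
The abstract's first step, analyzing the area of the water reservoirs obtained in a component, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a component is given as a small binary grid (1 = ink), that water poured from one side is held only by the outer profile seen from that side (overhangs and inner holes are ignored), and that a reservoir is a contiguous run of columns holding water. The function names `top_profile`, `reservoirs`, and `reservoir_area_by_side` are hypothetical, and the rule that maps these areas to a mode is not given in the abstract, so it is left out.

```python
def top_profile(grid):
    """For each column, the height of ink measured from the bottom row
    up to the first ink pixel met when looking down from the top."""
    rows = len(grid)
    profile = []
    for col in zip(*grid):  # each column, listed top to bottom
        first = next((r for r, px in enumerate(col) if px), rows)
        profile.append(rows - first)  # 0 when the column has no ink
    return profile


def reservoirs(profile):
    """Return (start_col, end_col, area) for each cavity of the profile
    that would hold water poured from above."""
    n = len(profile)
    left_max, right_max = [0] * n, [0] * n
    running = 0
    for i in range(n):
        running = max(running, profile[i])
        left_max[i] = running
    running = 0
    for i in range(n - 1, -1, -1):
        running = max(running, profile[i])
        right_max[i] = running
    # Water depth per column is bounded by the lower of the two walls.
    water = [min(left_max[i], right_max[i]) - profile[i] for i in range(n)]

    found, start = [], None
    for i, depth in enumerate(water + [0]):  # sentinel closes a trailing run
        if depth > 0 and start is None:
            start = i
        elif depth == 0 and start is not None:
            found.append((start, i - 1, sum(water[start:i])))
            start = None
    return found


def reservoir_area_by_side(grid):
    """Total reservoir area when water is poured from each of the four sides.
    Each side is handled by turning the grid so that side faces up."""
    views = {
        "top": grid,
        "bottom": grid[::-1],                               # vertical flip
        "left": [list(r) for r in zip(*grid[::-1])],        # rotate clockwise
        "right": [list(r) for r in zip(*grid)][::-1],       # rotate counter-clockwise
    }
    return {
        side: sum(area for _, _, area in reservoirs(top_profile(view)))
        for side, view in views.items()
    }


# A "U"-shaped component: open at the top, closed on the other three sides.
u_shape = [
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
areas = reservoir_area_by_side(u_shape)
```

For the U-shaped example, only the pour from the top finds a cavity, which is the kind of asymmetry between sides that an orientation decision could build on. The other reservoir features named in the abstract (height, overlap, base-area) would need further definitions that the abstract does not supply.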