Presentation title 2004/11/12
Artistic Line Extraction from Indian Documents
Abstract (Japanese)
Abstract (English) There are printed artistic documents where the text lines of a single page may not be parallel to each other. These text lines may have different orientations or may be curved. For the Optical Character Recognition (OCR) of these documents, we need to extract such lines properly. Because the lines are multi-oriented and curved, it is very difficult to extract the different text lines from the document. In this paper, we propose a water reservoir principle based scheme to extract individual text lines from printed Indian artistic documents. In the proposed scheme, first, by analyzing the area of the reservoirs obtained in a component, we compute the mode (portrait, landscape, reverse portrait, or reverse landscape) of the component. Next, based on the mode and on water reservoir features such as the number of reservoirs, the height of reservoirs, and the overlapping portion of two reservoirs, the components are grouped into an isolated or a touching class. Next, depending on the reservoir base-area and loops of a component, some candidate envelope points are detected. Each touching component is then classified as either straight or curved type, depending on the candidate envelope points of the component. Based on the type of a component, two boundary points are computed from each touching component. Finally, candidate regions (neighborhoods) of the boundary points of each component are detected, and by analyzing these candidate regions, individual text lines are segmented.
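The water reservoir idea named in the abstract can be illustrated with a short sketch. This is not the authors' implementation: it assumes a component is given as a binary grid (1 = ink, 0 = background), treats only the upper surface of the component as the container (overhangs are ignored), and uses hypothetical names (Reservoir, top_profile, top_reservoirs, bottom_reservoirs). It shows how pouring water from the top or from the bottom yields reservoirs whose number, area, and depth could then serve as features.

```python
from typing import List, NamedTuple


class Reservoir(NamedTuple):
    first_col: int  # leftmost column holding water
    last_col: int   # rightmost column holding water
    area: int       # number of water-filled pixels
    depth: int      # deepest water column in the reservoir


def top_profile(component: List[List[int]]) -> List[int]:
    """Height of the topmost ink pixel in each column (0 for an empty column)."""
    rows = len(component)
    profile = []
    for col in range(len(component[0])):
        height = 0
        for row in range(rows):
            if component[row][col]:
                height = rows - row
                break
        profile.append(height)
    return profile


def top_reservoirs(component: List[List[int]]) -> List[Reservoir]:
    """Reservoirs formed when water is poured onto the component from above.

    Simplifying assumption: water rests on the top profile only, so a column
    holds water up to the lower of the highest walls to its left and right.
    """
    profile = top_profile(component)
    n = len(profile)
    left_wall = [0] * n
    right_wall = [0] * n
    highest = 0
    for i in range(n):
        highest = max(highest, profile[i])
        left_wall[i] = highest
    highest = 0
    for i in range(n - 1, -1, -1):
        highest = max(highest, profile[i])
        right_wall[i] = highest
    water = [min(left_wall[i], right_wall[i]) - profile[i] for i in range(n)]

    # Group consecutive water-holding columns into one reservoir each.
    reservoirs = []
    start = None
    for i, level in enumerate(water + [0]):  # trailing 0 closes the last run
        if level > 0 and start is None:
            start = i
        elif level == 0 and start is not None:
            span = water[start:i]
            reservoirs.append(Reservoir(start, i - 1, sum(span), max(span)))
            start = None
    return reservoirs


def bottom_reservoirs(component: List[List[int]]) -> List[Reservoir]:
    """Reservoirs seen from below: flip the component vertically, then pour."""
    return top_reservoirs(component[::-1])
```

For a U-shaped component, top_reservoirs finds a single reservoir and bottom_reservoirs finds none; for an upside-down U the result is reversed. Comparing reservoir areas across pouring directions in this way is one plausible reading of how the mode of a component might be estimated, but the abstract does not give the exact rule.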
Keywords (Japanese)
Keywords (English) Text line extraction / Artistic document analysis / Multi-oriented document recognition / Indian document analysis
Report number PRMU2004-116,HIP2004-56
Publication date

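Technical committee information
Technical committee PRMU
Dates held 2004/11/12 (1 day)
Venue (Japanese)
Venue (English)
Theme (Japanese)
Theme (English)
Chair name (Japanese)
Chair name (English)
Vice chair name (Japanese)
Vice chair name (English)
Secretary name (Japanese)
Secretary name (English)
Assistant secretary name (Japanese)
Assistant secretary name (English)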

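Paper information details
Registered technical committee Pattern Recognition and Media Understanding (PRMU)
Language of paper ENG
Title (Japanese)
Subtitle (Japanese)
Title (English) Artistic Line Extraction from Indian Documents
Subtitle (Japanese)
Keyword (1) (Japanese/English) / Text line extraction
1st author name (Japanese/English) / Umapada Pal
1st author affiliation (Japanese/English)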
Computer Vision and Pattern Recognition Unit, Indian Statistical Institute
Presentation date 2004/11/12
Report number PRMU2004-116,HIP2004-56
Volume (vol) vol.104
Issue (no) 448
Page range pp.-
Number of pages 6
Publication date