Presentation 1994/2/18
Robust and fast text-line extraction from document images
Hideaki Goto, Hirotomo Aso
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Text area extraction is an important step that precedes character recognition in document images. This paper describes a new algorithm for text-line extraction from document images. The algorithm groups together locally linear elements in the image; these groups are assumed to be text-lines and are extracted from the image. The algorithm is independent of document structure and robust to image distortion. Primitive rectangles are introduced as an intermediate representation; they are easier and faster to create than the usual circumscribed rectangles. A method for splitting the bridges between neighboring text-lines is also proposed. Combining bridge splitting with text-line extraction allows locally touching text-lines to be extracted as individual lines.
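Illustrative sketch (not taken from the paper): the abstract does not define the primitive rectangles or the grouping rule, so the following Python fragment only illustrates the general idea of grouping locally aligned elements into text-lines. It assumes plain bounding boxes as input and links two boxes when they are horizontally close and overlap vertically; the function name `group_into_lines` and the parameters `max_gap` and `min_overlap` are hypothetical. Bridge splitting is not attempted here.

```python
from typing import Dict, List, Tuple

# Assumed input format: axis-aligned boxes (x0, y0, x1, y1) with x0 < x1, y0 < y1.
Rect = Tuple[int, int, int, int]


def group_into_lines(rects: List[Rect],
                     max_gap: int = 10,
                     min_overlap: float = 0.5) -> List[List[Rect]]:
    """Group boxes into candidate text-lines by chaining close neighbors.

    Hypothetical rule, not the paper's: two boxes are linked when the
    horizontal gap between them is at most `max_gap` pixels and their
    vertical overlap is at least `min_overlap` of the shorter box height.
    Linked boxes are merged with a union-find structure.
    """
    parent = list(range(len(rects)))

    def find(i: int) -> int:
        # Find the set representative, halving paths along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i: int, j: int) -> None:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    for i, a in enumerate(rects):
        for j in range(i + 1, len(rects)):
            b = rects[j]
            # Horizontal gap (negative when the boxes overlap horizontally).
            gap = max(a[0], b[0]) - min(a[2], b[2])
            # Vertical overlap compared with the shorter of the two boxes.
            overlap = min(a[3], b[3]) - max(a[1], b[1])
            shorter = min(a[3] - a[1], b[3] - b[1])
            if gap <= max_gap and shorter > 0 and overlap >= min_overlap * shorter:
                union(i, j)

    groups: Dict[int, List[Rect]] = {}
    for i, r in enumerate(rects):
        groups.setdefault(find(i), []).append(r)

    # Order boxes left to right within a line, and lines top to bottom.
    lines = [sorted(g) for g in groups.values()]
    lines.sort(key=lambda g: min(r[1] for r in g))
    return lines


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (12, 1, 22, 11), (0, 30, 10, 40),
             (14, 31, 24, 41), (26, 0, 36, 10)]
    for line in group_into_lines(boxes):
        print(line)
```

Because boxes are linked only to nearby neighbors, a line that drifts or bends slightly can still be followed box by box. This is one plausible reading of "locally linear" in the abstract, not a confirmed description of the authors' method.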
Keyword(in Japanese) (See Japanese page)
Keyword(in English) document image analysis / text-line extraction / primitive rectangle / bridge splitting
Paper # PRU93-134
Date of Issue

Conference Information
Committee PRU
Conference Date 1994/2/18 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Pattern Recognition and Understanding (PRU)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Robust and fast text-line extraction from document images
Sub Title (in English)
Keyword(1) document image analysis
Keyword(2) text-line extraction
Keyword(3) primitive rectangle
Keyword(4) bridge splitting
1st Author's Name Hideaki Goto
1st Author's Affiliation Department of Information Engineering, Faculty of Engineering, Tohoku University
2nd Author's Name Hirotomo Aso
2nd Author's Affiliation Department of Electrical Communications, Faculty of Engineering, Tohoku University
Date 1994/2/18
Paper # PRU93-134
Volume (vol) vol.93
Number (no) 479
Page pp.-
#Pages 8
Date of Issue