Presentation 2011-03-10
Document Image Binarization to Extract Text Patterns from Low Resolution Color Images
Hiroshi Tanaka, Yusaku Fujii, Yoshinobu Hotta
Abstract(in Japanese) (See Japanese page)
Abstract(in English) We propose a text image binarization method that is robust to changes in image resolution. Common text binarization methods have two steps: a text region extraction step and a regional binarization step. Because our system uses Niblack binarization for the second step, pixels of narrow strokes may be dropped in low-resolution images. We adopt a threshold correction method that restores the dropped pixels and improves the quality of binarized text images. Character recognition experiments on 150-600 dpi images show the effectiveness of our method.
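For readers unfamiliar with the regional binarization step named in the abstract, the following is a minimal sketch of standard Niblack thresholding, where each pixel is compared with a local threshold T = m + k * s computed from the mean m and standard deviation s of a surrounding window. The function names, the window size, and the value k = -0.2 are illustrative assumptions, not settings taken from the paper. The authors' threshold correction step is not described in the abstract, so it is not sketched here.

```python
import numpy as np


def niblack_threshold(gray, window=15, k=-0.2):
    """Per-pixel Niblack threshold T = mean + k * std over a square window.

    `window` (odd) and `k` are assumed, illustrative values.
    """
    gray = np.asarray(gray, dtype=np.float64)
    pad = window // 2
    # Reflect-pad so that every pixel has a full window around it.
    padded = np.pad(gray, pad, mode="reflect")
    # View of all window x window neighbourhoods: shape (H, W, window, window).
    windows = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    mean = windows.mean(axis=(2, 3))
    std = windows.std(axis=(2, 3))
    return mean + k * std


def niblack_binarize(gray, window=15, k=-0.2):
    """Boolean mask that is True where a pixel is darker than its local threshold."""
    gray = np.asarray(gray, dtype=np.float64)
    return gray < niblack_threshold(gray, window, k)
```

As a usage example, calling `niblack_binarize(img, window=5)` on a white image containing a single one-pixel-wide black stroke marks only the stroke pixels as text. Smaller windows and lower contrast, as in low-resolution scans, make thin strokes harder to separate from the local threshold.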
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Binarization / Threshold / OCR / Document Image / Character Recognition / Image Resolution
Paper # PRMU2010-254
Date of Issue

Conference Information
Committee PRMU
Conference Date 2011/3/3 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Pattern Recognition and Media Understanding (PRMU)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Document Image Binarization to Extract Text Patterns from Low Resolution Color Images
Sub Title (in English)
Keyword(1) Binarization
Keyword(2) Threshold
Keyword(3) OCR
Keyword(4) Document Image
Keyword(5) Character Recognition
Keyword(6) Image Resolution
1st Author's Name Hiroshi Tanaka
1st Author's Affiliation Fujitsu Laboratories Ltd.
2nd Author's Name Yusaku Fujii
2nd Author's Affiliation Fujitsu Laboratories Ltd.
3rd Author's Name Yoshinobu Hotta
3rd Author's Affiliation Fujitsu Laboratories Ltd.
Date 2011-03-10
Paper # PRMU2010-254
Volume (vol) vol.110
Number (no) 467
Page pp.-
#Pages 6
Date of Issue