Presentation title | 2006-11-24 Recognition-based Segmentation for Digitization of Korean Historical Document Pages (Character and document processing) |
---|---|
Abstract (Japanese) | |
Abstract (English) | We present a recognition-based digitization method for building a digital library from a large volume of historical archives. Digitizing historical document pages is essential for providing retrieval services and protecting the originals from damage, but it requires laborious manual verification to produce accurate output. In this paper, a split-merge approach is applied to segment overlapping and touching characters written with thick brushes. Character string images are split into primitive segments by nonlinear segmentation paths that pass through maximum-curvature points. The split segments are then merged within a single probabilistic framework that integrates layout analysis, context information, and recognition results. In experiments, our system achieved a character recognition rate of 96.4% on the test data set, despite the obsolete characters and unique variants used in the archives. In conclusion, our method can be applied to digitizing Korean historical document pages and can minimize manual verification. |
Keywords (Japanese) | |
Keywords (English) | Historical document pages / Digital library / Digitization of documents / Character segmentation |
Report number | PRMU2006-144 |
Date of issue |
Technical committee information | |
Technical committee | PRMU |
---|---|
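Meeting dates | 2006/11/17 (1-day meeting starting on this date) |
Venue (Japanese) | |
Venue (English) | |
Theme (Japanese) | |
Theme (English) | |
Chair name (Japanese) | |
Chair name (English) | |
Vice chair name (Japanese) | |
Vice chair name (English) | |
Secretary name (Japanese) | |
Secretary name (English) | |
Assistant secretary name (Japanese) | |
Assistant secretary name (English) |
Paper details | |
Registered technical committee | Pattern Recognition and Media Understanding (PRMU) |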
---|---|
Language of text | ENG |
Title (Japanese) | |
Subtitle (Japanese) | |
Title (English) | Recognition-based Segmentation for Digitization of Korean Historical Document Pages (Character and document processing) |
Subtitle (Japanese) | |
Keyword (1) (Japanese/English) | / Historical document pages |
First author name (Japanese/English) | / Kyu-Tae Cho |
First author affiliation (Japanese/English) | |
Presentation date | 2006-11-24 |
Report number | PRMU2006-144 |
Volume (vol) | vol.106 |
Issue (no) | 376 |
Page range | pp.- |
Number of pages | 7 |
Date of issue |