Presentation 2005-03-17
Bayesian Inference in Document Understanding
Jin Hyung Kim
Abstract(in Japanese) (See Japanese page)
Abstract(in English) The Bayesian approach has recently gained popularity as a basic inference engine for online and offline document understanding. It is applied at almost every step of document analysis and understanding, including image enhancement, structure modeling, model-based matching, postprocessing with stochastic language models, and multiple classifier combination. It is applied under various frameworks, such as the hidden Markov model, the Bayesian network, and the Markov random field. A large portion of this talk is devoted to the Bayesian network approach to online and offline character recognition with stochastic stroke analysis. Because structural analysis is valuable in oriental character recognition, a unified probabilistic framework has been developed for modeling the shapes of constituent strokes and their relations. This modeling scheme is a novel combination of structural analysis and the statistical approach that preserves the advantages of both. By exploiting the point-stroke-character hierarchy and the statistical dependencies within that structure, characters are modeled as a Bayesian network, which overcomes both the crudeness of the naive Bayesian approach and the complexity of the brute-force Bayesian approach. As a result, character shapes and stroke relationships can be learned from a training data set, and model analysis yields a probability value that can be used for further analysis and combined easily with other analysis results. Recent and ongoing work at KAIST on various methodologies and innovative applications of document analysis will also be mentioned briefly. The methodologies include example-based Bayesian image enhancement by super-resolution, dependency-based multiple classifier combination, and stochastic language modeling techniques based on an oriental language corpus; the applications include historical document analysis for digital libraries and an interface based on writing in 3D space.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Document analysis / Handwritten oriental character recognition / Bayesian network / Stroke shape analysis / Stroke relationship analysis
Paper # TL2004-60,PRMU2004-228
Date of Issue

Conference Information
Committee TL
Conference Date 2005/3/10 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Thought and Language (TL)
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Bayesian Inference in Document Understanding
Sub Title (in English)
Keyword(1) Document analysis
Keyword(2) Handwritten oriental character recognition
Keyword(3) Bayesian network
Keyword(4) Stroke shape analysis
Keyword(5) Stroke relationship analysis
1st Author's Name Jin Hyung Kim
1st Author's Affiliation Computer Science Department
Date 2005-03-17
Paper # TL2004-60,PRMU2004-228
Volume (vol) vol.104
Number (no) 739
Page pp.-
#Pages 87