An integrated system for the analysis and the recognition of characters in ancient documents
Conference Paper
Publication Date:
2002
abstract:
This paper describes an integrated system for processing and analyzing highly degraded ancient printed documents. For each page, the system reduces noise by wavelet-based filtering, extracts the text lines and segments them into characters by fast adaptive thresholding, and performs OCR with a feed-forward multilayer neural network trained by back-propagation. The recognition probability is used as a discriminant parameter that automatically triggers a feedback process, which returns control to a block for refining the segmentation. This block acts only on the small portions of the text where the recognition was not reliable, and makes use of blind deconvolution and Markov random field (MRF)-based segmentation techniques.
The experimental results highlight the good performance of the whole system in the analysis of even strongly degraded texts.
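The feedback step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `recognize_with_feedback`, the `threshold` value of 0.8, and the toy `classify` and `refine` callables are all assumptions introduced here for illustration. In the paper, the classifier is a neural network and the refinement relies on blind deconvolution and MRF-based segmentation; here both are replaced by lookup tables so that only the control flow is shown.

```python
from typing import Callable, List, Tuple

Region = str                      # stand-in for an image patch (assumption)
Prediction = Tuple[str, float]    # (label, recognition probability)


def recognize_with_feedback(
    regions: List[Region],
    classify: Callable[[Region], Prediction],
    refine: Callable[[Region], List[Region]],
    threshold: float = 0.8,       # hypothetical confidence cut-off
) -> List[Prediction]:
    """Classify each region; re-segment only the low-confidence ones."""
    output: List[Prediction] = []
    for region in regions:
        first = classify(region)
        if first[1] >= threshold:
            # Confident enough: no feedback needed for this region.
            output.append(first)
            continue
        # Low confidence: refine the segmentation and classify again.
        retried = [classify(sub) for sub in refine(region)]
        # Accept the refinement only if every new prediction is more
        # confident than the original one; otherwise keep the original.
        if retried and min(prob for _, prob in retried) > first[1]:
            output.extend(retried)
        else:
            output.append(first)
    return output


# Toy stand-ins (hypothetical data, not from the paper).
_TOY_SCORES = {
    "a": ("a", 0.95),
    "rn_touching": ("m", 0.40),   # two touching glyphs misread as one
    "r": ("r", 0.90),
    "n": ("n", 0.85),
    "b": ("b", 0.30),
}
_TOY_SPLITS = {"rn_touching": ["r", "n"]}


def toy_classify(region: Region) -> Prediction:
    return _TOY_SCORES[region]


def toy_refine(region: Region) -> List[Region]:
    # Regions with no known split are returned unchanged.
    return _TOY_SPLITS.get(region, [region])
```

Calling `recognize_with_feedback(["a", "rn_touching", "b"], toy_classify, toy_refine)` sends only the second and third regions through the refinement path, which mirrors the abstract's point that the costly segmentation block acts only on the portions of text where recognition was not reliable.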
Iris type:
04.01 Contribution in conference proceedings
Keywords:
Character recognition
List of contributors: