Publication Date:
2011
Abstract:
A number of methods to extract information from digital images of documents are described. The appearance of a document can be seen as the superposition of a number of information layers (the "patterns"), and is represented by a vector image whose components (the "channels") are determined by the type of diversity used to capture the image. Our data model considers each channel as a function of all the patterns. Starting from the appearance data, we use the chosen mathematical model, together with physical and statistical constraints on the patterns, to develop a strategy for isolating the different patterns. In many cases, this allows us to separate features that are superimposed on one another. Finally, examples are shown where the strategies introduced are used either to clean the document appearance (mitigation of interferences) or to extract partially hidden or entangled patterns, such as stamps, watermarks, and erased strokes.
IRIS type:
02.01 Contribution in volume (Chapter or Essay)
Keywords:
Document image processing; Virtual restoration; Pattern extraction
List of contributors:
Salerno, Emanuele; Tonazzini, Anna
Book title:
Selected papers from DSP application day 2010