A Survey on History, Present and Perspectives of Document Image Analysis Systems

Author(s): Iulia-Cristina Stănică, Costin-Anton Boiangiu, Giorgiana Violeta Vlăsceanu, Marcel Prodan, Cristian Avatavului, Răzvan-Adrian Deaconescu, Codrin Val Tăut
Subject(s): Social Sciences, Education, Higher Education
Published by: Carol I National Defence University Publishing House
Keywords: Document Image Analysis System; Optical Character Recognition; OCR; Retroconversion

Summary/Abstract: We live in a century of technology, in which the rapid growth of data and science has fostered a strong interest in processing, transmitting, and storing information. In the past, only a human mind could extract meaningful information from image data; after decades of dedicated research, scientists have built complex systems that can identify distinct regions, tables, and text in scanned documents, making the extracted information easy to access and share. Books, newspapers, maps, letters, drawings: all types of documents can be scanned and processed to make them available in digital format. Digital storage takes far less space than physical documents, so these applications can replace millions of old paper volumes with a single storage disk whose contents are accessible simultaneously to anyone with Internet access, with no risk of deterioration. Document image analysis systems can also address other problems, such as ecological issues and accessibility and flexibility constraints. This article presents the methods and techniques used to process paper documents and convert them to electronic ones, starting at the pixel level and working up to the level of the entire document. The main purpose of Document Image Analysis Systems is to recognize text and graphical content in images and to extract, format, and present the information they contain according to people's needs. We also aim to provide solid ground for practitioners who implement such systems to enhance their unsupervised processing features, in order to make physical documents easily available to the masses.

  • Issue Year: 15/2019
  • Issue No: 01
  • Page Range: 188-193
  • Page Count: 6
  • Language: English