
Serbian Early Printed Books from Venice: Creating Models for Automatic Text Recognition Using Transkribus

Author(s): Vladimir Polomac
Subject(s): Language and Literature Studies, Applied Linguistics, Studies of Literature, Computational linguistics, Serbian Literature, Philology
Published by: Институт за литература - БАН (Institute for Literature, Bulgarian Academy of Sciences)
Keywords: Transkribus; Automatic Text Recognition; Serbian Early Printed Books; Artificial Intelligence; Machine Learning; Venice.

Summary/Abstract: The paper describes the creation of a model for the automatic recognition of Serbian Church Slavonic printed books from Venice (from the printery of Božidar and Vincenzo Vuković) using Transkribus, a software platform based on artificial intelligence and machine learning. Using the Prayer Book (Euchologion) (1538–1540) from Božidar Vuković’s printery as an example, the paper shows that a successful model for the automatic recognition of an individual book (with around 5% of unrecognized characters) can be trained on material of only about 4,000 words, and that increasing the amount of training material (in this case to around 38,000 words) improves the model and reduces the error rate (to between 1% and 2% of unrecognized characters). The most notable result of the paper is a generic model for the automatic text recognition of Serbian Church Slavonic books from Božidar and Vincenzo Vuković’s printery. The initial version of the generic model, called Dionisio 1.0 after Božidar Vuković’s Italian pseudonym, Dionisio della Vecchia, is the first resource for the automatic recognition of the Serbian medieval Cyrillic script that is publicly available to all users of the Transkribus software platform (see https://readcoop.eu/model/dionisio-1-0/).
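
The percentages of unrecognized characters quoted above are the kind of figure usually reported as a character error rate (CER): the edit distance between the recognized text and a manually corrected reference, divided by the length of the reference. The following is a minimal sketch of that calculation, assuming the abstract's figures correspond to CER in this sense. The function names and the sample strings are hypothetical and are not taken from the paper or from Transkribus.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the number of characters in the reference."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Invented sample: one substituted character in the recognized line.
    reference = "господи помилуй"
    hypothesis = "госпоци помилуй"
    print(f"CER: {character_error_rate(reference, hypothesis):.2%}")
```

Under this definition, an error rate of around 5% means roughly one character edit per twenty characters of reference text, and 1% to 2% means one or two edits per hundred characters.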

  • Issue Year: 2022
  • Issue No: 22
  • Page Range: 11-29
  • Page Count: 19
  • Language: English