Large-Scale Optical Character Recognition of Ancient Greek

Article
Publication date:
2017
Abstract:
This paper documents our campaign to undertake the large-scale optical character recognition of ancient, or polytonic, Greek. Building upon the Gamera OCR engine and developing a suite of post-processing tools, including automatic spellcheck, we processed 1,200 volumes comprising 329,002,271 Greek words. A sample of 10 pages is studied in detail; they demonstrate the degree to which each step of post-processing improved the results, and with which source documents. These pages attain an average character accuracy of about 96%. These results will provide a basis for further improvements, including the training of other open-source OCR engines.
CRIS type:
01.01 Journal article
Keywords:
OCR; Ancient Greek
Author list:
Boschetti, Federico
Institutional authors:
BOSCHETTI FEDERICO
Link to full record:
https://iris.cnr.it/handle/20.500.14243/340936
Published in:
MOUSEION
Journal
General Data

URL

https://doi.org/10.3138/mous.14.3-3