Abstract

Federico Boschetti, Matteo Romanello, Alison Babeu, David Bamman, Gregory Crane, Improving OCR Accuracy for Classical Critical Editions

This paper describes a workflow designed to populate a digital library of ancient Greek critical editions with highly accurate OCR-scanned text. The most recent OCR engines can, after suitable training, handle the polytonic Greek fonts used in 19th- and 20th-century editions, but postprocessing can improve accuracy further. In particular, the paper discusses a progressive multiple alignment method applied to different OCR outputs based on the same images.
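
To make the idea of progressive multiple alignment concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the +1/-1 scoring, the fixed gap penalty, and the per-column majority vote are all assumptions chosen for illustration. Each OCR output is aligned in turn against the alignment built so far, and the most frequent character in each column is kept.

```python
from collections import Counter

GAP = "\0"  # marks "no character here"; assumed never to occur in OCR text


def column_score(column, char):
    """+1 for each aligned character equal to `char`, -1 for each that differs."""
    return sum(1 if c == char else -1 for c in column if c != GAP)


def add_to_alignment(columns, text, width, gap_penalty=-1):
    """Align `text` against `columns` (tuples of length `width`) with a
    Needleman-Wunsch style dynamic program; return columns of length width + 1."""
    n, m = len(columns), len(text)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap_penalty
    for j in range(1, m + 1):
        score[0][j] = j * gap_penalty
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + column_score(columns[i - 1], text[j - 1]),
                score[i - 1][j] + gap_penalty,
                score[i][j - 1] + gap_penalty,
            )
    # Trace back from the bottom-right cell, preferring a direct pairing.
    merged, i, j = [], n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j]
                == score[i - 1][j - 1] + column_score(columns[i - 1], text[j - 1])):
            merged.append(columns[i - 1] + (text[j - 1],))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap_penalty:
            merged.append(columns[i - 1] + (GAP,))  # new text lacks this column
            i -= 1
        else:
            merged.append((GAP,) * width + (text[j - 1],))  # extra character
            j -= 1
    merged.reverse()
    return merged


def merge_outputs(texts):
    """Progressively align all OCR outputs, then take a majority vote per column."""
    columns = [(c,) for c in texts[0]]
    for k, text in enumerate(texts[1:], start=1):
        columns = add_to_alignment(columns, text, width=k)
    winners = (Counter(col).most_common(1)[0][0] for col in columns)
    return "".join(c for c in winners if c != GAP)


# Three hypothetical OCR readings of the same word, each with a different error.
print(merge_outputs(["critical", "critcal", "criticai"]))
```

With an odd number of engines, a character misread by only one of them is outvoted by the others. A real system would likely weight engines by reliability and treat ties deliberately, which this sketch does not attempt.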
