Bradski, G. (2000). The OpenCV Library. Dr. Dobb’s Journal of Software Tools, 25(11), 120–125.

Breuel, T. M. (2008). The OCRopus open source OCR system. Proc. SPIE 6815, Document Recognition and Retrieval XV, 68150F. Electronic Imaging 2008, San Jose, California, USA.

Carrasco, R. C. (2014). An open-source OCR evaluation tool. Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage – DATeCH ’14 (pp. 179–184).

Hegghammer, T. (2022). OCR with Tesseract, Amazon Textract, and Google Document AI: A benchmarking experiment. Journal of Computational Social Science, 5(1), 861–882.

Kettunen, K., Koistinen, M., & Kervinen, J. (2020). Ground truth OCR sample data of Finnish historical newspapers and journals in data improvement validation of a re-OCRing process. LIBER Quarterly, 30(1).

Kiessling, B. (2019). Kraken - a universal text recognizer for the humanities. Digital Humanities Conference 2019 (DH2019).

Levenshtein, V. (1965). Binary codes capable of correcting spurious insertions and deletions of ones. Problems of Information Transmission, 1, 8–17.

Luxemburger Wort. (1942). Neues Kleid. Luxemburger Wort, 2.3.1942(61), 1.

Maurer, Y. (2017). Improving the quality of the text, a pilot project to assess and correct the OCR in a multilingual environment. Relying on News Media. Long Term Preservation and Perspectives for Our Collective Memory.

Neudecker, C., Baierer, K., Federbusch, M., Boenig, M., Würzner, K.-M., Hartmann, V., & Herrmann, E. (2019). OCR-D: An end-to-end open source OCR framework for historical printed documents. Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage (pp. 53–58).

Nguyen, T. T. H., Jatowt, A., Nguyen, N.-V., Coustaty, M., & Doucet, A. (2020). Neural machine translation with BERT for post-OCR error detection and correction. Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 (pp. 333–336).

Schneider, P. (2021). Combining morphological and histogram based text line segmentation in the OCR context. Journal of Data Mining & Digital Humanities, 2021 (HistoInformatics).

Schneider, P., & Maurer, Y. (2022). Rerunning OCR - A machine learning approach to quality assessment and enhancement prediction. Journal of Data Mining & Digital Humanities.

Smith, R. (2007). An overview of the Tesseract OCR engine. Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), Curitiba, Brazil (pp. 629–633).

Soper, E., Fujimoto, S., & Yu, Y.-Y. (2021). BART for post-correction of OCR newspaper text. Proceedings of the Seventh Workshop on Noisy User-Generated Text (W-NUT 2021) (pp. 284–290).

The Luxembourg Government. (n.d.). The AI4gov initiative. Retrieved November 2, 2022, from

Van de Camp, M. (2008). Explorations into unsupervised corpus quality assessment (Doctoral dissertation, Tilburg University, The Netherlands). Retrieved November 9, 2022, from