Computational Transcription of Medieval Hebrew Manuscripts and Crowdsourcing their Corrections

We will present initial results from two computational projects on medieval Hebrew manuscripts. The first, Sofer Mahir, applies an HTR (handwritten text recognition) pipeline constructed at Scripta-PSL to the major manuscripts of the classical compositions of the tannaitic period of Rabbinic Judaism. The second, Tikkoun Sofrim, applies the same pipeline to manuscripts of early medieval Tanhuma-Yelamdenu Midrashim; for this project we have developed a crowdsourcing platform that lets citizen scientists suggest corrections to the automatic transcription.

Speakers

Daniel Stökl Ben Ezra
Research Professor, École Pratique des Hautes Études (EPHE), Paris

A graduate of the Hebrew University of Jerusalem (2001), Daniel Stökl Ben Ezra has been research professor of Ancient Hebrew and Aramaic since 2011. His main work concerns the Dead Sea Scrolls (Qumran, Stuttgart: UTB 2016), early Rabbinic literature (editions.erabbinica.org, with Hayim Lapin), and Jewish-Christian relations (Impact of Yom Kippur on Early Christianity, Tübingen: Mohr-Siebeck 2003). He was director of the digital humanities programme at the EPHE (2013-2018) and is currently director of the Scripta-numérique project of PSL, which is creating a VRE (virtual research environment) that combines AI (computer vision, NLP, machine learning) and DH for the study of manuscripts and inscriptions.