---
layout: default
title: Extracting Texts with Optical Character Recognition
author: Joel Kalvesmaki
permalink: /ocr/
---

{{ page.title }}

Led by {{ page.author }}

Welcome

Welcome to the North American Patristics Society workshop on optical character recognition (OCR), which lets you extract text from an image. In this workshop we will explore how you can use OCR to build a corpus of texts for your research. We will focus on printed sources, primarily in English and secondarily in other languages important for early Christian studies (Greek, Latin, Syriac, and Coptic). We will survey some of the available OCR applications and then demonstrate actual work in Tesseract.
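To give a rough sense of what a Tesseract run looks like before the videos, here is a minimal sketch in Python that wraps the `tesseract` command-line program. It is an illustration under stated assumptions, not the workflow taught in the workshop: the function names and the file `scans/page-001.png` are hypothetical, and the language codes (`eng`, `grc`, `lat`, `syr`) only work if the matching language data is installed. Run `tesseract --list-langs` to see what your installation has; Coptic will likely need a separately obtained model.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, languages=("eng",)):
    """Return the argument list for one Tesseract run (nothing is executed)."""
    if not languages:
        raise ValueError("at least one language code is required")
    # Tesseract accepts several languages joined by "+", e.g. "eng+grc".
    return ["tesseract", str(image), str(output_base), "-l", "+".join(languages)]


def ocr_page(image, languages=("eng",)):
    """Run Tesseract on one page image and return the path of the text file."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    # Hypothetical naming choice: write the text next to the image,
    # so scans/page-001.png produces scans/page-001.txt.
    output_base = Path(image).with_suffix("")
    subprocess.run(
        build_tesseract_command(image, output_base, languages), check=True
    )
    # Tesseract appends ".txt" to the output base on its own.
    return Path(str(output_base) + ".txt")
```

For an English page with Greek quotations, a call such as `ocr_page("scans/page-001.png", ("eng", "grc"))` would be the starting point. Listing the primary language first is a reasonable default, but results vary by scan, so treat the language order as something to experiment with.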

During the live hands-on session, we will gather as a group to help one another debug our processes and optimize our workflows. We will also discuss

Workshop Materials:

All materials for bulk download

 

Workshop Videos:

Part 1: Welcome and Introduction

 

<iframe width="560" height="315" src="https://www.youtube.com/embed/zkxIwbJltDs" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Part 2: The OCR Process

 

<iframe width="560" height="315" src="https://www.youtube.com/embed/4U1ffgxLTws" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Part 3: Setting up a Workflow

 

<iframe width="560" height="315" src="https://www.youtube.com/embed/l75kl_r-pSo" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Preliminary Questions and Answers

Have any questions before we meet? Have an answer to contribute? Please join in at our shared Q&A document.

Live Hands-on Sessions

Live, hands-on sessions are scheduled as follows:

  • Monday, May 3, 5:00 p.m. Pacific / 6:00 p.m. Mountain / 7:00 p.m. Central / 8:00 p.m. Eastern
  • Tuesday, May 4, 8:00 a.m. Pacific / 9:00 a.m. Mountain / 10:00 a.m. Central / 11:00 a.m. Eastern

You may attend the first, the second, or both. A Zoom link to these sessions will be sent under separate cover.

If you have questions not addressed by the videos, or if you have certain scans or languages you want to see demonstrated, drop an email to [email protected].