dataforgoodfr/batch9_dyslexia

batch9_dyslexia

OCR development

Install

You first need to install tesseract

On Mac

brew install tesseract

This installs tesseract with English support only. To add other languages (French, for instance), also install the language packs:

brew install tesseract-lang

On Windows

  1. Download Binary from https://github.com/UB-Mannheim/tesseract/wiki
  2. Run the executable to install. By default it installs to C:\Program Files (x86)\Tesseract-OCR
  3. Make sure your TESSDATA_PREFIX environment variable is set correctly
  • Go to Control Panel -> System -> Advanced System Settings -> Advanced tab -> Environment Variables... button
  • In the System variables window, scroll down to TESSDATA_PREFIX. If its value is wrong, select it and click Edit...

On Linux

sudo apt-get update
sudo apt-get install tesseract-ocr
sudo apt-get install libtesseract-dev

Then install the Python package:

pip install tesseract
pip install tesseract-ocr
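After installing the binary and the Python package, you can sanity-check that the tesseract executable is discoverable on your PATH. This is a minimal stdlib sketch, independent of any Python binding:

```python
import shutil
import subprocess

def tesseract_available() -> bool:
    """Return True if the tesseract binary is discoverable on PATH."""
    return shutil.which("tesseract") is not None

if tesseract_available():
    # Print the first line of version info, e.g. "tesseract 5.x.x"
    out = subprocess.run(["tesseract", "--version"],
                         capture_output=True, text=True)
    print(out.stdout.splitlines()[0])
else:
    print("tesseract binary not found -- check your installation / PATH")
```

If the binary is missing, `pytesseract`-style backends will fail at runtime even though the pip install succeeded, so this check is worth running once.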

You can now install the dyslexia package

Clone the repository, then run:

cd batch9_dyslexia
pip install .

Use pip install -e . instead if you are developing the package (editable install).

dyslexia package

Submodules Description

  • app: main functions such as pipeline() or get_results()
  • io: input/output functions such as load_image()
  • plots: plotting functions such as plot_image()
  • preprocessing: preprocessing functions such as image_to_gray()
  • ocr: OCR functions using the tesseract backend

Using the package

Example using dyslexia

from dyslexia import preprocessing
from dyslexia.io import load_image
from dyslexia.ocr import extract_text_from_image

fpath = 'Exemples/SVT/IMG_20210329_123029.jpg'
image_orig = load_image(fpath)
image_no_shadow = preprocessing.remove_shadow(image_orig)
image_gray = preprocessing.image_to_gray(image_no_shadow, threshold=True)

result = extract_text_from_image(image_gray)

Using the pipeline

from dyslexia import preprocessing
from dyslexia.app import get_results
from dyslexia.io import load_image

fpath = 'Exemples/SVT/IMG_20210329_123029.jpg'
image_orig = load_image(fpath)
image_no_shadow = preprocessing.remove_shadow(image_orig)
image_gray = preprocessing.image_to_gray(image_no_shadow, threshold=True)

result = get_results(image_gray)


App

Run app

uvicorn app:app --port 5000

Access the Swagger UI at http://127.0.0.1:5000/docs#/

Endpoint

/ocr_file/

Takes a file object as input and returns the OCR results in the form

{"paragraphs" : ["....", "...."], "bboxes": [[0,0,100,50], [0,100,100,50]]}

Where paragraphs is the list of detected paragraphs and bboxes the bounding box coordinates (x1, y1, w, h) for each paragraph
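The two lists are parallel, so each paragraph can be paired with its box. A small stdlib sketch (using the response shape documented above, with illustrative values) pairs them and converts (x1, y1, w, h) into corner coordinates:

```python
import json

# Example response in the documented shape (values are illustrative)
raw = '{"paragraphs": ["....", "...."], "bboxes": [[0, 0, 100, 50], [0, 100, 100, 50]]}'
result = json.loads(raw)

for text, (x1, y1, w, h) in zip(result["paragraphs"], result["bboxes"]):
    # Convert (x1, y1, width, height) to (x1, y1, x2, y2) corners
    x2, y2 = x1 + w, y1 + h
    print(f"box ({x1}, {y1}) -> ({x2}, {y2}): {text!r}")
```

Corner coordinates are what most drawing libraries (e.g. PIL's `ImageDraw.rectangle`) expect when overlaying the detected paragraphs on the original image.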

/ocr_url/

Takes as input an image URL (passed as the url query parameter) and returns the OCR results in the form

{"paragraphs" : ["....", "...."], "bboxes": [[0,0,100,50], [0,100,100,50]]}

Where paragraphs is the list of detected paragraphs and bboxes the bounding box coordinates (x1, y1, w, h) for each paragraph

Example query:

curl -X 'POST' \
  'http://127.0.0.1:5000/ocr_url/?url=https%3A%2F%2Fdata2.unhcr.org%2Fimages%2Fdocuments%2Fbig_4cda85d892a5c0b5dd63b510a9c83e9c9d06e739.jpg' \
  -H 'accept: application/json' \
  -d ''
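The percent-encoded query string in the curl example can be built from Python with the standard library. This snippet only constructs the request URL (it does not call the server):

```python
from urllib.parse import urlencode

# The raw image URL from the curl example above
image_url = ("https://data2.unhcr.org/images/documents/"
             "big_4cda85d892a5c0b5dd63b510a9c83e9c9d06e739.jpg")

# urlencode percent-encodes the value, matching the curl query string
query = urlencode({"url": image_url})
endpoint = f"http://127.0.0.1:5000/ocr_url/?{query}"
print(endpoint)
```

From there you can POST to `endpoint` with any HTTP client (e.g. `urllib.request` or `requests`) while the uvicorn server is running.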

Docker

docker-compose build

docker-compose up

Eval Scripts

dyslexia eval-txt-folder --truth_path tests/data/truth/ --hypothesis_path tests/data/hypothesis/

Output:

wer : 0.16666666666666666
mer : 0.16129032258064516
wil : 0.27311827956989243
wip : 0.7268817204301076
hits : 26.0
substitutions : 4.0
deletions : 0.0
insertions : 1.0
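The headline metrics follow directly from the edit counts in the output: WER = (S + D + I) / (H + S + D), i.e. errors over reference words, and MER = (S + D + I) / (H + S + D + I). A quick check against the numbers above:

```python
hits, substitutions, deletions, insertions = 26, 4, 0, 1

errors = substitutions + deletions + insertions

# Word Error Rate: errors over the number of reference words (H + S + D)
wer = errors / (hits + substitutions + deletions)

# Match Error Rate: errors over all aligned words (H + S + D + I)
mer = errors / (hits + substitutions + deletions + insertions)

print(wer)  # 0.16666666666666666
print(mer)  # 0.16129032258064516
```

This reproduces the wer and mer values reported by the eval script, which makes the counts easy to sanity-check by hand.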
