usage of ocrd-cis-postcorrect in ocrd_all #171

Open
cneud opened this issue Aug 20, 2020 · 1 comment
Labels
enhancement (New feature or request), question (Further information is requested)

Comments

cneud (Member) commented Aug 20, 2020

Running ocrd_cis/ocrd-cis-postcorrect requires additional components that afaict are currently not installed with ocrd_all.

See cisocrgroup/ocrd_cis#51 (comment)

In order to run our post correction, both our profiler and a corresponding language backend have to be installed on the system. The configuration variable profilerPath (which should more appropriately be named profilerCommand) must point to the profiler executable, and the profilerConfig variable must point to the corresponding language configuration file. There is a manual for the profiler and the language backend in our repositories.

  1. Profiler / Installation
  2. Language resources / Installation

The relevant section of the Workflow Guide suggests a workaround:

If you don't want to use a profiler, you can set the value for "profilerConfig" to "ignored". In this case, your profiler.bash should look like this:

    #!/bin/bash
    cat > /dev/null
    echo '{}'

You need to pass your local path to the model on your hard drive as parameter value for this processor to work!
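For anyone wanting to try the workaround quoted above, here is a minimal sketch of how the stub could be set up and checked. It writes the three-line profiler.bash from the Workflow Guide, confirms that it discards its input and prints an empty JSON object, and then writes a parameter file. The parameter names profilerPath and profilerConfig come from the quotes above; the "model" key, the file name postcorrect-params.json and the temporary directory are assumptions made for illustration, so check them against the processor's own parameter description before relying on them.

```shell
#!/bin/bash
set -eu

# Hypothetical scratch directory for this sketch.
workdir="$(mktemp -d)"

# Stub profiler as quoted from the Workflow Guide:
# discard everything on stdin, then print an empty JSON object.
cat > "$workdir/profiler.bash" <<'EOF'
#!/bin/bash
cat > /dev/null
echo '{}'
EOF
chmod +x "$workdir/profiler.bash"

# Check that the stub swallows arbitrary input and returns an empty profile.
stub_output="$(echo 'any input' | "$workdir/profiler.bash")"
echo "stub profiler output: $stub_output"

# Parameter file for the processor. "model" is an ASSUMED key name and the
# model path is a placeholder; replace it with your local path to the model.
cat > "$workdir/postcorrect-params.json" <<EOF
{
  "profilerPath": "$workdir/profiler.bash",
  "profilerConfig": "ignored",
  "model": "/path/to/your/model"
}
EOF
cat "$workdir/postcorrect-params.json"
```

With this setup the processor would receive an empty profile on every call, so no profiler-generated correction candidates are available to it.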

However, in light of the above comment, how useful is this? Will it still perform corrections and to what extent will the rate of corrections drop without the Profiler?

Should the missing components be included in ocrd_all? Or, failing that, could the documentation be extended to describe the effect of running ocrd-cis-postcorrect with and without the Profiler?

stweil added the enhancement (New feature or request) label on Aug 21, 2020
cneud added the question (Further information is requested) label on Sep 26, 2020
bertsky (Collaborator) commented Oct 8, 2020

However, in light of the above comment, how useful is this? Will it still perform corrections and to what extent will the rate of corrections drop without the Profiler?

AFAICT not very useful. IIRC there still is a re-ranking component running after the profiler that collects and weighs all candidates, but if no profiler runs, then only the pre-existing OCR hypotheses are taken into account (i.e. no edits and no ranking based on maximum entropy adaptation). So if you are running single-OCR and without alternatives, you would get no change – IIUC.

But these are all very good questions that really only @finkf can answer.
