Processing is slow; maybe there is some potential for optimizing with simple means #72

Closed
markusweigelt opened this issue Feb 23, 2024 · 2 comments · Fixed by slub/ocrd_controller#38

@markusweigelt
Collaborator

OCR processing of six TIFF files is very slow. It takes 2:45 minutes to finish (roughly 25 seconds per page).
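
To see where the time goes, one option is to look at the pauses between consecutive log lines. The following is only a minimal sketch, assuming each line starts with a Docker-style timestamp such as `2024-02-23T17:39:23.002086840Z`, as in the log in this report. The helper names `parse_line` and `largest_gaps` are made up for illustration and are not part of ocrd_manager or OCR-D.

```python
import re
from datetime import datetime, timezone

# Leading Docker-style timestamp with fractional seconds, followed by the message.
TS = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d+)Z\s+(.*)$")


def parse_line(line):
    """Return (seconds since epoch, message), or None if the line has no timestamp."""
    m = TS.match(line)
    if m is None:
        return None
    base = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    base = base.replace(tzinfo=timezone.utc)
    fraction = float("0." + m.group(2))  # nanosecond digits as a fraction of a second
    return base.timestamp() + fraction, m.group(3)


def largest_gaps(lines, top=5):
    """Return the `top` longest pauses as (gap_seconds, message_before_the_pause)."""
    parsed = [p for p in map(parse_line, lines) if p is not None]
    gaps = [
        (later[0] - earlier[0], earlier[1])
        for earlier, later in zip(parsed, parsed[1:])
    ]
    return sorted(gaps, key=lambda g: g[0], reverse=True)[:top]
```

Feeding the saved log into `largest_gaps(open("manager.log").read().splitlines())` (the file name is a placeholder) lists the steps that were followed by the longest waits. Read by eye, the excerpt below already hints at the pattern: consecutive `ocrd-import - adding` lines are roughly 4 to 5 seconds apart, while the `Processing page` entries of `tesserocr-recognize` follow each other within 1 to 2 seconds.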

2024-02-23T17:39:00.625388219Z # ocrd-controller:22 SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.11
2024-02-23T17:39:00.644442553Z # ocrd-controller:22 SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.11
2024-02-23T17:39:00.712902085Z # ocrd-controller:22 SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.11
2024-02-23T17:39:00.720364377Z # ocrd-controller:22 SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.11
2024-02-23T17:39:00.721615933Z # ocrd-controller:22 SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.11
2024-02-23T17:39:00.965586869Z  * Starting enhanced syslogd rsyslogd [ OK ]
2024-02-23T17:39:00.996466810Z  * Starting OpenBSD Secure Shell server sshd [ OK ]
2024-02-23T17:39:02.999967028Z Feb 23 17:39:01 ocrd-manager rsyslogd: rsyslogd's groupid changed to 106
2024-02-23T17:39:03.000001113Z Feb 23 17:39:01 ocrd-manager rsyslogd: rsyslogd's userid changed to 105
2024-02-23T17:39:03.000102256Z Feb 23 17:39:01 ocrd-manager rsyslogd: [origin software="rsyslogd" swVersion="8.2001.0" x-pid="41" x-info="https://www.rsyslog.com"] start
2024-02-23T17:39:21.001845171Z Feb 23 17:39:20 ocrd-manager process_images.sh: ocr_init initialize variables and directory structure
2024-02-23T17:39:21.001870972Z Feb 23 17:39:20 ocrd-manager process_images.sh: running with --proc-id testdata-kitodo --task-id 1 /data/testdata-kitodo CONTROLLER=ocrd-controller:22 ACTIVEMQ=kitodo-mq:61616
2024-02-23T17:39:21.001876461Z Feb 23 17:39:20 ocrd-manager process_images.sh: using workflow '/workflows/ocr-workflow-default.sh':
2024-02-23T17:39:21.001880104Z Feb 23 17:39:20 ocrd-manager process_images.sh: "tesserocr-recognize -P segmentation_level region -P model frak2021 -I OCR-D-IMG -O OCR-D-OCR" "fileformat-transform -P from-to \"page alto\" -P script-args \"--no-check-border --dummy-word\" -I OCR-D-OCR -O FULLTEXT" 
2024-02-23T17:39:22.001851250Z Feb 23 17:39:21 ocrd-manager process_images.sh: {
2024-02-23T17:39:22.001878423Z Feb 23 17:39:21 ocrd-manager process_images.sh:   acknowledged: true,
2024-02-23T17:39:22.001883055Z Feb 23 17:39:21 ocrd-manager process_images.sh:   insertedId: ObjectId("65d8d849a9a529735a6555b3")
2024-02-23T17:39:22.001886405Z Feb 23 17:39:21 ocrd-manager process_images.sh: }
2024-02-23T17:39:22.001889203Z Feb 23 17:39:21 ocrd-manager process_images.sh: ocr_exit in async mode - immediate termination of the script
2024-02-23T17:39:22.001892372Z Feb 23 17:39:21 ocrd-manager process_images.sh: '/data/testdata-kitodo/images' -> 'ocr-d//data/testdata-kitodo/images'
2024-02-23T17:39:22.001895671Z Feb 23 17:39:21 ocrd-manager process_images.sh: '/data/testdata-kitodo/images/00000009.tif.original.jpg' -> 'ocr-d//data/testdata-kitodo/images/00000009.tif.original.jpg'
2024-02-23T17:39:22.001898677Z Feb 23 17:39:21 ocrd-manager process_images.sh: '/data/testdata-kitodo/images/00000010.tif.original.jpg' -> 'ocr-d//data/testdata-kitodo/images/00000010.tif.original.jpg'
2024-02-23T17:39:22.001901520Z Feb 23 17:39:21 ocrd-manager process_images.sh: '/data/testdata-kitodo/images/00000011.tif.original.jpg' -> 'ocr-d//data/testdata-kitodo/images/00000011.tif.original.jpg'
2024-02-23T17:39:22.001914627Z Feb 23 17:39:21 ocrd-manager process_images.sh: '/data/testdata-kitodo/images/00000012.tif.original.jpg' -> 'ocr-d//data/testdata-kitodo/images/00000012.tif.original.jpg'
2024-02-23T17:39:22.001918256Z Feb 23 17:39:21 ocrd-manager process_images.sh: '/data/testdata-kitodo/images/00000013.tif.original.jpg' -> 'ocr-d//data/testdata-kitodo/images/00000013.tif.original.jpg'
2024-02-23T17:39:22.001921254Z Feb 23 17:39:21 ocrd-manager process_images.sh: '/data/testdata-kitodo/images/00000014.tif.original.jpg' -> 'ocr-d//data/testdata-kitodo/images/00000014.tif.original.jpg'
2024-02-23T17:39:22.001924138Z Feb 23 17:39:21 ocrd-manager process_images.sh: sending incremental file list
2024-02-23T17:39:22.001926858Z Feb 23 17:39:21 ocrd-manager process_images.sh: created directory /data/KitodoJob_91_testdata-kitodo
2024-02-23T17:39:22.001929509Z Feb 23 17:39:21 ocrd-manager process_images.sh: ./
2024-02-23T17:39:22.001932006Z Feb 23 17:39:21 ocrd-manager process_images.sh: ocrd.log
2024-02-23T17:39:22.001934460Z Feb 23 17:39:21 ocrd-manager process_images.sh: images/
2024-02-23T17:39:22.001936996Z Feb 23 17:39:21 ocrd-manager process_images.sh: images/00000009.tif.original.jpg
2024-02-23T17:39:22.001939606Z Feb 23 17:39:21 ocrd-manager process_images.sh: images/00000010.tif.original.jpg
2024-02-23T17:39:22.001943330Z Feb 23 17:39:21 ocrd-manager process_images.sh: images/00000011.tif.original.jpg
2024-02-23T17:39:22.001946124Z Feb 23 17:39:21 ocrd-manager process_images.sh: images/00000012.tif.original.jpg
2024-02-23T17:39:22.001948693Z Feb 23 17:39:21 ocrd-manager process_images.sh: images/00000013.tif.original.jpg
2024-02-23T17:39:22.001951396Z Feb 23 17:39:21 ocrd-manager process_images.sh: images/00000014.tif.original.jpg
2024-02-23T17:39:22.001953868Z Feb 23 17:39:21 ocrd-manager process_images.sh: 
2024-02-23T17:39:22.001956375Z Feb 23 17:39:21 ocrd-manager process_images.sh: sent 2,485,610 bytes  received 221 bytes  4,971,662.00 bytes/sec
2024-02-23T17:39:22.001959075Z Feb 23 17:39:21 ocrd-manager process_images.sh: total size is 2,484,374  speedup is 1.00
2024-02-23T17:39:23.002055787Z Feb 23 17:39:22 ocrd-manager process_images.sh: WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2024-02-23T17:39:23.002082296Z Feb 23 17:39:22 ocrd-manager process_images.sh: 2024-02-23 17:39:22 INFO  KitodoActiveMQClient:76 - Sending of message for taskId='1' was successful
2024-02-23T17:39:23.002086840Z Feb 23 17:39:22 2024-02-23 17: 39:22 INFO  KitodoActiveMQClient:76 - Sending of message for taskId='1' was successful
2024-02-23T17:39:23.002090172Z Feb 23 17:39:22 ocrd-manager process_images.sh: execute 3 commands via SSH by the controller
2024-02-23T17:39:23.002092977Z Feb 23 17:39:22 ocrd-manager process_images.sh: set -Ee#015
2024-02-23T17:39:23.002095618Z Feb 23 17:39:22 ocrd-manager process_images.sh: cd 'KitodoJob_91_testdata-kitodo'#015
2024-02-23T17:39:23.002098283Z Feb 23 17:39:22 ocrd-manager process_images.sh: echo $$ > ocrd.pid#015
2024-02-23T17:39:23.002101086Z Feb 23 17:39:22 ocrd-manager process_images.sh: if test -f mets.xml; then OV=--overwrite; else OV=; ocrd-import -j 1 -i; fi#015
2024-02-23T17:39:23.002117815Z Feb 23 17:39:22 ocrd-manager process_images.sh: ocrd validate tasks $OV --workspace . "tesserocr-recognize -P segmentation_level region -P model frak2021 -I OCR-D-IMG -O OCR-D-OCR" "fileformat-transform -P from-to \"page alto\" -P script-args \"--no-check-border --dummy-word\" -I OCR-D-OCR -O FULLTEXT" #015
2024-02-23T17:39:23.002122169Z Feb 23 17:39:22 ocrd-manager process_images.sh: ocrd process $OV "tesserocr-recognize -P segmentation_level region -P model frak2021 -I OCR-D-IMG -O OCR-D-OCR" "fileformat-transform -P from-to \"page alto\" -P script-args \"--no-check-border --dummy-word\" -I OCR-D-OCR -O FULLTEXT" #015
2024-02-23T17:39:23.002125797Z Feb 23 17:39:22 ocrd-manager process_images.sh: /data$ set -Ee#015
2024-02-23T17:39:23.002128389Z Feb 23 17:39:22 ocrd-manager process_images.sh: /data$ cd 'KitodoJob_91_testdata-kitodo'#015
2024-02-23T17:39:23.002131015Z Feb 23 17:39:22 ocrd-manager process_images.sh: /data/KitodoJob_91_testdata-kitodo$ echo $$ > ocrd.pid#015
2024-02-23T17:39:23.002133778Z Feb 23 17:39:22 ocrd-manager process_images.sh: /data/KitodoJob_91_testdata-kitodo$ #015<n OV=--overwrite; else OV=; ocrd-import -j 1 -i; fi#015
2024-02-23T17:39:34.003322229Z Feb 23 17:39:33 ocrd-manager process_images.sh: 17:39:33.646 INFO ocrd.resolver.workspace_from_nothing - Writing METS to /data/KitodoJob_91_testdata-kitodo/mets.xml#015
2024-02-23T17:39:34.003346640Z Feb 23 17:39:33 ocrd-manager process_images.sh: /data/KitodoJob_91_testdata-kitodo#015
2024-02-23T17:39:41.004753306Z Feb 23 17:39:40 ocrd-manager process_images.sh: 17:39:40.542 INFO ocrd-import - adding -g p0001 -G OCR-D-IMG -m image/jpeg -i f00000009_tif_original 'images/00000009.tif.original.jpg'#015
2024-02-23T17:39:45.005192068Z Feb 23 17:39:44 ocrd-manager process_images.sh: 17:39:44.961 INFO ocrd-import - adding -g p0002 -G OCR-D-IMG -m image/jpeg -i f00000010_tif_original 'images/00000010.tif.original.jpg'#015
2024-02-23T17:39:50.006037044Z Feb 23 17:39:49 ocrd-manager process_images.sh: 17:39:49.257 INFO ocrd-import - adding -g p0003 -G OCR-D-IMG -m image/jpeg -i f00000011_tif_original 'images/00000011.tif.original.jpg'#015
2024-02-23T17:39:54.006668326Z Feb 23 17:39:53 ocrd-manager process_images.sh: 17:39:53.540 INFO ocrd-import - adding -g p0004 -G OCR-D-IMG -m image/jpeg -i f00000012_tif_original 'images/00000012.tif.original.jpg'#015
2024-02-23T17:39:58.007133350Z Feb 23 17:39:57 ocrd-manager process_images.sh: 17:39:57.962 INFO ocrd-import - adding -g p0005 -G OCR-D-IMG -m image/jpeg -i f00000013_tif_original 'images/00000013.tif.original.jpg'#015
2024-02-23T17:40:03.007542176Z Feb 23 17:40:02 ocrd-manager process_images.sh: 17:40:02.234 INFO ocrd-import - adding -g p0006 -G OCR-D-IMG -m image/jpeg -i f00000014_tif_original 'images/00000014.tif.original.jpg'#015
2024-02-23T17:40:05.007814476Z Feb 23 17:40:04 ocrd-manager process_images.sh: 17:40:04.329 WARNING ocrd-import - converting 'ocrd.pid' to 'OCR-D-IMG/ocrd_*.tif' prior to import#015
2024-02-23T17:40:05.007833500Z Feb 23 17:40:04 ocrd-manager process_images.sh: convert-im6.q16: no decode delegate for this image format `PID' @ error/constitute.c/ReadImage/560.#015
2024-02-23T17:40:05.007836843Z Feb 23 17:40:04 ocrd-manager process_images.sh: convert-im6.q16: no images defined `OCR-D-IMG/ocrd_%04d.tif' @ error/convert.c/ConvertImageCommand/3258.#015
2024-02-23T17:40:07.008328248Z Feb 23 17:40:06 ocrd-manager process_images.sh: 17:40:06.557 WARNING ocrd-import - unknown type of file 'ocrd.pid'#015
2024-02-23T17:40:07.008386281Z Feb 23 17:40:06 ocrd-manager process_images.sh: 17:40:06.843 INFO ocrd.cli.workspace.bulk-add - [   1/6] OCR-D-IMG image/jpeg p0001 f00000009_tif_original images/00000009.tif.original.jpg#015
2024-02-23T17:40:07.008399735Z Feb 23 17:40:06 ocrd-manager process_images.sh: 17:40:06.852 INFO ocrd.cli.workspace.bulk-add - [   2/6] OCR-D-IMG image/jpeg p0002 f00000010_tif_original images/00000010.tif.original.jpg#015
2024-02-23T17:40:07.008408520Z Feb 23 17:40:06 ocrd-manager process_images.sh: 17:40:06.853 INFO ocrd.cli.workspace.bulk-add - [   3/6] OCR-D-IMG image/jpeg p0003 f00000011_tif_original images/00000011.tif.original.jpg#015
2024-02-23T17:40:07.008416634Z Feb 23 17:40:06 ocrd-manager process_images.sh: 17:40:06.853 INFO ocrd.cli.workspace.bulk-add - [   4/6] OCR-D-IMG image/jpeg p0004 f00000012_tif_original images/00000012.tif.original.jpg#015
2024-02-23T17:40:07.008424149Z Feb 23 17:40:06 ocrd-manager process_images.sh: 17:40:06.853 INFO ocrd.cli.workspace.bulk-add - [   5/6] OCR-D-IMG image/jpeg p0005 f00000013_tif_original images/00000013.tif.original.jpg#015
2024-02-23T17:40:07.008431502Z Feb 23 17:40:06 ocrd-manager process_images.sh: 17:40:06.854 INFO ocrd.cli.workspace.bulk-add - [   6/6] OCR-D-IMG image/jpeg p0006 f00000014_tif_original images/00000014.tif.original.jpg#015
2024-02-23T17:40:10.008738430Z Feb 23 17:40:09 ocrd-manager process_images.sh: 17:40:09.125 INFO ocrd-import - Success on '.'#015
2024-02-23T17:40:10.008766132Z Feb 23 17:40:09 ocrd-manager process_images.sh: /data/KitodoJob_91_testdata-kitodo$ #015<ck-border --dummy-word\" -I OCR-D-OCR -O FULLTEXT" #015
2024-02-23T17:40:37.012009422Z Feb 23 17:40:23 ocrd-manager process_images.sh: /data/KitodoJob_91_testdata-kitodo$ #015<ck-border --dummy-word\" -I OCR-D-OCR -O FULLTEXT" #015
2024-02-23T17:40:37.012029683Z Feb 23 17:40:36 ocrd-manager process_images.sh: 17:40:36.727 INFO ocrd.task_sequence.run_tasks - Start processing task 'tesserocr-recognize -I OCR-D-IMG -O OCR-D-OCR -p '{"segmentation_level": "region", "model": "frak2021", "dpi": 0, "padding": 0, "textequiv_level": "word", "overwrite_segments": false, "overwrite_text": true, "shrink_polygons": false, "block_polygons": false, "find_tables": true, "find_staves": false, "sparse_text": false, "raw_lines": false, "char_whitelist": "", "char_blacklist": "", "char_unblacklist": "", "tesseract_parameters": {}, "xpath_parameters": {}, "xpath_model": {}, "auto_model": false, "oem": "DEFAULT"}''#015
2024-02-23T17:40:39.012485331Z Feb 23 17:40:38 ocrd-manager process_images.sh: 17:40:38.556 INFO processor.TesserocrRecognize - Using model 'frak2021' in /models/ocrd-resources/ocrd-tesserocr-recognize/ for recognition at the word level#015
2024-02-23T17:40:39.012509370Z Feb 23 17:40:38 ocrd-manager process_images.sh: 17:40:38.606 INFO processor.TesserocrRecognize - INPUT FILE 0 / p0001#015
2024-02-23T17:40:39.012512864Z Feb 23 17:40:38 ocrd-manager process_images.sh: 17:40:38.683 INFO processor.TesserocrRecognize - Page 'p0001' images will use 300 DPI from image meta-data#015
2024-02-23T17:40:39.012515490Z Feb 23 17:40:38 ocrd-manager process_images.sh: 17:40:38.684 INFO processor.TesserocrRecognize - Processing page 'p0001'#015
2024-02-23T17:40:40.012524426Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.552 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-OCR_p0001.IMG-BIN, file_grp: OCR-D-OCR, path: OCR-D-OCR/OCR-D-OCR_p0001.IMG-BIN.png#015
2024-02-23T17:40:40.012542302Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.556 INFO processor.TesserocrRecognize - Detected region 'region0000': 732,859 782,859 782,877 732,877 (CAPTION_TEXT)#015
2024-02-23T17:40:40.012545869Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.609 INFO processor.TesserocrRecognize - Detected line 'region0000_line0000': 732,859 782,859 782,877 732,877#015
2024-02-23T17:40:40.012548271Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.612 INFO processor.TesserocrRecognize - Detected region 'region0001': 466,949 1063,949 1063,1028 466,1028 (CAPTION_TEXT)#015
2024-02-23T17:40:40.012550513Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.612 INFO processor.TesserocrRecognize - Detected line 'region0001_line0000': 466,949 1063,949 1063,1028 466,1028#015
2024-02-23T17:40:40.012552749Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.709 INFO processor.TesserocrRecognize - Detected region 'region0002': 357,1126 1166,1126 1166,1179 357,1179 (FLOWING_TEXT)#015
2024-02-23T17:40:40.012554943Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.711 INFO processor.TesserocrRecognize - Detected line 'region0002_line0000': 357,1126 1166,1126 1166,1179 357,1179#015
2024-02-23T17:40:40.012557066Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.715 INFO processor.TesserocrRecognize - Detected region 'region0003': 494,1564 1039,1564 1039,1768 494,1768 (FLOWING_TEXT)#015
2024-02-23T17:40:40.012559149Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.716 INFO processor.TesserocrRecognize - Detected line 'region0003_line0000': 683,1564 838,1564 838,1607 683,1607#015
2024-02-23T17:40:40.012561314Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.716 INFO processor.TesserocrRecognize - Detected line 'region0003_line0001': 494,1635 1039,1635 1039,1688 494,1688#015
2024-02-23T17:40:40.012563557Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.718 INFO processor.TesserocrRecognize - Detected line 'region0003_line0002': 648,1715 863,1715 863,1768 648,1768#015
2024-02-23T17:40:40.012567962Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.728 INFO processor.TesserocrRecognize - Detected region 'region0004': 254,0 1564,0 1564,2280 254,2280 (PULLOUT_IMAGE)#015
2024-02-23T17:40:40.012570649Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.729 INFO processor.TesserocrRecognize - INPUT FILE 1 / p0002#015
2024-02-23T17:40:40.012573026Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.893 INFO processor.TesserocrRecognize - Page 'p0002' images will use 300 DPI from image meta-data#015
2024-02-23T17:40:40.012575397Z Feb 23 17:40:39 ocrd-manager process_images.sh: 17:40:39.893 INFO processor.TesserocrRecognize - Processing page 'p0002'#015
2024-02-23T17:40:41.012739038Z Feb 23 17:40:40 ocrd-manager process_images.sh: 17:40:40.780 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-OCR_p0002.IMG-BIN, file_grp: OCR-D-OCR, path: OCR-D-OCR/OCR-D-OCR_p0002.IMG-BIN.png#015
2024-02-23T17:40:41.012757514Z Feb 23 17:40:40 ocrd-manager process_images.sh: 17:40:40.782 INFO processor.TesserocrRecognize - Detected region 'region0000': 0,0 122,0 122,2280 0,2280 (FLOWING_IMAGE)#015
2024-02-23T17:40:41.012772160Z Feb 23 17:40:40 ocrd-manager process_images.sh: 17:40:40.810 INFO processor.TesserocrRecognize - Detected region 'region0001': 334,895 1305,895 1305,1164 334,1164 (FLOWING_TEXT)#015
2024-02-23T17:40:41.012776075Z Feb 23 17:40:40 ocrd-manager process_images.sh: 17:40:40.810 INFO processor.TesserocrRecognize - Detected line 'region0001_line0000': 334,895 1305,895 1305,980 334,980#015
2024-02-23T17:40:41.012779078Z Feb 23 17:40:40 ocrd-manager process_images.sh: 17:40:40.911 INFO processor.TesserocrRecognize - Detected line 'region0001_line0001': 335,992 1303,992 1303,1038 335,1038#015
2024-02-23T17:40:41.012782217Z Feb 23 17:40:40 ocrd-manager process_images.sh: 17:40:40.913 INFO processor.TesserocrRecognize - Detected line 'region0001_line0002': 335,1055 1304,1055 1304,1101 335,1101#015
2024-02-23T17:40:41.012785069Z Feb 23 17:40:40 ocrd-manager process_images.sh: 17:40:40.914 INFO processor.TesserocrRecognize - Detected line 'region0001_line0003': 539,1119 1102,1119 1102,1164 539,1164#015
2024-02-23T17:40:41.012787904Z Feb 23 17:40:40 ocrd-manager process_images.sh: 17:40:40.916 INFO processor.TesserocrRecognize - INPUT FILE 2 / p0003#015
2024-02-23T17:40:42.012921896Z Feb 23 17:40:41 ocrd-manager process_images.sh: 17:40:41.040 INFO processor.TesserocrRecognize - Page 'p0003' images will use 300 DPI from image meta-data#015
2024-02-23T17:40:42.012944639Z Feb 23 17:40:41 ocrd-manager process_images.sh: 17:40:41.041 INFO processor.TesserocrRecognize - Processing page 'p0003'#015
2024-02-23T17:40:43.012950988Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.426 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-OCR_p0003.IMG-BIN, file_grp: OCR-D-OCR, path: OCR-D-OCR/OCR-D-OCR_p0003.IMG-BIN.png#015
2024-02-23T17:40:43.012981836Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.428 INFO processor.TesserocrRecognize - Detected region 'region0000': 507,58 581,58 581,88 507,88 (FLOWING_IMAGE)#015
2024-02-23T17:40:43.012986766Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.429 INFO processor.TesserocrRecognize - Detected region 'region0001': 678,267 1242,267 1242,302 678,302 (FLOWING_TEXT)#015
2024-02-23T17:40:43.012989760Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.429 INFO processor.TesserocrRecognize - Detected line 'region0001_line0000': 678,267 1242,267 1242,302 678,302#015
2024-02-23T17:40:43.012992612Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.431 INFO processor.TesserocrRecognize - Detected region 'region0002': 275,304 1243,304 1243,315 275,315 (HORZ_LINE)#015
2024-02-23T17:40:43.012995106Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.431 INFO processor.TesserocrRecognize - Detected region 'region0003': 621,671 889,671 889,724 621,724 (FLOWING_TEXT)#015
2024-02-23T17:40:43.012997686Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.432 INFO processor.TesserocrRecognize - Detected line 'region0003_line0000': 621,671 889,671 889,724 621,724#015
2024-02-23T17:40:43.013000353Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.509 INFO processor.TesserocrRecognize - Detected region 'region0004': 268,798 1244,798 1244,1861 268,1861 (FLOWING_TEXT)#015
2024-02-23T17:40:43.013016480Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.511 INFO processor.TesserocrRecognize - Detected line 'region0004_line0000': 276,798 1244,798 1244,875 276,875#015
2024-02-23T17:40:43.013020511Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.615 INFO processor.TesserocrRecognize - Detected line 'region0004_line0001': 273,892 1241,892 1241,938 273,938#015
2024-02-23T17:40:43.013023129Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.619 INFO processor.TesserocrRecognize - Detected line 'region0004_line0002': 272,954 1241,954 1241,999 272,999#015
2024-02-23T17:40:43.013027151Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.621 INFO processor.TesserocrRecognize - Detected line 'region0004_line0003': 272,1016 1238,1016 1238,1061 272,1061#015
2024-02-23T17:40:43.013029840Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.623 INFO processor.TesserocrRecognize - Detected line 'region0004_line0004': 272,1078 1238,1078 1238,1123 272,1123#015
2024-02-23T17:40:43.013033133Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.624 INFO processor.TesserocrRecognize - Detected line 'region0004_line0005': 272,1140 1237,1140 1237,1185 272,1185#015
2024-02-23T17:40:43.013035616Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.626 INFO processor.TesserocrRecognize - Detected line 'region0004_line0006': 271,1201 1237,1201 1237,1248 271,1248#015
2024-02-23T17:40:43.013038394Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.627 INFO processor.TesserocrRecognize - Detected line 'region0004_line0007': 272,1263 1238,1263 1238,1306 272,1306#015
2024-02-23T17:40:43.013040825Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.629 INFO processor.TesserocrRecognize - Detected line 'region0004_line0008': 271,1324 1236,1324 1236,1373 271,1373#015
2024-02-23T17:40:43.013043382Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.631 INFO processor.TesserocrRecognize - Detected line 'region0004_line0009': 271,1387 1236,1387 1236,1434 271,1434#015
2024-02-23T17:40:43.013046272Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.633 INFO processor.TesserocrRecognize - Detected line 'region0004_line0010': 269,1448 1236,1448 1236,1493 269,1493#015
2024-02-23T17:40:43.013048751Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.635 INFO processor.TesserocrRecognize - Detected line 'region0004_line0011': 270,1511 1235,1511 1235,1557 270,1557#015
2024-02-23T17:40:43.013051506Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.710 INFO processor.TesserocrRecognize - Detected line 'region0004_line0012': 269,1573 1236,1573 1236,1621 269,1621#015
2024-02-23T17:40:43.013053915Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.716 INFO processor.TesserocrRecognize - Detected line 'region0004_line0013': 268,1636 1236,1636 1236,1683 268,1683#015
2024-02-23T17:40:43.013056462Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.724 INFO processor.TesserocrRecognize - Detected line 'region0004_line0014': 268,1697 1236,1697 1236,1743 268,1743#015
2024-02-23T17:40:43.013059302Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.738 INFO processor.TesserocrRecognize - Detected line 'region0004_line0015': 268,1758 1234,1758 1234,1807 268,1807#015
2024-02-23T17:40:43.013061957Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.740 INFO processor.TesserocrRecognize - Detected line 'region0004_line0016': 268,1820 401,1820 401,1861 268,1861#015
2024-02-23T17:40:43.013068061Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.811 INFO processor.TesserocrRecognize - Detected region 'region0005': 1468,0 1564,0 1564,2280 1468,2280 (FLOWING_IMAGE)#015
2024-02-23T17:40:43.013071489Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.814 INFO processor.TesserocrRecognize - INPUT FILE 3 / p0004#015
2024-02-23T17:40:43.013074227Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.908 INFO processor.TesserocrRecognize - Page 'p0004' images will use 300 DPI from image meta-data#015
2024-02-23T17:40:43.013076939Z Feb 23 17:40:42 ocrd-manager process_images.sh: 17:40:42.908 INFO processor.TesserocrRecognize - Processing page 'p0004'#015
2024-02-23T17:40:45.013527572Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.693 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-OCR_p0004.IMG-BIN, file_grp: OCR-D-OCR, path: OCR-D-OCR/OCR-D-OCR_p0004.IMG-BIN.png#015
2024-02-23T17:40:45.013597856Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.696 INFO processor.TesserocrRecognize - Detected region 'region0000': 0,0 154,0 154,2280 0,2280 (FLOWING_IMAGE)#015
2024-02-23T17:40:45.013608608Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.706 INFO processor.TesserocrRecognize - Detected region 'region0001': 333,254 913,254 913,289 333,289 (FLOWING_TEXT)#015
2024-02-23T17:40:45.013617840Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.708 INFO processor.TesserocrRecognize - Detected line 'region0001_line0000': 333,254 913,254 913,289 333,289#015
2024-02-23T17:40:45.013625319Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.810 INFO processor.TesserocrRecognize - Detected region 'region0002': 336,286 1305,286 1305,304 336,304 (HORZ_LINE)#015
2024-02-23T17:40:45.013632199Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.813 INFO processor.TesserocrRecognize - Detected region 'region0003': 334,358 1316,358 1316,1103 334,1103 (FLOWING_TEXT)#015
2024-02-23T17:40:45.013639158Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.813 INFO processor.TesserocrRecognize - Detected line 'region0003_line0000': 416,358 1307,358 1307,411 416,411#015
2024-02-23T17:40:45.013646455Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.815 INFO processor.TesserocrRecognize - Detected line 'region0003_line0001': 335,421 1309,421 1309,477 335,477#015
2024-02-23T17:40:45.013653297Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.817 INFO processor.TesserocrRecognize - Detected line 'region0003_line0002': 334,485 1309,485 1309,539 334,539#015
2024-02-23T17:40:45.013659974Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.818 INFO processor.TesserocrRecognize - Detected line 'region0003_line0003': 337,548 1310,548 1310,601 337,601#015
2024-02-23T17:40:45.013666774Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.820 INFO processor.TesserocrRecognize - Detected line 'region0003_line0004': 337,612 1312,612 1312,662 337,662#015
2024-02-23T17:40:45.013676602Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.821 INFO processor.TesserocrRecognize - Detected line 'region0003_line0005': 337,672 1312,672 1312,721 337,721#015
2024-02-23T17:40:45.013684516Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.823 INFO processor.TesserocrRecognize - Detected line 'region0003_line0006': 338,735 1312,735 1312,790 338,790#015
2024-02-23T17:40:45.013727261Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.825 INFO processor.TesserocrRecognize - Detected line 'region0003_line0007': 340,798 1314,798 1314,853 340,853#015
2024-02-23T17:40:45.013736629Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.827 INFO processor.TesserocrRecognize - Detected line 'region0003_line0008': 340,858 1313,858 1313,916 340,916#015
2024-02-23T17:40:45.013744291Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.829 INFO processor.TesserocrRecognize - Detected line 'region0003_line0009': 341,924 1314,924 1314,971 341,971#015
2024-02-23T17:40:45.013751883Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.830 INFO processor.TesserocrRecognize - Detected line 'region0003_line0010': 342,986 1316,986 1316,1041 342,1041#015
2024-02-23T17:40:45.013759317Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.832 INFO processor.TesserocrRecognize - Detected line 'region0003_line0011': 345,1051 1252,1051 1252,1103 345,1103#015
2024-02-23T17:40:45.013768181Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.835 INFO processor.TesserocrRecognize - Detected region 'region0004': 347,1121 1326,1121 1326,1862 347,1862 (FLOWING_TEXT)#015
2024-02-23T17:40:45.013778933Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.835 INFO processor.TesserocrRecognize - Detected line 'region0004_line0000': 429,1121 1316,1121 1316,1176 429,1176#015
2024-02-23T17:40:45.013786523Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.911 INFO processor.TesserocrRecognize - Detected line 'region0004_line0001': 347,1184 1316,1184 1316,1241 347,1241#015
2024-02-23T17:40:45.013794362Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.913 INFO processor.TesserocrRecognize - Detected line 'region0004_line0002': 347,1248 1318,1248 1318,1303 347,1303#015
2024-02-23T17:40:45.013801771Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.915 INFO processor.TesserocrRecognize - Detected line 'region0004_line0003': 347,1305 1320,1305 1320,1363 347,1363#015
2024-02-23T17:40:45.013812482Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.917 INFO processor.TesserocrRecognize - Detected line 'region0004_line0004': 348,1375 1319,1375 1319,1428 348,1428#015
2024-02-23T17:40:45.013824992Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.919 INFO processor.TesserocrRecognize - Detected line 'region0004_line0005': 350,1434 1321,1434 1321,1491 350,1491#015
2024-02-23T17:40:45.013834200Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.920 INFO processor.TesserocrRecognize - Detected line 'region0004_line0006': 350,1498 1322,1498 1322,1553 350,1553#015
2024-02-23T17:40:45.013844309Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.922 INFO processor.TesserocrRecognize - Detected line 'region0004_line0007': 351,1561 1322,1561 1322,1616 351,1616#015
2024-02-23T17:40:45.013855983Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.924 INFO processor.TesserocrRecognize - Detected line 'region0004_line0008': 352,1623 1324,1623 1324,1673 352,1673#015
2024-02-23T17:40:45.013863636Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.926 INFO processor.TesserocrRecognize - Detected line 'region0004_line0009': 352,1685 1323,1685 1323,1736 352,1736#015
2024-02-23T17:40:45.013871353Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.928 INFO processor.TesserocrRecognize - Detected line 'region0004_line0010': 352,1747 1325,1747 1325,1798 352,1798#015
2024-02-23T17:40:45.013893445Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.930 INFO processor.TesserocrRecognize - Detected line 'region0004_line0011': 354,1810 1326,1810 1326,1862 354,1862#015
2024-02-23T17:40:45.013902608Z Feb 23 17:40:44 ocrd-manager process_images.sh: 17:40:44.935 INFO processor.TesserocrRecognize - INPUT FILE 4 / p0005#015
2024-02-23T17:40:46.013598583Z Feb 23 17:40:45 ocrd-manager process_images.sh: 17:40:45.109 INFO processor.TesserocrRecognize - Page 'p0005' images will use 300 DPI from image meta-data#015
2024-02-23T17:40:46.013619323Z Feb 23 17:40:45 ocrd-manager process_images.sh: 17:40:45.109 INFO processor.TesserocrRecognize - Processing page 'p0005'#015
2024-02-23T17:40:47.013695887Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.395 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-OCR_p0005.IMG-BIN, file_grp: OCR-D-OCR, path: OCR-D-OCR/OCR-D-OCR_p0005.IMG-BIN.png#015
2024-02-23T17:40:47.013723303Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.396 INFO processor.TesserocrRecognize - Detected region 'region0000': 657,290 1234,290 1234,324 657,324 (FLOWING_TEXT)#015
2024-02-23T17:40:47.013727050Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.406 INFO processor.TesserocrRecognize - Detected line 'region0000_line0000': 657,290 1234,290 1234,324 657,324#015
2024-02-23T17:40:47.013729857Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.407 INFO processor.TesserocrRecognize - Detected region 'region0001': 269,324 1235,324 1235,338 269,338 (HORZ_LINE)#015
2024-02-23T17:40:47.013732659Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.506 INFO processor.TesserocrRecognize - Detected region 'region0002': 263,396 1234,396 1234,1259 263,1259 (FLOWING_TEXT)#015
2024-02-23T17:40:47.013735126Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.508 INFO processor.TesserocrRecognize - Detected line 'region0002_line0000': 267,396 1234,396 1234,446 267,446#015
2024-02-23T17:40:47.013737653Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.512 INFO processor.TesserocrRecognize - Detected line 'region0002_line0001': 264,459 1234,459 1234,509 264,509#015
2024-02-23T17:40:47.013740472Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.513 INFO processor.TesserocrRecognize - Detected line 'region0002_line0002': 265,521 1231,521 1231,571 265,571#015
2024-02-23T17:40:47.013743477Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.514 INFO processor.TesserocrRecognize - Detected line 'region0002_line0003': 264,583 1232,583 1232,633 264,633#015
2024-02-23T17:40:47.013745955Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.516 INFO processor.TesserocrRecognize - Detected line 'region0002_line0004': 264,646 1232,646 1232,693 264,693#015
2024-02-23T17:40:47.013748533Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.517 INFO processor.TesserocrRecognize - Detected line 'region0002_line0005': 264,708 1232,708 1232,755 264,755#015
2024-02-23T17:40:47.013752841Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.519 INFO processor.TesserocrRecognize - Detected line 'region0002_line0006': 263,770 1232,770 1232,820 263,820#015
2024-02-23T17:40:47.013770292Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.521 INFO processor.TesserocrRecognize - Detected line 'region0002_line0007': 263,832 873,832 873,873 263,873#015
2024-02-23T17:40:47.013774253Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.522 INFO processor.TesserocrRecognize - Detected line 'region0002_line0008': 346,904 1229,904 1229,951 346,951#015
2024-02-23T17:40:47.013776728Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.524 INFO processor.TesserocrRecognize - Detected line 'region0002_line0009': 264,965 1228,965 1228,1016 264,1016#015
2024-02-23T17:40:47.013779259Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.525 INFO processor.TesserocrRecognize - Detected line 'region0002_line0010': 264,1027 1226,1027 1226,1083 264,1083#015
2024-02-23T17:40:47.013781824Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.527 INFO processor.TesserocrRecognize - Detected line 'region0002_line0011': 263,1090 1228,1090 1228,1141 263,1141#015
2024-02-23T17:40:47.013784167Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.528 INFO processor.TesserocrRecognize - Detected line 'region0002_line0012': 264,1151 1226,1151 1226,1200 264,1200#015
2024-02-23T17:40:47.013786625Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.530 INFO processor.TesserocrRecognize - Detected line 'region0002_line0013': 265,1212 1058,1212 1058,1259 265,1259#015
2024-02-23T17:40:47.013789009Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.531 INFO processor.TesserocrRecognize - Detected region 'region0003': 748,1333 1176,1333 1176,1384 748,1384 (FLOWING_TEXT)#015
2024-02-23T17:40:47.013791509Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.532 INFO processor.TesserocrRecognize - Detected line 'region0003_line0000': 748,1333 1176,1333 1176,1384 748,1384#015
2024-02-23T17:40:47.013793926Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.533 INFO processor.TesserocrRecognize - Detected region 'region0004': 1478,0 1564,0 1564,2280 1478,2280 (FLOWING_IMAGE)#015
2024-02-23T17:40:47.013796491Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.611 INFO processor.TesserocrRecognize - INPUT FILE 5 / p0006#015
2024-02-23T17:40:47.013799230Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.720 INFO processor.TesserocrRecognize - Page 'p0006' images will use 300 DPI from image meta-data#015
2024-02-23T17:40:47.013802047Z Feb 23 17:40:46 ocrd-manager process_images.sh: 17:40:46.720 INFO processor.TesserocrRecognize - Processing page 'p0006'#015
2024-02-23T17:40:49.014288903Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.013 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-OCR_p0006.IMG-BIN, file_grp: OCR-D-OCR, path: OCR-D-OCR/OCR-D-OCR_p0006.IMG-BIN.png#015
2024-02-23T17:40:49.014352253Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.016 INFO processor.TesserocrRecognize - Detected region 'region0000': 346,275 964,275 964,322 346,322 (FLOWING_TEXT)#015
2024-02-23T17:40:49.014362689Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.016 INFO processor.TesserocrRecognize - Detected line 'region0000_line0000': 346,275 964,275 964,322 346,322#015
2024-02-23T17:40:49.014369898Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.018 INFO processor.TesserocrRecognize - Detected region 'region0001': 344,306 1313,306 1313,328 344,328 (HORZ_LINE)#015
2024-02-23T17:40:49.014403959Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.019 INFO processor.TesserocrRecognize - Detected region 'region0002': 634,451 1030,451 1030,511 634,511 (FLOWING_TEXT)#015
2024-02-23T17:40:49.014412236Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.019 INFO processor.TesserocrRecognize - Detected line 'region0002_line0000': 634,451 1030,451 1030,511 634,511#015
2024-02-23T17:40:49.014418865Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.022 INFO processor.TesserocrRecognize - Detected region 'region0003': 0,0 158,0 158,2280 0,2280 (FLOWING_IMAGE)#015
2024-02-23T17:40:49.014425152Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.023 INFO processor.TesserocrRecognize - Detected region 'region0004': 350,618 480,618 480,646 350,646 (FLOWING_TEXT)#015
2024-02-23T17:40:49.014431469Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.023 INFO processor.TesserocrRecognize - Detected line 'region0004_line0000': 350,618 480,618 480,646 350,646#015
2024-02-23T17:40:49.014437539Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.024 INFO processor.TesserocrRecognize - Detected region 'region0005': 352,630 890,630 890,707 352,707 (HEADING_TEXT)#015
2024-02-23T17:40:49.014443711Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.106 INFO processor.TesserocrRecognize - Detected line 'region0005_line0000': 352,630 890,630 890,707 352,707#015
2024-02-23T17:40:49.014452999Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.108 INFO processor.TesserocrRecognize - Detected region 'region0006': 353,712 701,712 701,795 353,795 (FLOWING_TEXT)#015
2024-02-23T17:40:49.014459511Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.109 INFO processor.TesserocrRecognize - Detected line 'region0006_line0000': 353,712 627,712 627,749 353,749#015
2024-02-23T17:40:49.014465865Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.111 INFO processor.TesserocrRecognize - Detected line 'region0006_line0001': 354,756 701,756 701,795 354,795#015
2024-02-23T17:40:49.014472412Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.209 INFO processor.TesserocrRecognize - Detected region 'region0007': 355,800 762,800 762,837 355,837 (HEADING_TEXT)#015
2024-02-23T17:40:49.014478546Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.211 INFO processor.TesserocrRecognize - Detected line 'region0007_line0000': 355,800 762,800 762,837 355,837#015
2024-02-23T17:40:49.014484692Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.213 INFO processor.TesserocrRecognize - Detected region 'region0008': 355,847 717,847 717,884 355,884 (FLOWING_TEXT)#015
2024-02-23T17:40:49.014490851Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.214 INFO processor.TesserocrRecognize - Detected line 'region0008_line0000': 355,847 717,847 717,884 355,884#015
2024-02-23T17:40:49.014496937Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.216 INFO processor.TesserocrRecognize - Detected region 'region0009': 356,860 1028,860 1028,928 356,928 (PULLOUT_TEXT)#015
2024-02-23T17:40:49.014503181Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.216 INFO processor.TesserocrRecognize - Detected line 'region0009_line0000': 356,860 1028,860 1028,928 356,928#015
2024-02-23T17:40:49.014509055Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.220 INFO processor.TesserocrRecognize - Detected region 'region0010': 357,939 711,939 711,1200 357,1200 (FLOWING_TEXT)#015
2024-02-23T17:40:49.014524775Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.220 INFO processor.TesserocrRecognize - Detected line 'region0010_line0000': 357,939 711,939 711,974 357,974#015
2024-02-23T17:40:49.014531988Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.221 INFO processor.TesserocrRecognize - Detected line 'region0010_line0001': 357,980 701,980 701,1017 357,1017#015
2024-02-23T17:40:49.014539639Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.222 INFO processor.TesserocrRecognize - Detected line 'region0010_line0002': 359,1030 670,1030 670,1065 359,1065#015
2024-02-23T17:40:49.014546520Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.223 INFO processor.TesserocrRecognize - Detected line 'region0010_line0003': 358,1076 626,1076 626,1110 358,1110#015
2024-02-23T17:40:49.014553582Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.224 INFO processor.TesserocrRecognize - Detected line 'region0010_line0004': 360,1118 624,1118 624,1155 360,1155#015
2024-02-23T17:40:49.015124218Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.226 INFO processor.TesserocrRecognize - Detected line 'region0010_line0005': 362,1161 663,1161 663,1200 362,1200#015
2024-02-23T17:40:49.015221514Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.228 INFO processor.TesserocrRecognize - Detected region 'region0011': 362,1205 762,1205 762,1244 362,1244 (HEADING_TEXT)#015
2024-02-23T17:40:49.015233540Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.228 INFO processor.TesserocrRecognize - Detected line 'region0011_line0000': 362,1205 762,1205 762,1244 362,1244#015
2024-02-23T17:40:49.015241804Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.230 INFO processor.TesserocrRecognize - Detected region 'region0012': 363,1254 723,1254 723,1468 363,1468 (FLOWING_TEXT)#015
2024-02-23T17:40:49.015248957Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.231 INFO processor.TesserocrRecognize - Detected line 'region0012_line0000': 363,1254 720,1254 720,1287 363,1287#015
2024-02-23T17:40:49.015255630Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.232 INFO processor.TesserocrRecognize - Detected line 'region0012_line0001': 364,1296 708,1296 708,1335 364,1335#015
2024-02-23T17:40:49.015262127Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.233 INFO processor.TesserocrRecognize - Detected line 'region0012_line0002': 364,1340 720,1340 720,1377 364,1377#015
2024-02-23T17:40:49.015268812Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.234 INFO processor.TesserocrRecognize - Detected line 'region0012_line0003': 364,1389 720,1389 720,1417 364,1417#015
2024-02-23T17:40:49.015275032Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.307 INFO processor.TesserocrRecognize - Detected line 'region0012_line0004': 366,1432 723,1432 723,1468 366,1468#015
2024-02-23T17:40:49.015281370Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.312 INFO processor.TesserocrRecognize - Detected region 'region0013': 367,1447 949,1447 949,1557 367,1557 (HEADING_TEXT)#015
2024-02-23T17:40:49.015287655Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.313 INFO processor.TesserocrRecognize - Detected line 'region0013_line0000': 368,1447 903,1447 903,1513 368,1513#015
2024-02-23T17:40:49.015294050Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.315 INFO processor.TesserocrRecognize - Detected line 'region0013_line0001': 367,1518 949,1518 949,1557 367,1557#015
2024-02-23T17:40:49.015316774Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.317 INFO processor.TesserocrRecognize - Detected region 'region0014': 368,1569 516,1569 516,1596 368,1596 (FLOWING_TEXT)#015
2024-02-23T17:40:49.015328283Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.318 INFO processor.TesserocrRecognize - Detected line 'region0014_line0000': 368,1569 516,1569 516,1596 368,1596#015
2024-02-23T17:40:49.015334864Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.319 INFO processor.TesserocrRecognize - Detected region 'region0015': 479,1604 1027,1604 1027,1689 479,1689 (PULLOUT_TEXT)#015
2024-02-23T17:40:49.015340947Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.319 INFO processor.TesserocrRecognize - Detected line 'region0015_line0000': 514,1604 1026,1604 1026,1640 514,1640#015
2024-02-23T17:40:49.015346918Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.322 INFO processor.TesserocrRecognize - Detected line 'region0015_line0001': 479,1649 1027,1649 1027,1689 479,1689#015
2024-02-23T17:40:49.015353148Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.325 INFO processor.TesserocrRecognize - Detected region 'region0016': 749,1768 965,1768 965,1822 749,1822 (FLOWING_TEXT)#015
2024-02-23T17:40:49.015359290Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.326 INFO processor.TesserocrRecognize - Detected line 'region0016_line0000': 749,1768 965,1768 965,1822 749,1822#015
2024-02-23T17:40:49.015368842Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.795 INFO ocrd.task_sequence.run_tasks - Finished processing task 'tesserocr-recognize -I OCR-D-IMG -O OCR-D-OCR -p '{"segmentation_level": "region", "model": "frak2021", "dpi": 0, "padding": 0, "textequiv_level": "word", "overwrite_segments": false, "overwrite_text": true, "shrink_polygons": false, "block_polygons": false, "find_tables": true, "find_staves": false, "sparse_text": false, "raw_lines": false, "char_whitelist": "", "char_blacklist": "", "char_unblacklist": "", "tesseract_parameters": {}, "xpath_parameters": {}, "xpath_model": {}, "auto_model": false, "oem": "DEFAULT"}''#015
2024-02-23T17:40:49.015384562Z Feb 23 17:40:48 ocrd-manager process_images.sh: 17:40:48.795 INFO ocrd.task_sequence.run_tasks - Start processing task 'fileformat-transform -I OCR-D-OCR -O FULLTEXT -p '{"from-to": "page alto", "script-args": "--no-check-border --dummy-word", "ext": ""}''#015
2024-02-23T17:41:44.021769800Z Feb 23 17:41:43 ocrd-manager process_images.sh: 17:41:43.028 INFO ocrd-fileformat-transform - page --> alto: input file OCR-D-OCR_p0004 (p0004)#015
2024-02-23T17:41:45.021765996Z Feb 23 17:41:44 ocrd-manager process_images.sh: 17:41:44.924 INFO ocrd-fileformat-transform - page --> alto: input file OCR-D-OCR_p0002 (p0002)#015
2024-02-23T17:41:46.021892392Z Feb 23 17:41:45 ocrd-manager process_images.sh: 17:41:45.919 INFO ocrd-fileformat-transform - page --> alto: input file OCR-D-OCR_p0001 (p0001)#015
2024-02-23T17:41:47.022207668Z Feb 23 17:41:46 ocrd-manager process_images.sh: 17:41:46.819 INFO ocrd-fileformat-transform - page --> alto: input file OCR-D-OCR_p0003 (p0003)#015
2024-02-23T17:41:59.023698340Z Feb 23 17:41:58 ocrd-manager process_images.sh: 17:41:58.821 WARNING page-to-alto - PAGE-XML has neither Border nor PrintSpace - PrintSpace will fill the image#015
2024-02-23T17:42:01.023873329Z Feb 23 17:42:00 ocrd-manager process_images.sh: 17:42:00.113 WARNING page-to-alto - PAGE-XML has neither Border nor PrintSpace - PrintSpace will fill the image#015
2024-02-23T17:42:01.023905136Z Feb 23 17:42:00 ocrd-manager process_images.sh: 17:42:00.921 WARNING page-to-alto - PAGE-XML has neither Border nor PrintSpace - PrintSpace will fill the image#015
2024-02-23T17:42:02.023992883Z Feb 23 17:42:01 ocrd-manager process_images.sh: 17:42:01.530 WARNING page-to-alto - PAGE-XML has neither Border nor PrintSpace - PrintSpace will fill the image#015
2024-02-23T17:42:15.025091442Z Feb 23 17:42:14 ocrd-manager process_images.sh: 17:42:14.417 INFO ocrd-fileformat-transform - Successfully executed: ocr-transform page alto OCR-D-OCR/OCR-D-OCR_p0001.xml FULLTEXT/FULLTEXT_p0001.xml -- --no-check-border --dummy-word#015
2024-02-23T17:42:15.025116538Z Feb 23 17:42:14 ocrd-manager process_images.sh: 17:42:14.922 INFO ocrd-fileformat-transform - Successfully executed: ocr-transform page alto OCR-D-OCR/OCR-D-OCR_p0004.xml FULLTEXT/FULLTEXT_p0004.xml -- --no-check-border --dummy-word#015
2024-02-23T17:42:16.025250517Z Feb 23 17:42:16 ocrd-manager process_images.sh: 17:42:16.022 INFO ocrd-fileformat-transform - Successfully executed: ocr-transform page alto OCR-D-OCR/OCR-D-OCR_p0002.xml FULLTEXT/FULLTEXT_p0002.xml -- --no-check-border --dummy-word#015
2024-02-23T17:42:17.025245182Z Feb 23 17:42:16 ocrd-manager process_images.sh: 17:42:16.616 INFO ocrd-fileformat-transform - Successfully executed: ocr-transform page alto OCR-D-OCR/OCR-D-OCR_p0003.xml FULLTEXT/FULLTEXT_p0003.xml -- --no-check-border --dummy-word#015
2024-02-23T17:42:28.027309542Z Feb 23 17:42:27 ocrd-manager process_images.sh: 17:42:27.638 INFO ocrd-fileformat-transform - page --> alto: input file OCR-D-OCR_p0006 (p0006)#015
2024-02-23T17:42:28.027335273Z Feb 23 17:42:27 ocrd-manager process_images.sh: 17:42:27.823 INFO ocrd-fileformat-transform - page --> alto: input file OCR-D-OCR_p0005 (p0005)#015
2024-02-23T17:42:33.027851074Z Feb 23 17:42:32 ocrd-manager process_images.sh: 17:42:32.907 WARNING page-to-alto - PAGE-XML has neither Border nor PrintSpace - PrintSpace will fill the image#015
2024-02-23T17:42:33.027888757Z Feb 23 17:42:32 ocrd-manager process_images.sh: 17:42:32.933 WARNING page-to-alto - PAGE-XML has neither Border nor PrintSpace - PrintSpace will fill the image#015
2024-02-23T17:42:39.028697937Z Feb 23 17:42:38 ocrd-manager process_images.sh: 17:42:38.108 INFO ocrd-fileformat-transform - Successfully executed: ocr-transform page alto OCR-D-OCR/OCR-D-OCR_p0005.xml FULLTEXT/FULLTEXT_p0005.xml -- --no-check-border --dummy-word#015
2024-02-23T17:42:39.028721437Z Feb 23 17:42:38 ocrd-manager process_images.sh: 17:42:38.117 INFO ocrd-fileformat-transform - Successfully executed: ocr-transform page alto OCR-D-OCR/OCR-D-OCR_p0006.xml FULLTEXT/FULLTEXT_p0006.xml -- --no-check-border --dummy-word#015
2024-02-23T17:42:41.029255211Z Feb 23 17:42:40 ocrd-manager process_images.sh: 17:42:40.749 INFO ocrd.cli.workspace.bulk-add - [   1/6] p0001 FULLTEXT_p0001 FULLTEXT/FULLTEXT_p0001.xml#015
2024-02-23T17:42:41.029284932Z Feb 23 17:42:40 ocrd-manager process_images.sh: 17:42:40.750 INFO ocrd.cli.workspace.bulk-add - [   2/6] p0002 FULLTEXT_p0002 FULLTEXT/FULLTEXT_p0002.xml#015
2024-02-23T17:42:41.029288151Z Feb 23 17:42:40 ocrd-manager process_images.sh: 17:42:40.750 INFO ocrd.cli.workspace.bulk-add - [   3/6] p0003 FULLTEXT_p0003 FULLTEXT/FULLTEXT_p0003.xml#015
2024-02-23T17:42:41.029304665Z Feb 23 17:42:40 ocrd-manager process_images.sh: 17:42:40.750 INFO ocrd.cli.workspace.bulk-add - [   4/6] p0004 FULLTEXT_p0004 FULLTEXT/FULLTEXT_p0004.xml#015
2024-02-23T17:42:41.029307567Z Feb 23 17:42:40 ocrd-manager process_images.sh: 17:42:40.751 INFO ocrd.cli.workspace.bulk-add - [   5/6] p0005 FULLTEXT_p0005 FULLTEXT/FULLTEXT_p0005.xml#015
2024-02-23T17:42:41.029310078Z Feb 23 17:42:40 ocrd-manager process_images.sh: 17:42:40.751 INFO ocrd.cli.workspace.bulk-add - [   6/6] p0006 FULLTEXT_p0006 FULLTEXT/FULLTEXT_p0006.xml#015
2024-02-23T17:42:42.030070403Z Feb 23 17:42:41 ocrd-manager process_images.sh: 17:42:41.068 INFO ocrd.task_sequence.run_tasks - Finished processing task 'fileformat-transform -I OCR-D-OCR -O FULLTEXT -p '{"from-to": "page alto", "script-args": "--no-check-border --dummy-word", "ext": ""}''#015
2024-02-23T17:42:42.030173322Z Feb 23 17:42:41 ocrd-manager process_images.sh: 17:42:41.069 INFO ocrd.cli.process - Finished#015
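
One way to see where the time goes in a log like the one above is to look for the largest pauses between consecutive timestamps. The following is a minimal sketch, not part of the project's tooling: the function name `largest_gaps` and the regular expression are assumptions made for illustration, and the three sample lines are abbreviated from the log above (messages shortened, timestamps unchanged). It only handles logs that stay within a single day.

```python
import re
from datetime import datetime

# Matches the processor-level timestamp (HH:MM:SS.mmm) right before the log level.
STAMP = re.compile(r"(\d{2}:\d{2}:\d{2}\.\d{3}) (?:DEBUG|INFO|WARNING|ERROR)")


def largest_gaps(lines, top=3):
    """Return the `top` largest pauses as (seconds, line_before, line_after)."""
    stamped = []
    for line in lines:
        match = STAMP.search(line)
        if match:
            when = datetime.strptime(match.group(1), "%H:%M:%S.%f")
            stamped.append((when, line))
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1], later[1])
        for earlier, later in zip(stamped, stamped[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]


# Abbreviated from the log above; only the message text was shortened.
sample = [
    "17:40:48.795 INFO ocrd.task_sequence.run_tasks - Start processing task 'fileformat-transform ...'",
    "17:41:43.028 INFO ocrd-fileformat-transform - page --> alto: input file OCR-D-OCR_p0004 (p0004)",
    "17:41:44.924 INFO ocrd-fileformat-transform - page --> alto: input file OCR-D-OCR_p0002 (p0002)",
]

for seconds, before, _after in largest_gaps(sample, top=1):
    print(f"{seconds:.3f} s pause after: {before}")
```

On these sample lines the largest pause is the roughly 54 seconds between the start of the fileformat-transform task and its first message.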
@markusweigelt changed the title from "Processing is slow" to "Processing is slow maybe there is some potential for optimizing with simple means" on Feb 23, 2024
@bertsky
Member

bertsky commented Feb 24, 2024

Thanks for spotting!

I dug somewhat deeper (tracing and timing the ocrd-import commands) and found that the ocrd log calls are the real culprit – see OCR-D/core#1194

I will prepare a workaround by piping to a single ocrd log - subprocess in the background...
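
The general idea behind such a workaround can be sketched as follows. This is not the code used in ocrd-import, only an illustration of the pattern: instead of starting a new process for every log message, start one logger process and stream all messages into its stdin. `LOGGER_CMD`, `log_per_call` and `PipedLogger` are hypothetical names, and the child process is a stand-in (a small Python one-liner), not the real `ocrd log` command.

```python
import subprocess
import sys

# Hypothetical stand-in for the real logging command: a child process that
# prefixes every line it reads. Starting an interpreter is the expensive
# part, which is why starting it only once pays off.
LOGGER_CMD = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin:\n    sys.stdout.write('LOG ' + line)",
]


def log_per_call(messages):
    """Slow pattern: spawn a fresh logger process for every single message."""
    chunks = []
    for message in messages:
        result = subprocess.run(
            LOGGER_CMD, input=message + "\n",
            capture_output=True, text=True, check=True,
        )
        chunks.append(result.stdout)
    return "".join(chunks)


class PipedLogger:
    """Workaround pattern: one long-lived logger fed through a pipe."""

    def __init__(self):
        self.proc = subprocess.Popen(
            LOGGER_CMD, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
        )

    def log(self, message):
        # Only writes into the pipe; no new process is started here.
        self.proc.stdin.write(message + "\n")

    def close(self):
        # communicate() flushes and closes stdin, then collects the output.
        output, _ = self.proc.communicate()
        return output


if __name__ == "__main__":
    logger = PipedLogger()
    for text in ["first message", "second message"]:
        logger.log(text)
    print(logger.close(), end="")
```

Both variants produce the same output; the difference is that the second one pays the process start-up cost once rather than once per message. For large volumes of output a real implementation would read the logger's output concurrently or let it write directly to its destination.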

@bertsky
Member

bertsky commented Feb 24, 2024

The current workflow-configuration/ocrd-import contains the necessary workarounds – it became much faster than before (even without parallel jobs).

So this should be fixed by rebuilding from the current workflow-configuration:master (the rebuild must be done on the Controller, though).
