    Repositories list

    • Vue
      Apache License 2.0
      10214 Updated Oct 18, 2024
    • core

      Public
      Collection of OCR-related python tools and wrappers from @OCR-D
      Python
      Apache License 2.0
      3111811720 Updated Oct 17, 2024
    • ocrd_froc

      Public
      Python
      2771 Updated Oct 16, 2024
    • OCR-D wrapper for ocr-fileformat
      Shell
      Apache License 2.0
      3460 Updated Oct 16, 2024
    • ocrd_all

      Public
      Master repository which includes most other OCR-D repositories as submodules
      Makefile
      MIT License
      1872256 Updated Oct 16, 2024
    • Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
      JavaScript
      MIT License
      23100 Updated Oct 11, 2024
    • Convert PAGE (v. 2019) to ALTO (v. 2.0 - 4.2)
      Python
      Apache License 2.0
      513101 Updated Oct 10, 2024
    • HTML
      Creative Commons Attribution 4.0 International
      724291 Updated Oct 10, 2024
    • Website for OCR-D specs, formats, requirements
      HTML
      2500 Updated Oct 10, 2024
    • Recognize text using Calamari OCR and the OCR-D framework
      Python
      Apache License 2.0
      613173 Updated Oct 7, 2024
    • Run ImageMagick with an OCR-D CLI
      Shell
      Apache License 2.0
      3520 Updated Oct 1, 2024
    • Simple character-based language model using keras
      Python
      Apache License 2.0
      6710 Updated Oct 1, 2024
    • Python
      Apache License 2.0
      2100 Updated Oct 1, 2024
    • assets

      Public
      Test data for testing specs and software in @OCR-D
      Makefile
      95186 Updated Sep 30, 2024
    • Middleware for running Quiver locally
      Python
      0000 Updated Sep 24, 2024
    • Benchmarking OCR-D workflows in Docker
      HTML
      MIT License
      1282 Updated Sep 20, 2024
    • OCR-D-compliant page segmentation
      Python
      MIT License
      1566102 Updated Sep 5, 2024
    • Wrapper for the kraken OCR engine
      Python
      Apache License 2.0
      61131 Updated Aug 30, 2024
    • Run tesseract with the tesserocr bindings with @OCR-D's interfaces
      Python
      MIT License
      1138134 Updated Aug 21, 2024
    • spec

      Public
      Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s)
      Python
      517429 Updated Aug 21, 2024
    • The OCR-D Ground Truth text and structure corpus was created between 2015 and 2017. Since 2017, the corpus has been further curated and supplemented with metadata where appropriate. It consists of PAGE XML files that include annotations of text and structure.
      Creative Commons Attribution Share Alike 4.0 International
      3500 Updated Jul 31, 2024
    • Python
      Creative Commons Zero v1.0 Universal
      1300 Updated Jun 24, 2024
    • The repo gt_structure_5_3 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      0000 Updated Jun 24, 2024
    • The repo gt_structure_5_2 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      0000 Updated Jun 24, 2024
    • The repo gt_structure_5_1 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      0000 Updated Jun 24, 2024
    • The repo gt_structure_4_3 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      0000 Updated Jun 24, 2024
    • The repo gt_structure_4_2 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      1001 Updated Jun 24, 2024
    • The repo gt_structure_4_1 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      0000 Updated Jun 24, 2024
    • The repo gt_structure_3_3 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      0000 Updated Jun 24, 2024
    • The repo gt_structure_1_3 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
      Creative Commons Zero v1.0 Universal
      0000 Updated Jun 24, 2024