Fix IndexError in para_split_v3.py for empty line handling #916

Closed

Commits on Nov 6, 2024

  1. docs(README): update badges

    myhloli committed Nov 6, 2024
    6b3e142
  2. Merge pull request #887 from myhloli/dev

    docs(README): update badges
    myhloli authored Nov 6, 2024
    54844a5

Commits on Nov 7, 2024

  1. feat(model): add xycut algorithm for block sorting

    - Implement xycut algorithm to sort blocks when layoutreader fails
    - Add recursive_xy_cut function to perform the xycut algorithm
    - Update pdf_parse_union_core_v2.py to use xycut when layoutreader fails
    - Modify draw_bbox.py to handle cases where layoutreader fails to sort blocks
    myhloli committed Nov 7, 2024
    7d5850e
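
For readers unfamiliar with the technique named in this commit, the following is a minimal sketch of a recursive XY-cut ordering. It is not the repository's `recursive_xy_cut`; the function names `xy_cut_order` and `_split_by_gaps`, the `(x0, y0, x1, y1)` box layout, and the choice to try horizontal cuts before vertical ones are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x0, y0, x1, y1)


def _split_by_gaps(indices: Sequence[int], boxes: Sequence[Box],
                   lo: int, hi: int) -> List[List[int]]:
    """Group boxes whose [lo, hi] intervals overlap along one axis.

    A new group starts wherever the projection onto that axis has a gap.
    """
    ordered = sorted(indices, key=lambda i: boxes[i][lo])
    groups: List[List[int]] = []
    current_end = None
    for i in ordered:
        start, end = boxes[i][lo], boxes[i][hi]
        if current_end is None or start >= current_end:
            groups.append([i])          # gap found: open a new group
            current_end = end
        else:
            groups[-1].append(i)        # overlaps the running interval
            current_end = max(current_end, end)
    return groups


def xy_cut_order(boxes: Sequence[Box], indices=None) -> List[int]:
    """Return box indices in reading order (hypothetical helper)."""
    if indices is None:
        indices = list(range(len(boxes)))
    if len(indices) <= 1:
        return list(indices)

    # Try horizontal cuts first (gaps in the y projection), top to bottom.
    rows = _split_by_gaps(indices, boxes, 1, 3)
    if len(rows) > 1:
        return [i for row in rows for i in xy_cut_order(boxes, row)]

    # Otherwise try vertical cuts (gaps in the x projection), left to right.
    cols = _split_by_gaps(indices, boxes, 0, 2)
    if len(cols) > 1:
        return [i for col in cols for i in xy_cut_order(boxes, col)]

    # No clean cut on either axis: fall back to a plain positional sort.
    return sorted(indices, key=lambda i: (boxes[i][1], boxes[i][0]))


# Two staggered columns, given out of order.
boxes = [
    (20, 5, 30, 17),   # 0: right column, top
    (0, 12, 10, 22),   # 1: left column, bottom
    (0, 0, 10, 10),    # 2: left column, top
    (20, 18, 30, 28),  # 3: right column, bottom
]
print(xy_cut_order(boxes))  # [2, 1, 0, 3]
```

In this example no horizontal cut separates the page because the two columns are staggered, so the sketch splits into columns first and then orders each column top to bottom.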
  2. Merge pull request #898 from myhloli/fix-line-over-512

    feat(model): add xycut algorithm for block sorting
    myhloli authored Nov 7, 2024
    2600d32

Commits on Nov 8, 2024

  1. refactor(pdf_parse): adjust line count limit for layoutreader

    - Decrease the maximum line count from 512 to 316 for layoutreader
    myhloli committed Nov 8, 2024
    5468e56
  2. refactor(pdf_parse): adjust line count threshold for layoutreader

    - Lower the line count threshold from 316 to 200 to ensure compatibility
    - This change aims to prevent potential issues with layoutreader's maximum line support
    myhloli committed Nov 8, 2024
    5936684
  3. Merge pull request #902 from myhloli/fix-line-over-512

    refactor(pdf_parse): adjust line count threshold for layoutreader
    myhloli authored Nov 8, 2024
    5f79453
  4. feat: complete en docs

    xu rui committed Nov 8, 2024
    7859c73
  5. feat: add zh_CN docs

    xu rui committed Nov 8, 2024
    91f8cbe
  6. Merge pull request #906 from icecraft/feat/add_en_docs

    Feat/add en docs
    myhloli authored Nov 8, 2024
    784c61a
  7. feat: using next_docs

    xu rui committed Nov 8, 2024
    aa3df5f
  8. Merge pull request #907 from icecraft/feat/using_next_docs

    feat: using next_docs
    myhloli authored Nov 8, 2024
    9581fcd
  9. feat(table): integrate RapidTable model for table recognition

    - Add RapidTable model support for table recognition
    - Update table model configuration and initialization
    - Modify table recognition process to use RapidTable when specified
    - Add RapidTable dependency to setup.py
    myhloli committed Nov 8, 2024
    240fe99
  10. refactor(table): update default table model to Rapid Table

    - Change the default table model from TABLE_MASTER to RAPID_TABLE
    myhloli committed Nov 8, 2024
    e78edb1
  11. Merge pull request #910 from myhloli/dev

    feat(table): integrate RapidTable model for table recognition
    myhloli authored Nov 8, 2024
    74fba47
  12. style(gradio-app): add missing file type in upload

    - Add missing '.jpg' file type to the list of allowed file types for upload
    myhloli committed Nov 8, 2024
    8ea2381
  13. dd8da7b
  14. Merge pull request #911 from myhloli/dev

    fix(gradio-app): add missing file type in upload
    myhloli authored Nov 8, 2024
    8eb699e
  15. refactor(magic_pdf_parse_main): optimize model data handling and JSON output

    - Add orig_model_list parameter to maintain original model data
    - Deep copy model_json and pipe.model_list to preserve data integrity
    - Update json_md_dump function call to include orig_model_list
    - Improve condition check for empty model_json
    myhloli committed Nov 8, 2024
    1fc053d
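
The commit above mentions deep-copying the model data to preserve the original. As a rough illustration of why a deep copy matters there, the sketch below contrasts it with a shallow copy. The dictionary structure and the variable `shallow_list` are hypothetical stand-ins; the real shape of `model_json` is not shown in this commit list.

```python
import copy

# Hypothetical stand-in for model output; the real structure may differ.
model_json = [{"page_no": 0, "layout_dets": [{"bbox": [0, 0, 10, 10]}]}]

orig_model_list = copy.deepcopy(model_json)  # fully independent snapshot
shallow_list = list(model_json)              # new list, same nested dicts

# Later processing mutates the working data in place.
model_json[0]["layout_dets"].append({"bbox": [5, 5, 8, 8]})

print(len(orig_model_list[0]["layout_dets"]))  # 1: snapshot is unchanged
print(len(shallow_list[0]["layout_dets"]))     # 2: shallow copy sees the mutation
```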
  16. Merge pull request #912 from myhloli/dev

    refactor(magic_pdf_parse_main): optimize model data handling and JSON output
    myhloli authored Nov 8, 2024
    1e37e19
  17. Modify the test directory

    DTwz committed Nov 8, 2024
    a09d9e8
  18. Merge pull request #913 from DTwz/dev

    Modify the test directory
    myhloli authored Nov 8, 2024
    b912797
  19. test(table): improve ppTableModel test coverage

    - Update test_image2html to use unittest framework
    - Add more assertions
    myhloli committed Nov 8, 2024
    e65ff19
  20. Merge pull request #914 from myhloli/dev

    test(table): improve ppTableModel test coverage
    myhloli authored Nov 8, 2024
    5e0c9d2
  21. feat(table): add RapidOCR support for RapidTable model

    - Integrate RapidOCR with RapidTable model for table recognition
    - Improve memory management for devices with <= 8GB VRAM
    - Update table recognition process to use RapidOCR for RapidTable
    - Add rapidocr-paddle dependency in setup.py
    myhloli committed Nov 8, 2024
    fe2c2c0
  22. Merge pull request #915 from myhloli/dev

    feat(table): add RapidOCR support for RapidTable model
    myhloli authored Nov 8, 2024
    5a3872b

Commits on Nov 9, 2024

  1. e75076b

Commits on Nov 11, 2024

  1. 7b1984f
  2. f8ac8e1