From 87cfcc3fee01945ed137d1a97d87f23f33cb25e3 Mon Sep 17 00:00:00 2001
From: sfk <18810651050@163.com>
Date: Mon, 2 Sep 2024 20:54:05 +0800
Subject: [PATCH] Hotfix readme 0.7.1 (#529)
* release: release 0.7.1 version (#526)
* Update README_zh-CN.md (#404) (#409)
correct FAQ url
Co-authored-by: sfk <18810651050@163.com>
* add dockerfile (#189)
Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
* Update cla.yml
* Update cla.yml
* feat: add tablemaster with paddleocr to detect and recognize table (#493)
* Update cla.yml
* Update bug_report.yml
* Update README_zh-CN.md (#404)
correct FAQ url
* Update README_zh-CN.md (#404) (#409) (#410)
correct FAQ url
Co-authored-by: sfk <18810651050@163.com>
* Update FAQ_zh_cn.md
add new issue
* Update FAQ_en_us.md
* Update README_Windows_CUDA_Acceleration_zh_CN.md
* Update README_zh-CN.md
* @Thepathakarpit has signed the CLA in opendatalab/MinerU#418
* Update cla.yml
* feat: add tablemaster_paddle (#463)
* Update README_zh-CN.md (#404) (#409)
correct FAQ url
Co-authored-by: sfk <18810651050@163.com>
* add dockerfile (#189)
Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
* Update cla.yml
* Update cla.yml
---------
Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
Co-authored-by: sfk <18810651050@163.com>
Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn>
Co-authored-by: Xiaomeng Zhao
* (para_split_v2): index out of range issue of span_text first char (#396)
Co-authored-by: liukaiwen
* @Matthijz98 has signed the CLA in opendatalab/MinerU#467
* Create download_models.py
* Create requirements-docker.txt
* feat: add tablemaster with paddleocr to detect and recognize table
* @strongerfly has signed the CLA in opendatalab/MinerU#487
* feat: add tablemaster with paddleocr to detect and recognize table
* feat: add tablemaster with paddleocr to detect and recognize table
* feat: add tablemaster with paddleocr to detect and recognize table
* feat: add tablemaster with paddleocr to detect and recognize table
---------
Co-authored-by: Xiaomeng Zhao
Co-authored-by: sfk <18810651050@163.com>
Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn>
Co-authored-by: liukaiwen
* feat: add tablemaster with paddleocr to detect and recognize table (#508)
* Update cla.yml
* Update bug_report.yml
* Update README_zh-CN.md (#404)
correct FAQ url
* Update README_zh-CN.md (#404) (#409) (#410)
correct FAQ url
Co-authored-by: sfk <18810651050@163.com>
* Update FAQ_zh_cn.md
add new issue
* Update FAQ_en_us.md
* Update README_Windows_CUDA_Acceleration_zh_CN.md
* Update README_zh-CN.md
* @Thepathakarpit has signed the CLA in opendatalab/MinerU#418
* Update cla.yml
* feat: add tablemaster_paddle (#463)
* Update README_zh-CN.md (#404) (#409)
correct FAQ url
Co-authored-by: sfk <18810651050@163.com>
* add dockerfile (#189)
Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
* Update cla.yml
* Update cla.yml
---------
Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
Co-authored-by: sfk <18810651050@163.com>
Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn>
Co-authored-by: Xiaomeng Zhao
* (para_split_v2): index out of range issue of span_text first char (#396)
Co-authored-by: liukaiwen
* @Matthijz98 has signed the CLA in opendatalab/MinerU#467
* Create download_models.py
* Create requirements-docker.txt
* feat: add tablemaster with paddleocr to detect and recognize table
* @strongerfly has signed the CLA in opendatalab/MinerU#487
* feat: add tablemaster with paddleocr to detect and recognize table
* feat: add tablemaster with paddleocr to detect and recognize table
* feat: add tablemaster with paddleocr to detect and recognize table
* feat: add tablemaster with paddleocr to detect and recognize table
* Update cla.yml
* Delete .github/workflows/gpu-ci.yml
* Update Huggingface and ModelScope links to organization account
* feat: add tablemaster with paddleocr to detect and recognize table
---------
Co-authored-by: Xiaomeng Zhao
Co-authored-by: sfk <18810651050@163.com>
Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn>
Co-authored-by: liukaiwen
Co-authored-by: yyy <102640628+dt-yy@users.noreply.github.com>
Co-authored-by: wangbinDL
* feat: add tablemaster with paddleocr to detect and recognize table (#511)
* Update cla.yml
* Update bug_report.yml
* Update README_zh-CN.md (#404)
correct FAQ url
* Update README_zh-CN.md (#404) (#409) (#410)
correct FAQ url
Co-authored-by: sfk <18810651050@163.com>
* Update FAQ_zh_cn.md
add new issue
* Update FAQ_en_us.md
* Update README_Windows_CUDA_Acceleration_zh_CN.md
* Update README_zh-CN.md
* @Thepathakarpit has signed the CLA in opendatalab/MinerU#418
* Update cla.yml
* feat: add tablemaster_paddle (#463)
* Update README_zh-CN.md (#404) (#409)
correct FAQ url
Co-authored-by: sfk <18810651050@163.com>
* add dockerfile (#189)
Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
* Update cla.yml
* Update cla.yml
---------
Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
Co-authored-by: sfk <18810651050@163.com>
Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn>
Co-authored-by: Xiaomeng Zhao
* (para_split_v2): index out of range issue of span_text first char (#396)
Co-authored-by: liukaiwen
* @Matthijz98 has signed the CLA in opendatalab/MinerU#467
* Create download_models.py
* Create requirements-docker.txt
* feat: add tablemaster with paddleocr to detect and recognize table
* @strongerfly has signed the CLA in opendatalab/MinerU#487
* feat: add tablemaster with paddleocr to detect and recognize table
* feat: add tablemaster with paddleocr to detect and recognize table
* feat: add tablemaster with paddleocr to detect and recognize table
* feat: add tablemaster with paddleocr to detect and recognize table
* Update cla.yml
* Delete .github/workflows/gpu-ci.yml
* Update Huggingface and ModelScope links to organization account
* feat: add tablemaster with paddleocr to detect and recognize table
* feat: add tablemaster with paddleocr to detect and recognize table
* feat: add tablemaster with paddleocr to detect and recognize table
---------
Co-authored-by: Xiaomeng Zhao
Co-authored-by: sfk <18810651050@163.com>
Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn>
Co-authored-by: liukaiwen
Co-authored-by: yyy <102640628+dt-yy@users.noreply.github.com>
Co-authored-by: wangbinDL
---------
Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
Co-authored-by: sfk <18810651050@163.com>
Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn>
Co-authored-by: Xiaomeng Zhao
Co-authored-by: Kaiwen Liu
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: liukaiwen
Co-authored-by: wangbinDL
* Update README.md
* Update README_zh-CN.md
* Update README_zh-CN.md
---------
Co-authored-by: yyy <102640628+dt-yy@users.noreply.github.com>
Co-authored-by: drunkpig <60862764+drunkpig@users.noreply.github.com>
Co-authored-by: Aoyang Fang <222010547@link.cuhk.edu.cn>
Co-authored-by: Xiaomeng Zhao
Co-authored-by: Kaiwen Liu
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: liukaiwen
Co-authored-by: wangbinDL
---
README.md | 4 +++-
README_zh-CN.md | 6 +++---
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/README.md b/README.md
index 09ff0d75..414ebffc 100644
--- a/README.md
+++ b/README.md
@@ -164,7 +164,9 @@ In non-mainline environments, due to the diversity of hardware and software conf
Recommended Configuration 16G+ VRAM |
3090/3090ti/4070ti super/4080/4090
- 16G or more can enable layout, formula recognition, and OCR acceleration simultaneously |
+ 16G or more can enable layout, formula recognition, and OCR acceleration simultaneously
+ 24G or more can enable layout, formula recognition, OCR acceleration and table recognition simultaneously
+
diff --git a/README_zh-CN.md b/README_zh-CN.md
index 4bc6a06f..77f70a52 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -164,7 +164,9 @@ https://github.com/user-attachments/assets/4bea02c9-6d54-4cd6-97ed-dff14340982c
推荐配置 16G+显存 |
3090/3090ti/4070tisuper/4080/4090
- 16G及以上可以同时开启layout,公式识别和ocr加速 |
+ 16G及以上可以同时开启layout,公式识别和ocr加速
+ 24G及以上可以同时开启layout,公式识别,ocr加速和表格识别
+
@@ -341,11 +343,9 @@ TODO
- 漫画书、艺术图册、小学教材、习题尚不能很好解析
- 在一些公式密集的PDF上强制启用OCR效果会更好
- 如果您要处理包含大量公式的pdf,强烈建议开启OCR功能。使用pymuPDF提取文字的时候会出现文本行互相重叠的情况导致公式插入位置不准确。
-
- **表格识别**目前处于测试阶段,识别速度较慢,识别准确度有待提升。
-
# FAQ