Add Option to Disable Specific Models (e.g., Formula Detection) for Faster Execution #140

Leopold-iziwork · 2024-10-08T07:36:05Z

Hello,

First of all, thank you for developing PDF Extract Kit. It is a great tool for extracting data from PDFs!

I would like to propose a feature that could improve the performance of the tool in certain use cases. Specifically, it would be useful to add a flag or option that allows users to disable certain models, such as formula detection, during the PDF extraction process.

Problem

In some scenarios, users do not need every model (e.g., formula detection) to run during extraction. Currently, it seems that all models are executed by default, which adds unnecessary runtime whenever a model's output isn't needed.

Proposed Solution

Add a command-line option (or configuration flag) that allows users to selectively enable or disable specific models. For instance:

A flag like --no-formula to skip formula detection.
Alternatively, a general flag system where the user can specify which models they want to run (e.g., --models text,table,figure).
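
To make the idea concrete, here is a rough sketch of how the two flags could be parsed. It is only an illustration: the model names and the parse_models helper are hypothetical and not taken from PDF Extract Kit's current code.

```python
import argparse

# Hypothetical model names, used only for illustration.
ALL_MODELS = ("text", "table", "figure", "formula")


def parse_models(argv=None):
    """Return the list of models to run, based on the proposed (hypothetical) flags."""
    parser = argparse.ArgumentParser(description="Select which models to run.")
    parser.add_argument(
        "--models",
        default=",".join(ALL_MODELS),
        help="Comma-separated list of models to run (default: all).",
    )
    parser.add_argument(
        "--no-formula",
        action="store_true",
        help="Skip formula detection.",
    )
    args = parser.parse_args(argv)

    # Split the comma-separated value and drop empty entries.
    selected = [m.strip() for m in args.models.split(",") if m.strip()]

    # Reject names that are not known models (exits with an error message).
    unknown = [m for m in selected if m not in ALL_MODELS]
    if unknown:
        parser.error(f"unknown model(s): {', '.join(unknown)}")

    # --no-formula removes formula detection from whatever was selected.
    if args.no_formula:
        selected = [m for m in selected if m != "formula"]
    return selected


if __name__ == "__main__":
    print(parse_models())
```

With something like this, --models text,table would run only those two models, and --no-formula would drop formula detection from whichever set is selected. The pipeline itself would then only need to load and call the models in the returned list.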

Expected Benefits

Improved performance: By skipping certain models, the execution time for PDF processing can be reduced significantly in use cases where full model execution isn’t necessary.
Flexibility: Users would have more control over which models to use, tailoring the tool to their specific needs.

Thank you for considering this feature request. It would be a great enhancement for performance optimization in scenarios where only a subset of models is needed.
