First of all, thank you for developing PDF Extract Kit. It is a great tool for extracting data from PDFs!
I would like to propose a feature that could improve the performance of the tool in certain use cases. Specifically, it would be useful to add a flag or option that allows users to disable certain models, such as formula detection, during the PDF extraction process.
Problem
In some scenarios, users may not need every model to be applied during the extraction process (e.g., formula detection). Currently, it seems that all models are executed by default, which can increase the runtime of the extraction process unnecessarily for cases where certain models aren’t needed.
Proposed Solution
Add a command-line option (or configuration flag) that allows users to selectively enable or disable specific models. For instance:
A flag like --no-formula to skip formula detection.
Alternatively, a general flag system where the user can specify which models they want to run (e.g., --models text,table,figure).
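To make the proposal concrete, here is a minimal sketch of how the two flags could be parsed and combined. It is not the tool's existing interface: the model names in `AVAILABLE_MODELS`, the helper `select_models`, and both flags are hypothetical and only illustrate the idea with Python's standard `argparse` module.

```python
import argparse

# Hypothetical model names taken from the examples in this request;
# the real pipeline stages may be named differently.
AVAILABLE_MODELS = ("text", "table", "figure", "formula")


def parse_models(value):
    """Parse a comma-separated model list and reject unknown names."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in names if name not in AVAILABLE_MODELS]
    if unknown:
        raise argparse.ArgumentTypeError(
            "unknown model(s): " + ", ".join(unknown)
        )
    return names


def build_parser():
    parser = argparse.ArgumentParser(
        description="Sketch of selective model execution (hypothetical flags)"
    )
    parser.add_argument(
        "--models",
        type=parse_models,
        default=list(AVAILABLE_MODELS),
        help="comma-separated list of models to run (default: all)",
    )
    parser.add_argument(
        "--no-formula",
        action="store_true",
        help="skip formula detection",
    )
    return parser


def select_models(argv):
    """Return the models to run, in the order the user requested them."""
    args = build_parser().parse_args(argv)
    # --no-formula takes precedence over an explicit --models entry.
    return [
        name
        for name in args.models
        if not (args.no_formula and name == "formula")
    ]
```

With this sketch, `select_models(["--models", "text,table"])` would run only the text and table models, and `select_models(["--no-formula"])` would run everything except formula detection. The pipeline would then load and execute only the models in the returned list, so skipped models cost neither load time nor inference time.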
Expected Benefits
Improved performance: By skipping certain models, the execution time for PDF processing can be reduced significantly in use cases where full model execution isn’t necessary.
Flexibility: Users would have more control over which models to use, tailoring the tool to their specific needs.
Thank you for considering this feature request. It would be a great enhancement for performance optimization in scenarios where only a subset of models is needed.