What are you trying to achieve with this parameter?
It is equivalent to executing page.clean_contents() on every page. For a PDF used the way you did, it has no positive effect. It never reduces file size unless it is combined with garbage collection and compression.
Of course what you experienced shouldn't happen either.
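As a rough illustration of the point about combining options, here is a minimal sketch of a save call that pairs `clean` with garbage collection and compression. The helper names `save_kwargs` and `save_compact` are made up for this example; `garbage`, `deflate` and `clean` are keyword arguments of `Document.save()`.

```python
def save_kwargs(clean: bool = True) -> dict:
    """Build Document.save() options; clean is only useful next to garbage + deflate."""
    opts = {"garbage": 4, "deflate": True}  # maximum garbage collection, compress streams
    if clean:
        opts["clean"] = True  # described above as page.clean_contents() on every page
    return opts


def save_compact(src_path: str, dst_path: str, clean: bool = True) -> None:
    """Hypothetical helper: re-save a PDF with the options above."""
    # Imported lazily so save_kwargs() can be used without PyMuPDF installed.
    # Older releases expose the same API under the module name `fitz`.
    import pymupdf

    with pymupdf.open(src_path) as doc:
        doc.save(dst_path, **save_kwargs(clean))
```

Calling `save_compact("in.pdf", "out.pdf", clean=False)` would keep garbage collection and compression while skipping the cleaning step, which may be a usable workaround for files affected by this report.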
It was only a minimal example to reproduce the bug.
I’m using PyMuPDF to preprocess a mix of PDF documents from different sources, quite a lot of files, including merging, splitting, and adding blank pages for parity, to streamline printing. Some are generated by older software, or scanned on older machines with old firmware, and have problems. Among them are PDFs with broken blank pages, which I wrote about a few months ago, or with a broken coordinate system (unclosed transformations, as I understand it). Flawed files come and go: disappearing characters, nulls after EOF. Typically a file works when opened directly but breaks on save, or causes hard-to-address problems in the whole setup. Cleaning helps and currently I have to use it; I wish cleaning would purge them all :). And I’m using the garbage option too.
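For context, the kind of preprocessing described above could look roughly like the sketch below. It is only an assumption about the workflow, not the actual setup: the names `blanks_needed` and `merge_with_parity` are invented here, and each input is padded to an even page count before the next one is appended.

```python
def blanks_needed(page_count: int) -> int:
    """Number of blank pages to append so the document ends on an even page."""
    return page_count % 2


def merge_with_parity(paths: list[str], out_path: str, clean: bool = True) -> None:
    """Hypothetical helper: merge PDFs, padding odd-length ones with a blank page."""
    # Imported lazily so blanks_needed() can be used without PyMuPDF installed.
    # Older releases expose the same API under the module name `fitz`.
    import pymupdf

    merged = pymupdf.open()  # new, empty PDF
    for path in paths:
        with pymupdf.open(path) as src:
            merged.insert_pdf(src)  # append all pages of src
            if blanks_needed(src.page_count):
                last = src[-1].rect  # reuse the size of the last page
                merged.new_page(width=last.width, height=last.height)
    merged.save(out_path, garbage=4, deflate=True, clean=clean)
    merged.close()
```

With duplex printing, padding this way keeps the first page of each source document on a front side.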
Description of the bug
Recently, among the documents to process, I received a document scanned with a Lexmark machine that becomes blank after saving with the “clean” option set to True. This behavior starts with version 1.24.0; 1.23.26 works fine.
How to reproduce the bug
Here is a public document scanned with a Lexmark machine, found on the internet:
https://www.feb.unesp.br/Home/Administracao110/DTAd/Compras/empenhos---19_06.pdf
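A minimal sketch of how the effect can be checked, assuming the document above has been downloaded to a local path. The function names are hypothetical, and "blank" is approximated here as "the page renders as a single flat color".

```python
def blank_page_numbers(doc) -> list[int]:
    """Return the numbers of pages whose rendering is one uniform color."""
    return [page.number for page in doc if page.get_pixmap().is_unicolor]


def compare_clean(path: str) -> tuple[list[int], list[int]]:
    """Hypothetical check: blank pages before and after saving with clean=True."""
    # Imported lazily so blank_page_numbers() can be used on its own.
    # Older releases expose the same API under the module name `fitz`.
    import pymupdf

    with pymupdf.open(path) as doc:
        before = blank_page_numbers(doc)
        data = doc.tobytes(clean=True)  # same options as Document.save(), kept in memory
    with pymupdf.open(stream=data, filetype="pdf") as cleaned:
        after = blank_page_numbers(cleaned)
    return before, after
```

If the behavior described here is present, the second list returned by `compare_clean()` should contain pages that the first one does not.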
PyMuPDF version
1.24.13
Operating system
Windows
Python version
3.12