get_pixmap may cause large image size #25

BrentHuang · 2024-03-18T12:41:43Z

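import fitz  # PyMuPDF

doc = fitz.open(pdf_file)
for page in doc:
    # Render the page at the default resolution
    pix = page.get_pixmap()
    img_file = f'{img_file_prefix}-{page.number}.jpg'
    # The output format is taken from the file extension
    pix.save(img_file)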

Will get_pixmap cause the generated JPG image to be too large in the above code? Is there a better way to convert every page in the PDF into a JPG image?

JorjMcKie · 2024-03-18T13:10:51Z

The pixmap size is directly linked to the page size.
It is roughly width * height * 3 bytes for RGB images, with width and height in pixels.
You can reduce this in a number of ways, e.g. by using grayscale instead of RGB (a reduction by a factor of 3), or by downscaling with a DPI value < 96.
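
To make the arithmetic concrete, here is a rough sketch of the estimate described above. The helper name `estimate_pixmap_bytes` is hypothetical and not part of PyMuPDF. It assumes the page size is given in PDF points (1/72 inch), so 72 DPI maps one point to one pixel, and it assumes simple rounding of the pixel dimensions, which may differ slightly from what the library does. The result is the uncompressed in-memory size, not the size of the saved file.

```python
def estimate_pixmap_bytes(width_pt, height_pt, dpi=72, channels=3):
    """Estimate uncompressed pixmap size in bytes (hypothetical helper).

    width_pt, height_pt: page size in PDF points (1/72 inch).
    channels: 3 for RGB, 1 for grayscale (no alpha channel assumed).
    """
    scale = dpi / 72  # assumption: 72 DPI means 1 point -> 1 pixel
    width_px = round(width_pt * scale)
    height_px = round(height_pt * scale)
    return width_px * height_px * channels


# An A4 page is about 595 x 842 points.
a4 = (595, 842)
rgb_72 = estimate_pixmap_bytes(*a4)               # RGB at 72 DPI
gray_72 = estimate_pixmap_bytes(*a4, channels=1)  # grayscale at 72 DPI
rgb_50 = estimate_pixmap_bytes(*a4, dpi=50)       # RGB, downscaled

print(rgb_72, gray_72, rgb_50)
```

Grayscale cuts the estimate to one third, while lowering the DPI shrinks it quadratically, since both width and height scale with the DPI.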

JorjMcKie · 2024-03-18T13:12:42Z

When you save the JPEG you can also influence the quality - please see the documentation.
A lower quality value also reduces the image size.
Also try PNG instead - it may compress better than JPG.

JorjMcKie added the help wanted label Mar 18, 2024
