Skip to content
/ pdf-info Public

Reads page count information from PDFs across a filesystem - implemented in Python

Notifications You must be signed in to change notification settings

arn-e/pdf-info

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

10 Commits
 
 
 
 

Repository files navigation

Quick and fun python script to determine of PDF files across a  
file system.  It takes a delimited list of filenames as input  
and accesses a SQL database via ODBC to retrieve the location of  
the files to be scanned.  It could just as easily have been built  
to use an input list of full relative paths, however this information  
was already stored in a database.

It will not run as-is, obviously, since it depends on a specific  
database schema.  The field names have been replaced with generic  
placeholders.

The PyPDF library is used, as well as a more 'manual' approach :  
reading page markers from the plain text of a PDF file.  This proved  
to be necessary since the PyPDF module has trouble with certain PDF  
versions, such as 1.3 and 1.6.

About

Reads page count information from PDFs across a filesystem - implemented in Python

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages