Fast listing of file names contained in Zip archive #240

Open
MoreDelay opened this issue Aug 26, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@MoreDelay

Is your feature request related to a problem? Please describe.
I ported a small Bash utility of mine to Rust. It handles file conversion within zip archives. The Bash version used 7z to list the files in each archive; if it found a file it could convert, it extracted the archive, converted the files, and compressed them again. After porting the script to Rust using this crate, I noticed that listing the file names takes a lot longer than it did with 7z. Most of my archives are already converted and only new ones need conversion, so iterating through the old archives is now much slower than before.

Describe the solution you'd like
I'm not very familiar with the zip file spec, but it seems there is a central directory at the end of the archive that contains all file names (and other metadata). Reading just that last portion of the file should be fast and not too difficult to implement. As far as I can see, file_names is the only function that gives access to the file names. I looked through the code a little, and I think it currently has to read through the whole archive and build a mapping from each file name to its binary blob, which is wasted computation most of the time in my case.

I also tried iterating over 0..archive.len() and indexing each file to get its name, but this does not seem to make any difference in performance.
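
To illustrate the kind of tail-only read I have in mind, here is a minimal std-only sketch. It is not this crate's API: `list_names` and the other helper names are hypothetical. It assumes a single-disk archive without Zip64 extensions, and it decodes names as lossy UTF-8 even though older archives may use CP437. The field offsets follow the end of central directory record and central directory file header layouts as I understand them from the zip format.

```rust
use std::io::{self, Read, Seek, SeekFrom};

const EOCD_SIG: [u8; 4] = [0x50, 0x4b, 0x05, 0x06]; // end of central directory
const CDFH_SIG: [u8; 4] = [0x50, 0x4b, 0x01, 0x02]; // central directory file header
const EOCD_MIN: u64 = 22; // fixed part of the EOCD record
const CDFH_MIN: usize = 46; // fixed part of each central directory header
const MAX_COMMENT: u64 = 65_535; // the archive comment length is a u16

fn le16(b: &[u8], at: usize) -> usize {
    u16::from_le_bytes([b[at], b[at + 1]]) as usize
}

fn le32(b: &[u8], at: usize) -> u64 {
    u32::from_le_bytes([b[at], b[at + 1], b[at + 2], b[at + 3]]) as u64
}

fn bad(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}

/// Hypothetical helper: lists entry names by reading only the archive tail
/// and the central directory, never the compressed file data.
pub fn list_names<R: Read + Seek>(r: &mut R) -> io::Result<Vec<String>> {
    let len = r.seek(SeekFrom::End(0))?;
    if len < EOCD_MIN {
        return Err(bad("too short to be a zip archive"));
    }

    // Read at most the last 22 + 65535 bytes, where the EOCD record must be.
    let tail_len = len.min(EOCD_MIN + MAX_COMMENT);
    r.seek(SeekFrom::Start(len - tail_len))?;
    let mut tail = vec![0u8; tail_len as usize];
    r.read_exact(&mut tail)?;

    // Scan backwards for the EOCD signature.
    let eocd = (0..=tail.len() - EOCD_MIN as usize)
        .rev()
        .find(|&i| tail[i..].starts_with(&EOCD_SIG))
        .ok_or_else(|| bad("end of central directory record not found"))?;

    let count = le16(&tail, eocd + 10); // total number of entries
    let cd_size = le32(&tail, eocd + 12); // central directory size in bytes
    let cd_offset = le32(&tail, eocd + 16); // central directory start offset
    if cd_offset + cd_size > len {
        return Err(bad("central directory lies outside the file"));
    }

    // Read the whole central directory in one go.
    r.seek(SeekFrom::Start(cd_offset))?;
    let mut cd = vec![0u8; cd_size as usize];
    r.read_exact(&mut cd)?;

    let mut names = Vec::with_capacity(count);
    let mut pos = 0;
    for _ in 0..count {
        if pos + CDFH_MIN > cd.len() || !cd[pos..].starts_with(&CDFH_SIG) {
            return Err(bad("malformed central directory header"));
        }
        let name_len = le16(&cd, pos + 28);
        let extra_len = le16(&cd, pos + 30);
        let comment_len = le16(&cd, pos + 32);
        let end = pos + CDFH_MIN + name_len;
        if end > cd.len() {
            return Err(bad("truncated file name"));
        }
        names.push(String::from_utf8_lossy(&cd[pos + CDFH_MIN..end]).into_owned());
        // Skip the variable-length extra field and comment to the next header.
        pos = end + extra_len + comment_len;
    }
    Ok(names)
}
```

Calling it with an opened std::fs::File, or a std::io::BufReader around one, would need two seeks and two reads per archive, regardless of how large the compressed entries are.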

Describe alternatives you've considered
I haven't found another crate that provides just the listing functionality. As I said, 7z implements this, but it is written in C++.

@MoreDelay MoreDelay added the enhancement New feature or request label Aug 26, 2024