Is your feature request related to a problem? Please describe.
I ported a small utility from Bash to Rust that converts certain files inside zip archives. The original script used 7z to list the files in each archive; if it found a file it could convert, it extracted the archive, converted the files, and compressed them again. After porting the script to Rust using this crate, I noticed that listing the file names takes much longer than it did with 7z. Most of my archives have already been processed and only new archives need conversion, so iterating over the old ones is now considerably slower.
Describe the solution you'd like
I'm not very familiar with the zip file spec, but it seems there is a central directory at the end of the archive that contains all file names (and other metadata). Reading just that last portion of the file should be fast and not too difficult to implement. As far as I can see, file_names is the only function that gives access to the file names. I looked through the code a little, and I think it currently reads through the whole archive and builds a mapping from file names to their binary blobs, which is wasted work most of the time in my case.
I tried iterating over 0..archive.len() and indexing each file to get its name, but this made no noticeable difference in performance.
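To make the central directory idea concrete, here is a minimal sketch that uses only the standard library. It is not how this crate works internally; it only illustrates that the names can be read from the end of the file without touching the compressed data. The function name list_zip_names and the helper names are hypothetical. The sketch assumes a single-disk archive without ZIP64 extensions and without data prepended before the first entry, and it decodes names as lossy UTF-8.

```rust
use std::io::{self, Read, Seek, SeekFrom};

// Signatures and fixed sizes from the ZIP format (all fields little-endian).
const EOCD_SIG: [u8; 4] = [0x50, 0x4b, 0x05, 0x06]; // end of central directory
const CDFH_SIG: [u8; 4] = [0x50, 0x4b, 0x01, 0x02]; // central directory file header
const EOCD_MIN: usize = 22; // EOCD size without the trailing comment
const CDFH_FIXED: usize = 46; // fixed part of a central directory header

fn u16_at(buf: &[u8], pos: usize) -> usize {
    u16::from_le_bytes([buf[pos], buf[pos + 1]]) as usize
}

fn u32_at(buf: &[u8], pos: usize) -> u64 {
    u32::from_le_bytes([buf[pos], buf[pos + 1], buf[pos + 2], buf[pos + 3]]) as u64
}

fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg.to_string())
}

/// Hypothetical helper: lists entry names by reading only the central directory.
/// Assumes no ZIP64, a single disk, and no data prepended to the archive.
pub fn list_zip_names<R: Read + Seek>(reader: &mut R) -> io::Result<Vec<String>> {
    // Read at most the last 22 + 65535 bytes; the EOCD record must lie in there.
    let len = reader.seek(SeekFrom::End(0))?;
    let tail_len = len.min((EOCD_MIN + u16::MAX as usize) as u64);
    reader.seek(SeekFrom::Start(len - tail_len))?;
    let mut tail = vec![0u8; tail_len as usize];
    reader.read_exact(&mut tail)?;
    if tail.len() < EOCD_MIN {
        return Err(invalid("file too short to be a zip archive"));
    }

    // Scan backwards for the EOCD signature.
    let eocd = (0..=tail.len() - EOCD_MIN)
        .rev()
        .find(|&i| tail[i..i + 4] == EOCD_SIG[..])
        .ok_or_else(|| invalid("end of central directory record not found"))?;

    let entries = u16_at(&tail, eocd + 10); // total number of entries
    let cd_size = u32_at(&tail, eocd + 12); // central directory size in bytes
    let cd_offset = u32_at(&tail, eocd + 16); // central directory start offset
    if cd_offset + cd_size > len {
        return Err(invalid("central directory lies outside the file"));
    }

    // Read the whole central directory in one go.
    reader.seek(SeekFrom::Start(cd_offset))?;
    let mut cd = vec![0u8; cd_size as usize];
    reader.read_exact(&mut cd)?;

    let mut names = Vec::with_capacity(entries);
    let mut pos = 0;
    for _ in 0..entries {
        if pos + CDFH_FIXED > cd.len() || cd[pos..pos + 4] != CDFH_SIG[..] {
            return Err(invalid("malformed central directory entry"));
        }
        let name_len = u16_at(&cd, pos + 28);
        let extra_len = u16_at(&cd, pos + 30);
        let comment_len = u16_at(&cd, pos + 32);
        let name_start = pos + CDFH_FIXED;
        let name_end = name_start + name_len;
        if name_end > cd.len() {
            return Err(invalid("entry name runs past the central directory"));
        }
        names.push(String::from_utf8_lossy(&cd[name_start..name_end]).into_owned());
        // Skip the variable-length extra field and comment to reach the next entry.
        pos = name_end + extra_len + comment_len;
    }
    Ok(names)
}
```

Called with an opened std::fs::File, this performs a handful of seeks and two reads regardless of how large the compressed entries are, which is the behavior I was hoping for when only the listing is needed.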
Describe alternatives you've considered
I haven't found another crate that provides just the listing functionality. As I said, 7z implements this, but it is written in C++.