-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: prototype async API, with demonstrable perf improvements via benchmark #73
Conversation
This reverts commit 1da7060.
# Conflicts: # Cargo.toml # examples/write_dir.rs # src/compression.rs # src/crc32.rs # src/lib.rs # src/read.rs # src/result.rs # src/spec.rs # src/types.rs # src/write.rs
/* } */ | ||
|
||
/* let write_data = ready!(s.ring.poll_write(cx, s.ring.capacity())); */ | ||
/* /\* FIXME: i'm pretty sure this can cause a segfault with the current */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note
use std::{cmp, task::ready}; | ||
|
||
pub trait KnownExpanse { | ||
/* TODO: make this have a parameterized Self::Index type, used e.g. with RangeInclusive or |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note
let fname = file.file_name.as_bytes(); | ||
|
||
if writer.is_write_vectored() { | ||
/* TODO: zero-copy!! */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note
let fname = file.file_name.as_bytes(); | ||
|
||
if writer.is_write_vectored() { | ||
/* TODO: zero-copy!! */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note
let mut name_as_string = name.into(); | ||
// Append a slash to the filename if it does not end with it. | ||
if !name_as_string.ends_with('/') { | ||
/* TODO: ends_with('\\') as well? */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note
}; | ||
|
||
if writer.is_write_vectored() { | ||
/* TODO: zero-copy?? */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note
}; | ||
|
||
if writer.is_write_vectored() { | ||
/* TODO: zero-copy?? */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note
) -> impl Stream<Item = ZipResult<Pin<Box<ZipFile<'_, S, Limiter<S>, Sh>>>>> + '_ { | ||
let len = self.shared.len(); | ||
let s = std::cell::UnsafeCell::new(self); | ||
/* FIXME: make this a stream with a known length! */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note
}; | ||
|
||
if writer.is_write_vectored() { | ||
/* TODO: zero-copy!! */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note
|
||
#[inline] | ||
pub fn is_dir(&self) -> bool { | ||
/* TODO: '\\' too? */ |
Check notice
Code scanning / devskim
A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note
is this still in progress @Pr0methean? Would love to see parallelization in this crate! Thanks for your work :D |
You'll have to ask @cosmicexplorer, since he's the PR author. |
Thanks for the ping! I think #72 is probably worth pursuing first, as it achieves parallelism without the separate effort of an async API (which via tokio includes I will definitely take a look at updating #72, since it makes use of several techniques I initially developed for pex-tool/pex#2175 (which I should also get back to) and which I don't believe have yet been implemented elsewhere. @matthewgapp let me know if synchronous parallel extraction/zipping serves your purposes -- I think an async version of that (which is what this change does) would be very cool, but it requires a lot of extra code, and is slower on all the benchmarks I've tried. If there is some user that's performing a very large number of parallel zip extractions and finds this PR to be more performant than #72, I think it might be worth taking a look at again, but I would at least want to know what performance scenario could possibly produce that, because I wasn't able to figure out how to create such a scenario on my laptop. |
The massive amount of changes required here mean I would prefer not to maintain this myself or force @Pr0methean to do so unless there is a very specific use case for which this achieves superior performance to the synchronous work in #72. In general, I'm thinking that if #72 does its job right, it should be fast enough that we don't need to rely on |
Pull request was closed
@cosmicexplorer sync parallel execution is just fine! We'd be using it in async code so it would have to be Send. |
Ok, great!! Now that I have @Pr0methean reviewing my work I'm thinking a full async API might not be a terrible idea at some point, but I might want to try an mmap-based implementation first. I started playing around with linux-specific splice and copy_file_range and I suspect that would accelerate the split/merge operations used for bulk creation/extraction, but mmap may be more performant for random access operations. In general the performance improvements from here on out probably require using a higher-level and less-general API (as per my research in https://github.com/cosmicexplorer/medusa-zip), and I don't expect they'd replace the standard API of this crate. |
This replaces zip-rs/zip-old#409.