Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove MaybeDone from tuple::join #74

Conversation

matheus-consoli
Copy link
Collaborator

@matheus-consoli matheus-consoli commented Nov 14, 2022

Ref #22

Remove the need for MaybeDone on tuple::join by creating for each future three fields: the Future itself; its result; its state.

I had to add a dependency on paste to create the $F_fields.

Bench:

>> critcmp main patch -f tuple::join
group             main                                   patch
-----             ----                                   -----
tuple::join 10    1.00    232.6±9.13ns        ? ?/sec    1.14    266.0±3.09ns        ? ?/sec

Copy link
Owner

@yoshuawuyts yoshuawuyts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thank you so much!

Copy link
Owner

@yoshuawuyts yoshuawuyts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, on second thought: I just noticed the perf regression and I think it might be because we're doing extra copies at the end. Meaning we've added extra machinery, but don't quite reap the benefits we hoped we would.

I've added some in-line comments explaining what's going on. Do you think you could update it to work that way?

Comment on lines 29 to 33
$(
#[pin] $F: $F,
[<$F _out>]: MaybeUninit<$F::Output>,
[<$F _state>]: PollState,
)*
Copy link
Owner

@yoshuawuyts yoshuawuyts Nov 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may want to create named fields here instead. I'm not sure about the exact syntax, but I'm thinking something along these lines?

Suggested change
$(
#[pin] $F: $F,
[<$F _out>]: MaybeUninit<$F::Output>,
[<$F _state>]: PollState,
)*
futures: ($(#[pin]$F,)*),
output: ($(MaybeUninit<$F::Output>,)*),
state: ($(PollState,)*),

Comment on lines 79 to 81
paste! {
Poll::Ready(($( unsafe { this.[<$F _out>].assume_init_read() }),*))
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regression we're seeing is probably because of this. The main performance benefit of #22 is that we prevent an extra copy at the end. Say we have a tuple (A, B), the output would always be (A::Output, B::Output) unless the future is cancelled.

Because we know this we can then store a field in-line of: (MaybeUninit<A::Output>, MaybeUninit<B::Output>), which once we know all fields have been initialized we can just return from the future as-is. This patch unfortunately doesn't quite do that; in order to mark the fields as "initialized" here, it does a move.

That's why I'm suggesting we instead find a way to store futures, output, and state as separate fields of tuples - and find a way to index into them.

*this.[<$F _state>] = PollState::Done;
}
}
all_done &= this.[<$F _state>].is_done();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof haha, I forgot we had this. We should probably initialize a counter set to len: utils::tuple_len!() inside the struct, and then with each completed future count down. That will also set us up to enable #21 to work, since we'll want to stop calling all wakers on each iteration anyway. But that doesn't have to be inside this patch.

@matheus-consoli
Copy link
Collaborator Author

I've reworked this a bit:

  • no paste dep
  • no tuple copy from (MaybeUninit<A>,..) to (A,)

but it got somewhat worst hahaha

I'll try to figure out a better way to impl this later, but let me know if you have any ideas :)

>> critcmp main patch -f tuple::join
group             main                                   patch
-----             ----                                   -----
tuple::join 10    1.00    231.6±0.73ns        ? ?/sec    1.19    275.7±1.31ns        ? ?/sec
for reference, the macro is desugaring `Join3` as follows:
#[pin_project]
#[must_use = "futures do nothing unless you `.await` or poll them"]
#[allow(non_snake_case)]
pub struct Join3<A: Future, B: Future, C: Future> {
    len: u32,
    #[pin] A: A,
    #[pin] B: B,
    #[pin] C: C,
    outputs: (
        MaybeUninit<A::Output>,
        MaybeUninit<B::Output>,
        MaybeUninit<C::Output>,
    ),
    states: PollStates,
}

impl<A: Future, B: Future, C: Future> Future for Join3<A, B, C> {
    type Output = (A::Output, B::Output, C::Output);
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let mut this = self.project();
        if this.states[0].is_pending() {
            if let Poll::Ready(out) = this.A.poll(cx) {
                this.outputs.0 = MaybeUninit::new(out);
                this.states[0] = PollState::Done;
                *this.len -= 1;
            }
        };
        if this.states[1].is_pending() {
            if let Poll::Ready(out) = this.B.poll(cx) {
                this.outputs.1 = MaybeUninit::new(out);
                this.states[1] = PollState::Done;
                *this.len -= 1;
            }
        };
        if this.states[2].is_pending() {
            if let Poll::Ready(out) = this.C.poll(cx) {
                this.outputs.2 = MaybeUninit::new(out);
                this.states[2] = PollState::Done;
                *this.len -= 1;
            }
        };
        if *this.len <= 0 {
            let out = unsafe {
                (this.outputs as *const _ as *const (A::Output, B::Output, C::Output)).read()
            };
            Poll::Ready(out)
        } else {
            Poll::Pending
        }
    }
}

src/future/join/tuple.rs Show resolved Hide resolved
done: bool,
$(#[pin] $F: MaybeDone<$F>,)*
len: u32,
$(#[pin] $F: $F,)*
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to instead do:

Suggested change
$(#[pin] $F: $F,)*
tuple: ($(#[pin] $F: $F,)*),

That way we can move the tuple fields into here basically in-place.

Copy link
Collaborator Author

@matheus-consoli matheus-consoli Nov 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm currently trying to figure out how to do this!

tuple: ($(#[pin] $F: $F,)*) doesn't work because pin_project doesn't support pinning this way, like (#[pin] T, #[pin] S)

play

Copy link
Collaborator Author

@matheus-consoli matheus-consoli Nov 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yoshuawuyts, I got something!

It's on a second branch, here.

I took some inspiration from #84 to use the pinned futures inside a tuple(struct), and it kinda works.

We have some performance gain, but it's kinda negligible 😞 (especially thinking about the macro-heavy impl orientation)

>> critcmp main patch -f tuple::join
group             main                                   patch
-----             ----                                   -----
tuple::join 10    1.03    239.3±1.06ns        ? ?/sec    1.00    231.3±0.79ns        ? ?/sec

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, I like the intermediate struct approach!

The reason why there's a small perf cost might be because you're using ptr::read there which ends up doing an extra copy? Instead I think if it's at all possible it might be worth attempting to perform a direct transmute? This might require adding a #[repr(transparent)] to the Futures struct so that the layout is guaranteed to be the same as the tuples it stores?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I don't think #[repr(transparent)] Futures is possible, transparent only allows one field to have a non-zero size, the Futures have many.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I don't think #[repr(transparent)] Futures is possible, transparent only allows one field to have a non-zero size, the Futures have many.

oof, TIL :')

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oook, it seems to me that we cannot transmute (MaybeUninit<T>, MaybeUninit<U>) to (T, U).

I'm not sure why, but I guess it's the same underlying problem from rust-lang/rust#61956, so casting the tuples appears to be the best option so far

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Transmuting is not what you want here I think ?

let t = (MaybeUninit::<usize>::new(42), MaybeUninit::<u8>::new(12)); // source, can come from anywhere

/* ... */

// Where you wanted to transmute:
let (a, b) = t;
(a.assume_init(), b.assume_init())

This can be macro-ified using $F as the variable name: it won't even shadow the type names because that's two different namespaces (let usize: usize = 4_usize; is valid)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming the t is owned, this should not do any copy anywhere

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ohh, thank you for pointing it out! I'll try that

Copy link
Owner

@yoshuawuyts yoshuawuyts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just approved #96. It seems this PR has merge conflicts anyway; if those can be resolved I think this should be good to go as well.

Either as part of this PR, or as part of a follow-up PR we should switch this implementation over to using tuple_for_each! as well, so the underlying implementation can be shared between all tuples.

len: LEN,
$($F: $F.into_future(),)*
outputs: ($(MaybeUninit::<$F::Output>::uninit(),)*),
states: PollStates::new(LEN as usize),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PollArray is usable here

@matheus-consoli
Copy link
Collaborator Author

I'm closing this PR in favor of #103

I opted to start from a clean new branch instead of dealing with this branch conflicts 😅

Thank you all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants