Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix for set_intersection copying from second iterator range #983

Merged

Conversation

danhoeflinger
Copy link
Contributor

Bug Description

As implemented before this PR, set_intersection using oneapi::dpl::execution::par has a bug which results in incorrect semantics for some situations.
For performance reasons, in some situations, we would prefer to reverse the order of arguments in the parallel loop for input iterators. However, this results in copying from the second set of iterators which is against semantics found here:

https://en.cppreference.com/w/cpp/algorithm/set_intersection
..."elements will be copied from [first1, last1) to the output range, preserving order."

In cases where the comparator does not compare ALL of the input type, this can be very important. For instance, a comparator for a key value pair which only considers the key.

Fix Summary

This PR fixes this bug by passing a functor to __set_intersection_construct which selects which element to copy to the output range. This allows us to keep the structure of the reversed iterator arguments without the issue of copying from the wrong input set.

This PR also adds a test which fails to compile without this fix using a key value pair example which only compares the key. It uses a oneapi::dpl::counting_iterator<int> as a "value" for the first set of iterators, a oneapi::dpl::discard_iterator as a "value" for the second set of iterators, and a sequence of integers for the output "values". If the implementation were to attempt to copy from the second set of iterators, a compiler error will occur when attempting to assign the dereferenced oneapi::dpl::discard_iterator to an integer.

@danhoeflinger
Copy link
Contributor Author

I'm investigating the failure for windows, cl, but I'm thus far unable to reproduce it...

@danhoeflinger
Copy link
Contributor Author

danhoeflinger commented Jun 20, 2023

I'm investigating the failure for windows, cl, but I'm thus far unable to reproduce it...

I'm still unable to reproduce the failure in practice. However, examining the code a bit more, I think that our usage of ::new (::std::addressof(*__result)) _Tp(*(__select_element(__first1, __first2))); is generally incorrect when __result can be a zip_iterator.

For a zip_iterator, operator*() returns a rvalue oneapi::dpl::__internal::tuple of references to each zipped element. The addressof of an rvalue should be deleted. Its surprising to me that this new test case works at all in hindsight.

It seems that we need a specialization for zip_iterator which results in a placement ::new uninitialized copy for each element in the tuple, rather than the tuple as a whole. I've prototyped this and I'm looking to add this.

I believe there are other places in the code which will also require similar changes.

In memory_impl.h, we have many similar usages of input iterators with ::std::addressof for uninitialized_copy, uinitialized_move, uninitialized_fill, uninitialized_default_construct, and __op_uninitialized_value_construct.

Also, we have similar usage in __pattern_inplace_merge which will likely need adjustment.

Unless I am missing something, all of these are affected and may be broken for zip_iterators.

@danhoeflinger
Copy link
Contributor Author

I'm still unable to reproduce the failure in practice. However, examining the code a bit more, I think that our usage of ::new (::std::addressof(*__result)) _Tp(*(__select_element(__first1, __first2))); is generally incorrect when __result can be a zip_iterator.

I was missing something in __parallel_set_op where it creates a temporary __par_backend::__buffer of the output value_type to write to, before using __brick_move_destroy to move it to the final result (zip_iterator). This means that this instance of std::addressof isn't invalid because we aren't dealing with a zip_iterator, but rather a buffer of value_type tuples.

I am still investigating the source of the error for windows-2019, cl and tbb.

(Also the general issue of std::addressof is still worth investigating and I will make an issue for it.)

@danhoeflinger danhoeflinger force-pushed the dev/dhoeflin/set_intersection_copy_from_first_range branch 2 times, most recently from 359b770 to 4a05536 Compare September 13, 2023 19:08
@akukanov akukanov added the follow through PRs/issues that should be completed/resolved label Dec 5, 2023
@danhoeflinger danhoeflinger added this to the 2022.6.0 milestone Jan 29, 2024
@danhoeflinger danhoeflinger force-pushed the dev/dhoeflin/set_intersection_copy_from_first_range branch from 4a05536 to 69cf08c Compare April 1, 2024 20:23
Signed-off-by: Dan Hoeflinger <[email protected]>
@danhoeflinger
Copy link
Contributor Author

After rebasing with current main, it seems the failure for windows-2019, cl and tbb is no longer there. I think we can go ahead and re-review the PR and consider merging.

Copy link
Contributor

@mmichel11 mmichel11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have looked through both the patch and testing changes and they LGTM.

@danhoeflinger
Copy link
Contributor Author

@SergeyKopienko (since you have left feedback already) and others:
I plan to merge ~April 16th with the existing approval if there are no objections.

Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Signed-off-by: Dan Hoeflinger <[email protected]>
Copy link
Contributor

@SergeyKopienko SergeyKopienko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@julianmi julianmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@danhoeflinger danhoeflinger merged commit ecb7cd5 into main Apr 12, 2024
20 checks passed
@danhoeflinger danhoeflinger deleted the dev/dhoeflin/set_intersection_copy_from_first_range branch April 12, 2024 15:28
@danhoeflinger
Copy link
Contributor Author

Thanks all for the reviews. With merging with the existing approvals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
follow through PRs/issues that should be completed/resolved
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants