Optimize merge algorithm for data sizes equal or greater than 4M items
#1933
base: main
5a8ff9e to fedebda
…introduce new function __find_start_point_in Signed-off-by: Sergey Kopienko <[email protected]>
…introduce __parallel_merge_submitter_large for merge of biggest data sizes Signed-off-by: Sergey Kopienko <[email protected]>
142ffa0 to a6164fd
…using __parallel_merge_submitter_large for merge data equal or greater then 4M items Signed-off-by: Sergey Kopienko <[email protected]>
a6164fd to d4721ca
Signed-off-by: Sergey Kopienko <[email protected]>
auto __scratch_acc = __result_and_scratch.template __get_scratch_acc<sycl::access_mode::write>(
    __cgh, __dpl_sycl::__no_init{});

__cgh.parallel_for<_FindSplitPointsKernelOnMidDiagonal>(
@rarutyun Compile error is here:
https://github.com/oneapi-src/oneDPL/actions/runs/11722920053/job/32653481992?pr=1933
D:\a\oneDPL\oneDPL\include\oneapi\dpl\pstl\hetero\dpcpp\parallel_backend_sycl_merge.h(322,64): error: definition with same mangled name '...' as another definition
…fix compile error Signed-off-by: Sergey Kopienko <[email protected]>
_PRINT_INFO_IN_DEBUG_MODE(__exec);

using _FindSplitPointsOnMidDiagonalKernel =
@rarutyun I have fixed the error here. Is this the correct way?
I am using `__kernel_name_generator` here because I need two kernel names: one passed as the template parameter pack, and a second one that I have to create inside.
I haven't yet looked at this in detail, but can't we just pass the `_IdType` to `__kernel_name_generator` directly, and use a single `_find_split_points_kernel_on_mid_diagonal` type?
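To make the naming question above concrete, here is a minimal sketch in plain C++ (no SYCL) of why two kernels need two distinct name types, and how adding the index type to the name pack keeps instantiations apart. All names in it (`MyMergeName`, `find_split_points_kernel_on_mid_diagonal`, `merge_kernel`, `merge_kernel_names`) are hypothetical stand-ins for illustration; it does not reproduce oneDPL's `__kernel_name_generator`.

```cpp
#include <type_traits>

// Hypothetical user-supplied kernel name (stands in for the template parameter pack).
struct MyMergeName;

// Hypothetical tag templates: each wraps the same pack but is itself a distinct type,
// so the two kernels submitted by one algorithm never share a name.
template <typename... Name>
struct find_split_points_kernel_on_mid_diagonal;
template <typename... Name>
struct merge_kernel;

// Assumption: the index type is simply prepended to the name pack, so that
// instantiations with different index types also get different kernel names.
template <typename IdType, typename... Name>
struct merge_kernel_names
{
    using find_split_points = find_split_points_kernel_on_mid_diagonal<IdType, Name...>;
    using merge = merge_kernel<IdType, Name...>;
};

using Names32 = merge_kernel_names<unsigned int, MyMergeName>;
using Names64 = merge_kernel_names<unsigned long long, MyMergeName>;

// The two kernels of one instantiation have different names.
static_assert(!std::is_same<Names32::find_split_points, Names32::merge>::value,
              "kernels of one submitter must not share a name");
// The same kernel instantiated for another index type also has a different name.
static_assert(!std::is_same<Names32::find_split_points, Names64::find_split_points>::value,
              "index type must be part of the kernel name");
```

If two `parallel_for` calls ended up with the same name type, a collision such as the "same mangled name" error quoted above would be the expected symptom.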
In this PR we optimize the `merge` algorithm for data sizes equal to or greater than 4M items.

The main idea is to do two submits:
1. In the first submit we find split points on a subset of the diagonals (the "base" diagonals).
2. In the second submit we find split points on all other diagonals and run the serial merge for each diagonal (as before). But when we search for the split point on the current diagonal, we set up index limits for `rng1` and `rng2`. For these limits we load the split point data of the previous and next "base" diagonals, calculated in step (1).

With this approach we get a good performance gain for the largest data sizes with the `float` and `int` data types.

As an additional benefit, we get a significant performance boost for small and medium data sizes in the `merge_sort` algorithm.
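The two-step scheme described above can be sketched on the host as follows. This is a sequential illustration under stated assumptions, not the PR's SYCL implementation: `Split`, `find_split`, `two_phase_merge`, `chunk` and `base_step` are hypothetical names, and the loops stand in for the two kernel submits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// (items taken from a, items taken from b) on a diagonal d = first + second.
using Split = std::pair<std::size_t, std::size_t>;

// Binary search for the split point on diagonal d, limited to the index
// range between two already known split points `lower` and `upper`.
template <typename T>
Split find_split(const std::vector<T>& a, const std::vector<T>& b, std::size_t d, Split lower, Split upper)
{
    std::size_t lo = std::max(lower.first, d > upper.second ? d - upper.second : 0);
    std::size_t hi = std::min(upper.first, d - lower.second);
    while (lo < hi)
    {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (b[d - mid - 1] < a[mid])
            hi = mid; // a[mid] is not among the first d merged items
        else
            lo = mid + 1; // a[mid] is among them (ties are taken from a first)
    }
    return Split{lo, d - lo};
}

template <typename T>
std::vector<T> two_phase_merge(const std::vector<T>& a, const std::vector<T>& b,
                               std::size_t chunk, std::size_t base_step)
{
    assert(chunk > 0 && base_step > 0);
    const std::size_t n = a.size() + b.size();
    const Split begin{0, 0}, end{a.size(), b.size()};
    const std::size_t chunks = (n + chunk - 1) / chunk;

    // Step 1: split points on every base_step-th diagonal, searched over the full range.
    std::vector<Split> base;
    for (std::size_t k = 0; k < chunks; k += base_step)
        base.push_back(find_split(a, b, k * chunk, begin, end));
    base.push_back(end); // sentinel: the last diagonal takes everything

    // Step 2: every chunk searches only between its neighbouring base diagonals,
    // then merges its own slice serially.
    std::vector<T> out(n);
    for (std::size_t k = 0; k < chunks; ++k)
    {
        const Split prev = base[k / base_step];
        const Split next = base[k / base_step + 1];
        const std::size_t d0 = k * chunk;
        const std::size_t d1 = std::min(n, d0 + chunk);
        const Split s = find_split(a, b, d0, prev, next);
        const Split e = find_split(a, b, d1, prev, next);
        std::merge(a.begin() + s.first, a.begin() + e.first,
                   b.begin() + s.second, b.begin() + e.second, out.begin() + d0);
    }
    return out;
}
```

In this sketch the benefit of step 1 is that each search in step 2 covers at most the distance between two neighbouring base split points instead of the whole of `a` and `b`. Whether that matches where the PR's measured gains actually come from is not something the sketch can show.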