[CUDA][LIBCLC] Implement RC11 seq_cst for PTX6.0 (intel#12516)
Implement `seq_cst` RC11/ptx6.0 memory consistency for the CUDA backend.

See https://dl.acm.org/doi/pdf/10.1145/3297858.3304043 and https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#memory-consistency-model for full details. Requires sm_70 or above.

With this PR there is now a complete mapping between SYCL memory consistency model capabilities and the official CUDA model, fully exploiting CUDA capabilities when possible on supported arches.

This makes the SYCL-CTS atomic_ref tests fully pass for sm_70 on the CUDA backend.

Fixes intel#11208

Depends on intel#12907

---------

Signed-off-by: JackAKirk <[email protected]>
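To illustrate what the `seq_cst` guarantee rules out, here is a minimal host-side sketch of the classic store-buffering litmus test. It uses ISO C++ `std::atomic` as an analogue only: it is not the SYCL `atomic_ref` API and not the libclc/PTX code in this commit. The function name `count_forbidden_outcomes` is hypothetical.

```cpp
#include <atomic>
#include <thread>

// Store-buffering litmus test (host-side analogue, assumption: std::atomic
// stands in for SYCL atomic_ref). Each thread stores 1 to its own variable,
// then loads the other one. Under sequential consistency at least one thread
// must observe the other's store, so r1 == 0 && r2 == 0 is forbidden.
int count_forbidden_outcomes(int iterations) {
  int forbidden = 0;
  for (int i = 0; i < iterations; ++i) {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
      x.store(1, std::memory_order_seq_cst);
      r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread t2([&] {
      y.store(1, std::memory_order_seq_cst);
      r2 = x.load(std::memory_order_seq_cst);
    });

    // join() synchronizes with thread completion, so reading r1/r2 is safe.
    t1.join();
    t2.join();

    if (r1 == 0 && r2 == 0) {
      ++forbidden;  // would indicate a sequential-consistency violation
    }
  }
  return forbidden;
}
```

Calling `count_forbidden_outcomes(2000)` should return 0 on a conforming implementation. With weaker orderings such as acquire/release, the both-zero outcome is permitted, which is why a separate `seq_cst` mapping matters.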
Showing 6 changed files with 53 additions and 7 deletions.
@@ -81,14 +81,14 @@ if(SYCL_PI_UR_USE_FETCH_CONTENT)
 CACHE PATH "Path to external '${name}' adapter source dir" FORCE)
 endfunction()

-set(UNIFIED_RUNTIME_REPO "https://github.com/oneapi-src/unified-runtime.git")
-# commit 6513abc404979fa109d64500bf899e632d511291
-# Merge: 09be0881 6d586094
+set(UNIFIED_RUNTIME_REPO "https://github.com/oneapi-src/unified-runtime.git")
+# commit 29ee45c4451a682f744146cc9dbeb2617ecdd6b3
+# Merge: db4b0c14 4f5d005a
 # Author: Kenneth Benzie (Benie) <[email protected]>
-# Date: Thu Mar 14 22:38:53 2024 +0000
-# Merge pull request #1410 from kbenzie/benie/cmake-external-adapter-source-dirs
-# [CMake] Support external adapter source dirs
-set(UNIFIED_RUNTIME_TAG 6513abc404979fa109d64500bf899e632d511291)
+# Date: Mon Mar 18 12:14:26 2024 +0000
+# Merge pull request #1291 from JackAKirk/cuda-seq-cst-b
+# [CUDA] Report that devices with cc >= sm_70 support seq_cst
+set(UNIFIED_RUNTIME_TAG 29ee45c4451a682f744146cc9dbeb2617ecdd6b3)

 if(SYCL_PI_UR_OVERRIDE_FETCH_CONTENT_REPO)
 set(UNIFIED_RUNTIME_REPO "${SYCL_PI_UR_OVERRIDE_FETCH_CONTENT_REPO}")