This repository has been archived by the owner on Jan 20, 2024. It is now read-only.
[OpenMP] Add OpenMP v6.0 API Routines omp_target_memset() and omp_target_memset_sync() #239
Closed
mjklemm wants to merge 7 commits into ROCm-Developer-Tools:amd-trunk-dev from mjklemm:omp_target_memset
There is a TODO to implement a fast path that uses an on-device kernel instead of the host-based memory fill operation. This may require some additional plumbing to have kernels in libomptarget.so.
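For illustration only, a user-level sketch of what an on-device fill could look like is given below. It is not the planned libomptarget kernel: the function name `kernel_fill_sketch` is hypothetical, and the sketch assumes the pointer is a device address valid on the default device (for example, one obtained from `omp_target_alloc()`). When compiled without OpenMP support, the pragma is ignored and the loop simply runs on the host.

```c
#include <stddef.h>

/* Hypothetical sketch, not the libomptarget fast path: fill num_bytes bytes
 * at dev_ptr with byte_val using an offloaded loop on the default device.
 * Assumption: dev_ptr is a device address for the default device. */
void *kernel_fill_sketch(void *dev_ptr, int byte_val, size_t num_bytes)
{
    unsigned char *p = (unsigned char *)dev_ptr;
    unsigned char v = (unsigned char)byte_val;

    /* Each iteration writes one byte; the runtime distributes iterations
     * across teams and threads on the device. */
    #pragma omp target teams distribute parallel for is_device_ptr(p)
    for (size_t i = 0; i < num_bytes; ++i)
        p[i] = v;

    return dev_ptr;
}
```

The point of such a path would be to avoid staging the fill pattern in host memory and transferring it; whether it is faster in practice depends on the device and the buffer size.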
You can test this locally with the following command:
git-clang-format --diff 816c42d880d9d146c5d7cd6cd48a99324a8e2edf 012952ada089b623a7d361a00f74f787481ee9b0 -- clang/test/Layout/ms-no-unique-address.cpp clang/test/SemaCXX/cxx2a-ms-no-unique-address.cpp libc/src/math/expm1.h libc/src/math/generic/expm1.cpp libc/test/src/math/expm1_test.cpp libc/test/src/math/smoke/expm1_test.cpp llvm/include/llvm/ExecutionEngine/Orc/Debugging/DebugInfoSupport.h llvm/lib/ExecutionEngine/Orc/Debugging/DebugInfoSupport.cpp openmp/libomptarget/test/api/omp_target_memset.c clang/include/clang/Basic/ParsedAttrInfo.h clang/include/clang/Basic/SourceManager.h clang/include/clang/Lex/HeaderSearch.h clang/include/clang/Lex/ModuleMap.h clang/include/clang/Sema/Sema.h clang/lib/AST/ASTContext.cpp clang/lib/AST/Decl.cpp clang/lib/AST/Interp/Boolean.h clang/lib/AST/Interp/ByteCodeExprGen.cpp clang/lib/AST/Interp/Floating.h clang/lib/AST/Interp/Integral.h clang/lib/AST/Interp/Interp.h clang/lib/AST/Interp/InterpBuiltin.cpp clang/lib/AST/Interp/Pointer.h clang/lib/AST/RecordLayoutBuilder.cpp clang/lib/Analysis/ThreadSafety.cpp clang/lib/Basic/SourceManager.cpp clang/lib/Basic/Targets/NVPTX.h clang/lib/CodeGen/Targets/NVPTX.cpp clang/lib/Driver/ToolChains/CommonArgs.cpp clang/lib/Driver/ToolChains/VEToolchain.cpp clang/lib/Format/WhitespaceManager.cpp clang/lib/Frontend/FrontendAction.cpp clang/lib/Interpreter/IncrementalExecutor.cpp clang/lib/Lex/HeaderSearch.cpp clang/lib/Lex/ModuleMap.cpp clang/lib/Lex/PPDirectives.cpp clang/lib/Parse/ParseDeclCXX.cpp clang/lib/Parse/ParseOpenMP.cpp clang/lib/Sema/ParsedAttr.cpp clang/lib/Sema/SemaDeclAttr.cpp clang/lib/Sema/SemaTemplateInstantiateDecl.cpp clang/lib/Serialization/ASTReader.cpp clang/lib/Serialization/ASTWriter.cpp clang/test/AST/Interp/arrays.cpp clang/test/AST/Interp/cxx20.cpp clang/test/Driver/ve-toolchain.c clang/test/Driver/ve-toolchain.cpp clang/test/Preprocessor/has_attribute.cpp clang/test/SemaCXX/sugar-common-types.cpp
clang/unittests/Format/FormatTest.cpp clang/utils/TableGen/ClangAttrEmitter.cpp compiler-rt/lib/scudo/standalone/chunk.h compiler-rt/lib/scudo/standalone/combined.h compiler-rt/lib/scudo/standalone/report.cpp compiler-rt/lib/scudo/standalone/report.h compiler-rt/lib/scudo/standalone/tests/chunk_test.cpp compiler-rt/lib/scudo/standalone/tests/report_test.cpp flang/include/flang/Runtime/descriptor.h flang/lib/Lower/ConvertVariable.cpp flang/lib/Lower/OpenACC.cpp flang/lib/Optimizer/CodeGen/CodeGen.cpp flang/runtime/descriptor.cpp libc/src/__support/FPUtil/FPBits.h libc/src/__support/FPUtil/except_value_utils.h libc/src/math/generic/exp.cpp libc/src/math/generic/expm1f.cpp libc/src/math/generic/log1pf.cpp llvm/lib/Analysis/InstructionSimplify.cpp llvm/lib/Analysis/ScalarEvolution.cpp llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h llvm/lib/CodeGen/MachineSink.cpp llvm/lib/Target/AArch64/AArch64ISelLowering.cpp llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.h llvm/lib/Target/AMDGPU/SIInstrInfo.h llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp llvm/lib/Target/NVPTX/NVPTXUtilities.cpp llvm/lib/Target/NVPTX/NVPTXUtilities.h llvm/lib/Transforms/IPO/AttributorAttributes.cpp llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp llvm/lib/Transforms/Utils/InlineFunction.cpp llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/tools/lli/lli.cpp llvm/tools/llvm-jitlink/llvm-jitlink.cpp mlir/include/mlir/Dialect/Affine/IR/AffineOps.h mlir/lib/Conversion/AffineToStandard/AffineToStandard.cpp mlir/lib/Conversion/SCFToGPU/SCFToGPU.cpp mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp mlir/lib/Dialect/Affine/Analysis/AffineStructures.cpp mlir/lib/Dialect/Affine/Analysis/LoopAnalysis.cpp mlir/lib/Dialect/Affine/IR/AffineOps.cpp mlir/lib/Dialect/Affine/Transforms/PipelineDataTransfer.cpp 
mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp mlir/lib/Dialect/Affine/Utils/LoopUtils.cpp mlir/lib/Dialect/Affine/Utils/Utils.cpp mlir/lib/Target/LLVMIR/DebugImporter.cpp openmp/libomptarget/include/omptarget.h openmp/libomptarget/src/api.cpp openmp/libomptarget/src/private.h openmp/runtime/src/kmp_ftn_os.h llvm/include/llvm/ExecutionEngine/Orc/Debugging/DebuggerSupport.h llvm/include/llvm/ExecutionEngine/Orc/Debugging/DebuggerSupportPlugin.h llvm/include/llvm/ExecutionEngine/Orc/Debugging/PerfSupportPlugin.h llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupport.cpp llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupportPlugin.cpp llvm/lib/ExecutionEngine/Orc/Debugging/PerfSupportPlugin.cpp
View the diff from clang-format here.
diff --git a/clang/include/clang/Basic/SourceManager.h b/clang/include/clang/Basic/SourceManager.h
index 431f97f30525..98d194b1f8b0 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -649,7 +649,7 @@ class SourceManager : public RefCountedBase<SourceManager> {
/// This map allows us to merge ContentCache entries based
/// on their FileEntry*. All ContentCache objects will thus have unique,
/// non-null, FileEntry pointers.
- llvm::DenseMap<const FileEntry*, SrcMgr::ContentCache*> FileInfos;
+ llvm::DenseMap<const FileEntry *, SrcMgr::ContentCache *> FileInfos;
/// True if the ContentCache for files that are overridden by other
/// files, should report the original file name. Defaults to true.
@@ -1680,7 +1680,7 @@ public:
// Iterators over FileInfos.
using fileinfo_iterator =
- llvm::DenseMap<const FileEntry*, SrcMgr::ContentCache*>::const_iterator;
+ llvm::DenseMap<const FileEntry *, SrcMgr::ContentCache *>::const_iterator;
fileinfo_iterator fileinfo_begin() const { return FileInfos.begin(); }
fileinfo_iterator fileinfo_end() const { return FileInfos.end(); }
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index e13524b5f3b3..817c823af3df 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -2183,14 +2183,11 @@ public:
const FunctionProtoType *Old, SourceLocation OldLoc,
const FunctionProtoType *New, SourceLocation NewLoc);
bool handlerCanCatch(QualType HandlerType, QualType ExceptionType);
- bool CheckExceptionSpecSubset(const PartialDiagnostic &DiagID,
- const PartialDiagnostic &NestedDiagID,
- const PartialDiagnostic &NoteID,
- const PartialDiagnostic &NoThrowDiagID,
- const FunctionProtoType *Superset,
- SourceLocation SuperLoc,
- const FunctionProtoType *Subset,
- SourceLocation SubLoc);
+ bool CheckExceptionSpecSubset(
+ const PartialDiagnostic &DiagID, const PartialDiagnostic &NestedDiagID,
+ const PartialDiagnostic &NoteID, const PartialDiagnostic &NoThrowDiagID,
+ const FunctionProtoType *Superset, SourceLocation SuperLoc,
+ const FunctionProtoType *Subset, SourceLocation SubLoc);
bool CheckParamExceptionSpec(const PartialDiagnostic &NestedDiagID,
const PartialDiagnostic &NoteID,
const FunctionProtoType *Target,
@@ -3855,7 +3852,7 @@ public:
bool isObjCWritebackConversion(QualType FromType, QualType ToType,
QualType &ConvertedType);
bool IsBlockPointerConversion(QualType FromType, QualType ToType,
- QualType& ConvertedType);
+ QualType &ConvertedType);
bool FunctionParamTypesAreEqual(const FunctionProtoType *OldType,
const FunctionProtoType *NewType,
unsigned *ArgPos = nullptr,
@@ -4221,8 +4218,7 @@ public:
QualType DestTypeForComplaining = QualType(),
unsigned DiagIDForComplaining = 0);
- Expr *FixOverloadedFunctionReference(Expr *E,
- DeclAccessPair FoundDecl,
+ Expr *FixOverloadedFunctionReference(Expr *E, DeclAccessPair FoundDecl,
FunctionDecl *Fn);
ExprResult FixOverloadedFunctionReference(ExprResult,
DeclAccessPair FoundDecl,
@@ -9222,8 +9218,7 @@ public:
TemplateDeductionResult
DeduceTemplateArguments(FunctionTemplateDecl *FunctionTemplate,
- QualType ToType,
- CXXConversionDecl *&Specialization,
+ QualType ToType, CXXConversionDecl *&Specialization,
sema::TemplateDeductionInfo &Info);
TemplateDeductionResult
diff --git a/clang/lib/Basic/SourceManager.cpp b/clang/lib/Basic/SourceManager.cpp
index a630743e0d73..3ceff54c205c 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -324,8 +324,10 @@ SourceManager::~SourceManager() {
ContentCacheAlloc.Deallocate(MemBufferInfos[i]);
}
}
- for (llvm::DenseMap<const FileEntry*, SrcMgr::ContentCache*>::iterator
- I = FileInfos.begin(), E = FileInfos.end(); I != E; ++I) {
+ for (llvm::DenseMap<const FileEntry *, SrcMgr::ContentCache *>::iterator
+ I = FileInfos.begin(),
+ E = FileInfos.end();
+ I != E; ++I) {
if (I->second) {
I->second->~ContentCache();
ContentCacheAlloc.Deallocate(I->second);
@@ -2344,11 +2346,11 @@ SourceManager::MemoryBufferSizes SourceManager::getMemoryBufferSizes() const {
}
size_t SourceManager::getDataStructureSizes() const {
- size_t size = llvm::capacity_in_bytes(MemBufferInfos)
- + llvm::capacity_in_bytes(LocalSLocEntryTable)
- + llvm::capacity_in_bytes(LoadedSLocEntryTable)
- + llvm::capacity_in_bytes(SLocEntryLoaded)
- + llvm::capacity_in_bytes(FileInfos);
+ size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+ llvm::capacity_in_bytes(LocalSLocEntryTable) +
+ llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+ llvm::capacity_in_bytes(SLocEntryLoaded) +
+ llvm::capacity_in_bytes(FileInfos);
if (OverriddenFilesInfo)
size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Lex/HeaderSearch.cpp b/clang/lib/Lex/HeaderSearch.cpp
index e54a19ebfdbb..a61f2ea4dca4 100644
--- a/clang/lib/Lex/HeaderSearch.cpp
+++ b/clang/lib/Lex/HeaderSearch.cpp
@@ -927,8 +927,8 @@ OptionalFileEntryRef HeaderSearch::LookupFile(
// from a module build. We should treat this as a system header if we're
// building a [system] module.
bool IncluderIsSystemHeader =
- Includer ? getFileInfo(Includer).DirInfo != SrcMgr::C_User :
- BuildSystemModule;
+ Includer ? getFileInfo(Includer).DirInfo != SrcMgr::C_User
+ : BuildSystemModule;
if (OptionalFileEntryRef FE = getFileAndSuggestModule(
TmpDir, IncludeLoc, IncluderAndDir.second, IncluderIsSystemHeader,
RequestingModule, SuggestedModule)) {
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 2d7e69946a39..ddcc290651d3 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -1336,9 +1336,9 @@ static void handleReturnTypestateAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
// FIXME: This check is currently being done in the analysis. It can be
// enabled here only after the parser propagates attributes at
// template specialization definition, not declaration.
- //QualType ReturnType;
+ // QualType ReturnType;
//
- //if (const ParmVarDecl *Param = dyn_cast<ParmVarDecl>(D)) {
+ // if (const ParmVarDecl *Param = dyn_cast<ParmVarDecl>(D)) {
// ReturnType = Param->getType();
//
//} else if (const CXXConstructorDecl *Constructor =
@@ -1350,9 +1350,9 @@ static void handleReturnTypestateAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
// ReturnType = cast<FunctionDecl>(D)->getCallResultType();
//}
//
- //const CXXRecordDecl *RD = ReturnType->getAsCXXRecordDecl();
+ // const CXXRecordDecl *RD = ReturnType->getAsCXXRecordDecl();
//
- //if (!RD || !RD->hasAttr<ConsumableAttr>()) {
+ // if (!RD || !RD->hasAttr<ConsumableAttr>()) {
// S.Diag(Attr.getLoc(), diag::warn_return_state_for_unconsumable_type) <<
// ReturnType.getAsString();
// return;
diff --git a/llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupportPlugin.cpp b/llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupportPlugin.cpp
index 236ba5114130..4ac1b8b5a72b 100644
--- a/llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupportPlugin.cpp
+++ b/llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupportPlugin.cpp
@@ -169,15 +169,13 @@ public:
// Try to parse line data. Consume error on failure.
if (auto Err = LineTable.parse(DebugLineData, &Offset, *DWARFCtx, nullptr,
consumeError)) {
- handleAllErrors(
- std::move(Err),
- [&](ErrorInfoBase &EIB) {
- LLVM_DEBUG({
- dbgs() << "Cannot parse line table for \"" << G.getName() << "\": ";
- EIB.log(dbgs());
- dbgs() << "\n";
- });
+ handleAllErrors(std::move(Err), [&](ErrorInfoBase &EIB) {
+ LLVM_DEBUG({
+ dbgs() << "Cannot parse line table for \"" << G.getName() << "\": ";
+ EIB.log(dbgs());
+ dbgs() << "\n";
});
+ });
} else {
if (!LineTable.Prologue.FileNames.empty())
FileName = *dwarf::toString(LineTable.Prologue.FileNames[0].Name);
If this has landed or will land upstream, we should be able to pick it up with the merge_from_main.sh process that I just completed. I did not see these changes merged. However, a dry run of this patch showed some conflicts. Run get_pr_patches to see them. Let me know if this is expected to be committed upstream soon.
I have checked the conflicts: they occur in dllexports, where an ASO change causes the conflict. I can easily create a workaround patch for that at some point.
Merged via trunk.
This PR adds the newly landed OpenMP API routines omp_target_memset() and omp_target_memset_sync() to fill device memory. This first implementation is based on a slow path that initializes memory on the host and then issues an H2D transfer to fill the memory on the target device. A better solution is being investigated at the moment.
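The slow path described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the code in this PR: `slow_path_memset`, `h2d_copy_fn`, and `host_standin_copy` are hypothetical names. The host-to-device transfer is passed in as a callback so the sketch runs without an offloading runtime; with one, the callback could wrap `omp_target_memcpy(dst, src, length, 0, 0, device_num, omp_get_initial_device())`.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type for the H2D transfer; returns 0 on success. */
typedef int (*h2d_copy_fn)(void *dst, const void *src, size_t length,
                           int device_num);

/* Host-only stand-in for the transfer, used here so the sketch is runnable
 * without a device. It ignores device_num and copies within host memory. */
int host_standin_copy(void *dst, const void *src, size_t length,
                      int device_num)
{
    (void)device_num;
    memcpy(dst, src, length);
    return 0;
}

/* Hypothetical slow-path fill: build the pattern in a host staging buffer,
 * then move it to the target with a single H2D transfer.
 * Returns dev_ptr on success and NULL on allocation or transfer failure. */
void *slow_path_memset(void *dev_ptr, int byte_val, size_t num_bytes,
                       int device_num, h2d_copy_fn copy)
{
    if (dev_ptr == NULL || num_bytes == 0)
        return dev_ptr; /* nothing to fill */

    void *staging = malloc(num_bytes);
    if (staging == NULL)
        return NULL;

    memset(staging, byte_val, num_bytes);                   /* fill on host */
    int rc = copy(dev_ptr, staging, num_bytes, device_num); /* H2D transfer */
    free(staging);

    return rc == 0 ? dev_ptr : NULL;
}
```

A call such as `slow_path_memset(buf, 0xAB, n, 0, host_standin_copy)` fills `n` bytes of `buf`. The cost of this approach is a host allocation and a transfer proportional to the fill size, which is why an on-device kernel is the intended follow-up.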