This repository has been archived by the owner on Jan 20, 2024. It is now read-only.

[OpenMP] Add OpenMP v6.0 API Routines omp_target_memset() and omp_target_memset_sync() #239

Closed
wants to merge 7 commits into from


mjklemm
Contributor

@mjklemm mjklemm commented Oct 5, 2023

This PR adds the newly landed OpenMP API routines omp_target_memset() and omp_target_memset_sync() to fill memory on the target device. This first implementation takes a slow path: it initializes a buffer on the host and then issues an H2D transfer to fill the memory on the target device. A better solution is being investigated at the moment.

There is a TODO to implement a fast path that uses an on-device
kernel instead of the host-based memory fill operation. This may
require some additional plumbing to have kernels in libomptarget.so.
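
For readers who want to picture the slow path, here is a minimal sketch of the idea, not the code in this PR. It assumes a hypothetical helper `slow_path_memset()` and a hypothetical transfer hook `copy_fn` whose parameter order mirrors `omp_target_memcpy()`. The `host_copy()` stand-in simply uses `memcpy()` so the sketch runs on a machine without an offload device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical transfer hook (an assumption for illustration). Its
   parameters follow the order used by omp_target_memcpy(); it returns
   0 on success. */
typedef int (*copy_fn)(void *dst, const void *src, size_t length,
                       size_t dst_offset, size_t src_offset,
                       int dst_device, int src_device);

/* Host-only stand-in for the H2D transfer, so the sketch can run
   without a real target device. The device numbers are ignored. */
static int host_copy(void *dst, const void *src, size_t length,
                     size_t dst_offset, size_t src_offset,
                     int dst_device, int src_device) {
  (void)dst_device;
  (void)src_device;
  memcpy((char *)dst + dst_offset, (const char *)src + src_offset, length);
  return 0;
}

/* Slow-path fill (hypothetical helper): build the byte pattern in a
   host staging buffer, then copy the whole buffer to the destination.
   Returns dev_ptr on success and NULL on failure. */
static void *slow_path_memset(void *dev_ptr, int val, size_t count,
                              int device, int host_device, copy_fn copy) {
  assert(copy != NULL);
  if (count == 0)
    return dev_ptr; /* nothing to fill */

  void *staging = malloc(count);
  if (staging == NULL)
    return NULL;

  memset(staging, val, count); /* fill on the host */
  int rc = copy(dev_ptr, staging, count, 0, 0, device, host_device);
  free(staging);
  return rc == 0 ? dev_ptr : NULL;
}
```

Calling `slow_path_memset(buf, 0xAB, n, 1, 0, host_copy)` fills `n` bytes of `buf` with `0xAB`. The cost of this approach is a host allocation and a full-size transfer for every fill, which is what the fast path mentioned in the TODO would avoid.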
@github-actions

github-actions bot commented Oct 5, 2023

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 816c42d880d9d146c5d7cd6cd48a99324a8e2edf 012952ada089b623a7d361a00f74f787481ee9b0 -- clang/test/Layout/ms-no-unique-address.cpp clang/test/SemaCXX/cxx2a-ms-no-unique-address.cpp libc/src/math/expm1.h libc/src/math/generic/expm1.cpp libc/test/src/math/expm1_test.cpp libc/test/src/math/smoke/expm1_test.cpp llvm/include/llvm/ExecutionEngine/Orc/Debugging/DebugInfoSupport.h llvm/lib/ExecutionEngine/Orc/Debugging/DebugInfoSupport.cpp openmp/libomptarget/test/api/omp_target_memset.c clang/include/clang/Basic/ParsedAttrInfo.h clang/include/clang/Basic/SourceManager.h clang/include/clang/Lex/HeaderSearch.h clang/include/clang/Lex/ModuleMap.h clang/include/clang/Sema/Sema.h clang/lib/AST/ASTContext.cpp clang/lib/AST/Decl.cpp clang/lib/AST/Interp/Boolean.h clang/lib/AST/Interp/ByteCodeExprGen.cpp clang/lib/AST/Interp/Floating.h clang/lib/AST/Interp/Integral.h clang/lib/AST/Interp/Interp.h clang/lib/AST/Interp/InterpBuiltin.cpp clang/lib/AST/Interp/Pointer.h clang/lib/AST/RecordLayoutBuilder.cpp clang/lib/Analysis/ThreadSafety.cpp clang/lib/Basic/SourceManager.cpp clang/lib/Basic/Targets/NVPTX.h clang/lib/CodeGen/Targets/NVPTX.cpp clang/lib/Driver/ToolChains/CommonArgs.cpp clang/lib/Driver/ToolChains/VEToolchain.cpp clang/lib/Format/WhitespaceManager.cpp clang/lib/Frontend/FrontendAction.cpp clang/lib/Interpreter/IncrementalExecutor.cpp clang/lib/Lex/HeaderSearch.cpp clang/lib/Lex/ModuleMap.cpp clang/lib/Lex/PPDirectives.cpp clang/lib/Parse/ParseDeclCXX.cpp clang/lib/Parse/ParseOpenMP.cpp clang/lib/Sema/ParsedAttr.cpp clang/lib/Sema/SemaDeclAttr.cpp clang/lib/Sema/SemaTemplateInstantiateDecl.cpp clang/lib/Serialization/ASTReader.cpp clang/lib/Serialization/ASTWriter.cpp clang/test/AST/Interp/arrays.cpp clang/test/AST/Interp/cxx20.cpp clang/test/Driver/ve-toolchain.c clang/test/Driver/ve-toolchain.cpp clang/test/Preprocessor/has_attribute.cpp clang/test/SemaCXX/sugar-common-types.cpp clang/unittests/Format/FormatTest.cpp 
clang/utils/TableGen/ClangAttrEmitter.cpp compiler-rt/lib/scudo/standalone/chunk.h compiler-rt/lib/scudo/standalone/combined.h compiler-rt/lib/scudo/standalone/report.cpp compiler-rt/lib/scudo/standalone/report.h compiler-rt/lib/scudo/standalone/tests/chunk_test.cpp compiler-rt/lib/scudo/standalone/tests/report_test.cpp flang/include/flang/Runtime/descriptor.h flang/lib/Lower/ConvertVariable.cpp flang/lib/Lower/OpenACC.cpp flang/lib/Optimizer/CodeGen/CodeGen.cpp flang/runtime/descriptor.cpp libc/src/__support/FPUtil/FPBits.h libc/src/__support/FPUtil/except_value_utils.h libc/src/math/generic/exp.cpp libc/src/math/generic/expm1f.cpp libc/src/math/generic/log1pf.cpp llvm/lib/Analysis/InstructionSimplify.cpp llvm/lib/Analysis/ScalarEvolution.cpp llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h llvm/lib/CodeGen/MachineSink.cpp llvm/lib/Target/AArch64/AArch64ISelLowering.cpp llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.h llvm/lib/Target/AMDGPU/SIInstrInfo.h llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp llvm/lib/Target/NVPTX/NVPTXUtilities.cpp llvm/lib/Target/NVPTX/NVPTXUtilities.h llvm/lib/Transforms/IPO/AttributorAttributes.cpp llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp llvm/lib/Transforms/Utils/InlineFunction.cpp llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/tools/lli/lli.cpp llvm/tools/llvm-jitlink/llvm-jitlink.cpp mlir/include/mlir/Dialect/Affine/IR/AffineOps.h mlir/lib/Conversion/AffineToStandard/AffineToStandard.cpp mlir/lib/Conversion/SCFToGPU/SCFToGPU.cpp mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp mlir/lib/Dialect/Affine/Analysis/AffineStructures.cpp mlir/lib/Dialect/Affine/Analysis/LoopAnalysis.cpp mlir/lib/Dialect/Affine/IR/AffineOps.cpp mlir/lib/Dialect/Affine/Transforms/PipelineDataTransfer.cpp mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp 
mlir/lib/Dialect/Affine/Utils/LoopUtils.cpp mlir/lib/Dialect/Affine/Utils/Utils.cpp mlir/lib/Target/LLVMIR/DebugImporter.cpp openmp/libomptarget/include/omptarget.h openmp/libomptarget/src/api.cpp openmp/libomptarget/src/private.h openmp/runtime/src/kmp_ftn_os.h llvm/include/llvm/ExecutionEngine/Orc/Debugging/DebuggerSupport.h llvm/include/llvm/ExecutionEngine/Orc/Debugging/DebuggerSupportPlugin.h llvm/include/llvm/ExecutionEngine/Orc/Debugging/PerfSupportPlugin.h llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupport.cpp llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupportPlugin.cpp llvm/lib/ExecutionEngine/Orc/Debugging/PerfSupportPlugin.cpp
View the diff from clang-format here.
diff --git a/clang/include/clang/Basic/SourceManager.h b/clang/include/clang/Basic/SourceManager.h
index 431f97f30525..98d194b1f8b0 100644
--- a/clang/include/clang/Basic/SourceManager.h
+++ b/clang/include/clang/Basic/SourceManager.h
@@ -649,7 +649,7 @@ class SourceManager : public RefCountedBase<SourceManager> {
   /// This map allows us to merge ContentCache entries based
   /// on their FileEntry*.  All ContentCache objects will thus have unique,
   /// non-null, FileEntry pointers.
-  llvm::DenseMap<const FileEntry*, SrcMgr::ContentCache*> FileInfos;
+  llvm::DenseMap<const FileEntry *, SrcMgr::ContentCache *> FileInfos;
 
   /// True if the ContentCache for files that are overridden by other
   /// files, should report the original file name. Defaults to true.
@@ -1680,7 +1680,7 @@ public:
 
   // Iterators over FileInfos.
   using fileinfo_iterator =
-      llvm::DenseMap<const FileEntry*, SrcMgr::ContentCache*>::const_iterator;
+      llvm::DenseMap<const FileEntry *, SrcMgr::ContentCache *>::const_iterator;
 
   fileinfo_iterator fileinfo_begin() const { return FileInfos.begin(); }
   fileinfo_iterator fileinfo_end() const { return FileInfos.end(); }
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index e13524b5f3b3..817c823af3df 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -2183,14 +2183,11 @@ public:
       const FunctionProtoType *Old, SourceLocation OldLoc,
       const FunctionProtoType *New, SourceLocation NewLoc);
   bool handlerCanCatch(QualType HandlerType, QualType ExceptionType);
-  bool CheckExceptionSpecSubset(const PartialDiagnostic &DiagID,
-                                const PartialDiagnostic &NestedDiagID,
-                                const PartialDiagnostic &NoteID,
-                                const PartialDiagnostic &NoThrowDiagID,
-                                const FunctionProtoType *Superset,
-                                SourceLocation SuperLoc,
-                                const FunctionProtoType *Subset,
-                                SourceLocation SubLoc);
+  bool CheckExceptionSpecSubset(
+      const PartialDiagnostic &DiagID, const PartialDiagnostic &NestedDiagID,
+      const PartialDiagnostic &NoteID, const PartialDiagnostic &NoThrowDiagID,
+      const FunctionProtoType *Superset, SourceLocation SuperLoc,
+      const FunctionProtoType *Subset, SourceLocation SubLoc);
   bool CheckParamExceptionSpec(const PartialDiagnostic &NestedDiagID,
                                const PartialDiagnostic &NoteID,
                                const FunctionProtoType *Target,
@@ -3855,7 +3852,7 @@ public:
   bool isObjCWritebackConversion(QualType FromType, QualType ToType,
                                  QualType &ConvertedType);
   bool IsBlockPointerConversion(QualType FromType, QualType ToType,
-                                QualType& ConvertedType);
+                                QualType &ConvertedType);
   bool FunctionParamTypesAreEqual(const FunctionProtoType *OldType,
                                   const FunctionProtoType *NewType,
                                   unsigned *ArgPos = nullptr,
@@ -4221,8 +4218,7 @@ public:
       QualType DestTypeForComplaining = QualType(),
       unsigned DiagIDForComplaining = 0);
 
-  Expr *FixOverloadedFunctionReference(Expr *E,
-                                       DeclAccessPair FoundDecl,
+  Expr *FixOverloadedFunctionReference(Expr *E, DeclAccessPair FoundDecl,
                                        FunctionDecl *Fn);
   ExprResult FixOverloadedFunctionReference(ExprResult,
                                             DeclAccessPair FoundDecl,
@@ -9222,8 +9218,7 @@ public:
 
   TemplateDeductionResult
   DeduceTemplateArguments(FunctionTemplateDecl *FunctionTemplate,
-                          QualType ToType,
-                          CXXConversionDecl *&Specialization,
+                          QualType ToType, CXXConversionDecl *&Specialization,
                           sema::TemplateDeductionInfo &Info);
 
   TemplateDeductionResult
diff --git a/clang/lib/Basic/SourceManager.cpp b/clang/lib/Basic/SourceManager.cpp
index a630743e0d73..3ceff54c205c 100644
--- a/clang/lib/Basic/SourceManager.cpp
+++ b/clang/lib/Basic/SourceManager.cpp
@@ -324,8 +324,10 @@ SourceManager::~SourceManager() {
       ContentCacheAlloc.Deallocate(MemBufferInfos[i]);
     }
   }
-  for (llvm::DenseMap<const FileEntry*, SrcMgr::ContentCache*>::iterator
-       I = FileInfos.begin(), E = FileInfos.end(); I != E; ++I) {
+  for (llvm::DenseMap<const FileEntry *, SrcMgr::ContentCache *>::iterator
+           I = FileInfos.begin(),
+           E = FileInfos.end();
+       I != E; ++I) {
     if (I->second) {
       I->second->~ContentCache();
       ContentCacheAlloc.Deallocate(I->second);
@@ -2344,11 +2346,11 @@ SourceManager::MemoryBufferSizes SourceManager::getMemoryBufferSizes() const {
 }
 
 size_t SourceManager::getDataStructureSizes() const {
-  size_t size = llvm::capacity_in_bytes(MemBufferInfos)
-    + llvm::capacity_in_bytes(LocalSLocEntryTable)
-    + llvm::capacity_in_bytes(LoadedSLocEntryTable)
-    + llvm::capacity_in_bytes(SLocEntryLoaded)
-    + llvm::capacity_in_bytes(FileInfos);
+  size_t size = llvm::capacity_in_bytes(MemBufferInfos) +
+                llvm::capacity_in_bytes(LocalSLocEntryTable) +
+                llvm::capacity_in_bytes(LoadedSLocEntryTable) +
+                llvm::capacity_in_bytes(SLocEntryLoaded) +
+                llvm::capacity_in_bytes(FileInfos);
 
   if (OverriddenFilesInfo)
     size += llvm::capacity_in_bytes(OverriddenFilesInfo->OverriddenFiles);
diff --git a/clang/lib/Lex/HeaderSearch.cpp b/clang/lib/Lex/HeaderSearch.cpp
index e54a19ebfdbb..a61f2ea4dca4 100644
--- a/clang/lib/Lex/HeaderSearch.cpp
+++ b/clang/lib/Lex/HeaderSearch.cpp
@@ -927,8 +927,8 @@ OptionalFileEntryRef HeaderSearch::LookupFile(
       // from a module build. We should treat this as a system header if we're
       // building a [system] module.
       bool IncluderIsSystemHeader =
-          Includer ? getFileInfo(Includer).DirInfo != SrcMgr::C_User :
-          BuildSystemModule;
+          Includer ? getFileInfo(Includer).DirInfo != SrcMgr::C_User
+                   : BuildSystemModule;
       if (OptionalFileEntryRef FE = getFileAndSuggestModule(
               TmpDir, IncludeLoc, IncluderAndDir.second, IncluderIsSystemHeader,
               RequestingModule, SuggestedModule)) {
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 2d7e69946a39..ddcc290651d3 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -1336,9 +1336,9 @@ static void handleReturnTypestateAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
   // FIXME: This check is currently being done in the analysis.  It can be
   //        enabled here only after the parser propagates attributes at
   //        template specialization definition, not declaration.
-  //QualType ReturnType;
+  // QualType ReturnType;
   //
-  //if (const ParmVarDecl *Param = dyn_cast<ParmVarDecl>(D)) {
+  // if (const ParmVarDecl *Param = dyn_cast<ParmVarDecl>(D)) {
   //  ReturnType = Param->getType();
   //
   //} else if (const CXXConstructorDecl *Constructor =
@@ -1350,9 +1350,9 @@ static void handleReturnTypestateAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
   //  ReturnType = cast<FunctionDecl>(D)->getCallResultType();
   //}
   //
-  //const CXXRecordDecl *RD = ReturnType->getAsCXXRecordDecl();
+  // const CXXRecordDecl *RD = ReturnType->getAsCXXRecordDecl();
   //
-  //if (!RD || !RD->hasAttr<ConsumableAttr>()) {
+  // if (!RD || !RD->hasAttr<ConsumableAttr>()) {
   //    S.Diag(Attr.getLoc(), diag::warn_return_state_for_unconsumable_type) <<
   //      ReturnType.getAsString();
   //    return;
diff --git a/llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupportPlugin.cpp b/llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupportPlugin.cpp
index 236ba5114130..4ac1b8b5a72b 100644
--- a/llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupportPlugin.cpp
+++ b/llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupportPlugin.cpp
@@ -169,15 +169,13 @@ public:
       // Try to parse line data. Consume error on failure.
       if (auto Err = LineTable.parse(DebugLineData, &Offset, *DWARFCtx, nullptr,
                                      consumeError)) {
-        handleAllErrors(
-          std::move(Err),
-          [&](ErrorInfoBase &EIB) {
-            LLVM_DEBUG({
-              dbgs() << "Cannot parse line table for \"" << G.getName() << "\": ";
-              EIB.log(dbgs());
-              dbgs() << "\n";
-            });
+        handleAllErrors(std::move(Err), [&](ErrorInfoBase &EIB) {
+          LLVM_DEBUG({
+            dbgs() << "Cannot parse line table for \"" << G.getName() << "\": ";
+            EIB.log(dbgs());
+            dbgs() << "\n";
           });
+        });
       } else {
         if (!LineTable.Prologue.FileNames.empty())
           FileName = *dwarf::toString(LineTable.Prologue.FileNames[0].Name);

@gregrodgers
Contributor

If this has landed or will land upstream, we should be able to pick it up with our merge_from_main.sh process, which I just completed. I did not see these changes merged. However, a dry run of this patch showed some conflicts. Run get_pr_patches to see the conflicts.

Let me know if this is expected to be committed upstream soon.

@mjklemm
Contributor Author

mjklemm commented Oct 6, 2023

I have checked the conflicts; they occur in dllexports, where an ASO change causes the conflict. I can easily create a workaround patch for that at some point.

@mjklemm
Contributor Author

mjklemm commented Jan 9, 2024

Merged via trunk.

@mjklemm mjklemm closed this Jan 9, 2024
@mjklemm mjklemm deleted the omp_target_memset branch January 11, 2024 21:12