[HLSL] Change default linkage of HLSL functions and groupshared variables #93336

Closed
hekota wants to merge 8 commits from the internal-linkage-by-default branch

@hekota hekota commented May 24, 2024

In DXC, an HLSL function has internal linkage by default unless it:
1. is a shader entry point function,
2. is marked with the export keyword (#92812), or
3. does not have a definition.

This PR implements DXC's function linkage behavior in Clang.
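
As a rough illustration only, the three rules can be written as a single predicate. The `Linkage` enum, the `HLSLFunctionInfo` struct, and `defaultLinkage` are hypothetical names invented for this sketch; they are not Clang or DXC APIs.

```cpp
#include <cassert>

// Hypothetical model of the rule set described above; not Clang's types.
enum class Linkage { Internal, External };

struct HLSLFunctionInfo {
  bool IsShaderEntryPoint; // rule 1: shader entry point function
  bool IsExported;         // rule 2: marked with the export keyword
  bool HasDefinition;      // rule 3 applies when this is false
};

// A function stays external if any of the three exceptions applies;
// otherwise it gets internal linkage by default.
Linkage defaultLinkage(const HLSLFunctionInfo &F) {
  if (F.IsShaderEntryPoint || F.IsExported || !F.HasDefinition)
    return Linkage::External;
  return Linkage::Internal;
}
```

Under this model, an ordinary helper function with a body is internal, while an entry point, an exported function, or a declaration without a body stays external.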

Note that because of rule 3 above, the linkage of a function cannot be determined until the whole translation unit has been parsed. During Sema analysis, Clang caches the linkage of each declaration and cannot change it mid-parse depending on whether a function definition is later found. Therefore, all global HLSL functions have external linkage while in Clang Sema, and the final linkage is changed to internal, according to the rules above, during CodeGen.
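
The two-phase approach can be sketched with a toy model, assuming a translation unit is just an ordered list of declarations. `Decl` and `finalizeLinkage` are hypothetical names for this sketch and do not correspond to Clang's data structures. The first loop mimics Sema (every function is recorded as external when first seen), and the second loop mimics CodeGen (linkage is lowered to internal once it is known whether a definition exists).

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model; not Clang's Sema or CodeGen.
enum class Linkage { Internal, External };

struct Decl {
  std::string Name;
  bool IsDefinition;      // this declaration carries a function body
  bool IsEntryOrExported; // entry point or marked with export
};

std::map<std::string, Linkage> finalizeLinkage(const std::vector<Decl> &TU) {
  std::map<std::string, Linkage> Result;
  std::map<std::string, bool> Defined, Exempt;

  // Phase 1 ("Sema"): record external linkage on first sight and never
  // revise it while declarations are still being parsed.
  for (const Decl &D : TU) {
    Result.emplace(D.Name, Linkage::External);
    Defined[D.Name] = Defined[D.Name] || D.IsDefinition;
    Exempt[D.Name] = Exempt[D.Name] || D.IsEntryOrExported;
  }

  // Phase 2 ("CodeGen"): the whole translation unit is known, so defined,
  // non-exempt functions can now be switched to internal linkage.
  for (auto &Entry : Result)
    if (Defined[Entry.first] && !Exempt[Entry.first])
      Entry.second = Linkage::Internal;
  return Result;
}
```

A function that is forward-declared and only defined later in the file shows why the decision has to wait: at the point of its first declaration there is no way to tell it apart from a function that is never defined.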

This PR also changes the linkage of groupshared variables to internal to match DXC behavior. Global variables marked static already have internal linkage per C++ rules.
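
A minimal sketch of the variable rule, under the same caveat: `HLSLGlobalVar` and `globalVarLinkage` are hypothetical names, and the `Otherwise` parameter stands in for whatever linkage the regular rules would assign to any other global, which this PR description does not change.

```cpp
#include <cassert>

// Hypothetical model; not Clang's LinkageComputer.
enum class Linkage { Internal, External };

struct HLSLGlobalVar {
  bool IsStatic;      // declared static
  bool IsGroupShared; // declared groupshared
};

Linkage globalVarLinkage(const HLSLGlobalVar &V, Linkage Otherwise) {
  if (V.IsStatic)
    return Linkage::Internal; // already the case per C++ rules
  if (V.IsGroupShared)
    return Linkage::Internal; // the behavior this PR adds
  return Otherwise;           // all other globals are left unchanged
}
```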

Related spec update: microsoft/hlsl-specs#249

Fixes #92071

github-actions bot commented May 24, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@hekota hekota force-pushed the internal-linkage-by-default branch from d68d1de to 2c29cb2 Compare May 24, 2024 20:50
@hekota hekota force-pushed the internal-linkage-by-default branch from 2c29cb2 to 9555f5a Compare June 7, 2024 21:29
@hekota hekota changed the title from "[HLSL] Change default linkage of HLSL functions to internal" to "[HLSL] Change default linkage of HLSL functions and groupshared variables" Jun 7, 2024
@hekota hekota marked this pull request as ready for review June 7, 2024 22:23
@llvmbot llvmbot added the clang, clang:frontend, clang:codegen, and HLSL labels Jun 7, 2024
llvmbot commented Jun 7, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Helena Kotas (hekota)


Patch is 140.62 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/93336.diff

41 Files Affected:

  • (modified) clang/docs/HLSL/ExpectedDifferences.rst (+13)
  • (modified) clang/lib/AST/Decl.cpp (+6)
  • (modified) clang/lib/CodeGen/CGHLSLRuntime.cpp (+15)
  • (modified) clang/lib/CodeGen/CGHLSLRuntime.h (+3-4)
  • (modified) clang/lib/CodeGen/CodeGenFunction.cpp (+4-3)
  • (modified) clang/test/CodeGenHLSL/ArrayTemporary.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/builtins/abs.hlsl (+28-28)
  • (modified) clang/test/CodeGenHLSL/builtins/all.hlsl (+80-80)
  • (modified) clang/test/CodeGenHLSL/builtins/any.hlsl (+80-80)
  • (modified) clang/test/CodeGenHLSL/builtins/ceil.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/clamp.hlsl (+40-40)
  • (modified) clang/test/CodeGenHLSL/builtins/cos.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/exp.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/exp2.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/floor.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/frac.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/isinf.hlsl (+8-8)
  • (modified) clang/test/CodeGenHLSL/builtins/log.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/log10.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/log2.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/max.hlsl (+40-40)
  • (modified) clang/test/CodeGenHLSL/builtins/min.hlsl (+40-40)
  • (modified) clang/test/CodeGenHLSL/builtins/pow.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/rcp.hlsl (+32-32)
  • (modified) clang/test/CodeGenHLSL/builtins/reversebits.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/round.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/rsqrt.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/sin.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/sqrt.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/trunc.hlsl (+12-12)
  • (modified) clang/test/CodeGenHLSL/builtins/wave_get_lane_index_do_while.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/builtins/wave_get_lane_index_simple.hlsl (+1-1)
  • (modified) clang/test/CodeGenHLSL/builtins/wave_get_lane_index_subcall.hlsl (+2-2)
  • (modified) clang/test/CodeGenHLSL/convergence/do.while.hlsl (+5-5)
  • (modified) clang/test/CodeGenHLSL/convergence/for.hlsl (+7-7)
  • (modified) clang/test/CodeGenHLSL/convergence/while.hlsl (+6-6)
  • (modified) clang/test/CodeGenHLSL/no_int_promotion.hlsl (+7-7)
  • (modified) clang/test/CodeGenHLSL/shift-mask.hlsl (+4-4)
  • (modified) clang/test/CodeGenHLSL/this-assignment-overload.hlsl (+2-2)
  • (modified) clang/test/CodeGenHLSL/this-assignment.hlsl (+2-2)
  • (modified) clang/test/Options/enable_16bit_types_validation_spirv.hlsl (+1-1)
diff --git a/clang/docs/HLSL/ExpectedDifferences.rst b/clang/docs/HLSL/ExpectedDifferences.rst
index d1b6010f10f43..e0de62345bd8c 100644
--- a/clang/docs/HLSL/ExpectedDifferences.rst
+++ b/clang/docs/HLSL/ExpectedDifferences.rst
@@ -108,3 +108,16 @@ behavior between Clang and DXC. Some examples include:
   diagnostic notifying the user of the conversion rather than silently altering
   precision relative to the other overloads (as FXC does) or generating code
   that will fail validation (as DXC does).
+
+Correctness improvements (bug fixes)
+====================================
+
+Entry point functions & ``static`` keyword
+------------------------------------------
+Marking a shader entry point function ``static`` will result in an error.
+
+This is identical to DXC behavior when an entry point is specified as compiler
+argument. However, DXC does not report an error when compiling a shader library
+that has an entry point function with ``[shader("stage")]`` attribute that is
+also marked ``static``. Additionally, this function definition is not included
+in the final DXIL.
diff --git a/clang/lib/AST/Decl.cpp b/clang/lib/AST/Decl.cpp
index 1f19dadafa44e..dc5566bab312c 100644
--- a/clang/lib/AST/Decl.cpp
+++ b/clang/lib/AST/Decl.cpp
@@ -621,6 +621,7 @@ LinkageComputer::getLVForNamespaceScopeDecl(const NamedDecl *D,
     // - a variable, variable template, function, or function template
     //   that is explicitly declared static; or
     // (This bullet corresponds to C99 6.2.2p3.)
+    // - also applies to HLSL
     return LinkageInfo::internal();
   }
 
@@ -657,6 +658,11 @@ LinkageComputer::getLVForNamespaceScopeDecl(const NamedDecl *D,
       if (PrevVar->getStorageClass() == SC_Static)
         return LinkageInfo::internal();
     }
+
+    if (Context.getLangOpts().HLSL &&
+        Var->hasAttr<HLSLGroupSharedAddressSpaceAttr>())
+      return LinkageInfo::internal();
+
   } else if (const auto *IFD = dyn_cast<IndirectFieldDecl>(D)) {
     //   - a data member of an anonymous union.
     const VarDecl *VD = IFD->getVarDecl();
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.cpp b/clang/lib/CodeGen/CGHLSLRuntime.cpp
index 5e6a3dd4878f4..173f0c1449c86 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.cpp
+++ b/clang/lib/CodeGen/CGHLSLRuntime.cpp
@@ -353,6 +353,21 @@ llvm::Value *CGHLSLRuntime::emitInputSemantic(IRBuilder<> &B,
   return nullptr;
 }
 
+void CGHLSLRuntime::emitFunctionProlog(const FunctionDecl *FD,
+                                       llvm::Function *Fn) {
+  if (!FD || !Fn)
+    return;
+
+  if (FD->hasAttr<HLSLShaderAttr>()) {
+    emitEntryFunction(FD, Fn);
+  } else {
+    // HLSL functions defined in the current translation unit that are not
+    // shader entry points or exported have internal linkage by default.
+    if (FD->isDefined())
+      Fn->setLinkage(GlobalValue::InternalLinkage);
+  }
+}
+
 void CGHLSLRuntime::emitEntryFunction(const FunctionDecl *FD,
                                       llvm::Function *Fn) {
   llvm::Module &M = CGM.getModule();
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.h b/clang/lib/CodeGen/CGHLSLRuntime.h
index 0abe39dedcb96..4ffb5c00dd115 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.h
+++ b/clang/lib/CodeGen/CGHLSLRuntime.h
@@ -116,12 +116,11 @@ class CGHLSLRuntime {
   void addBuffer(const HLSLBufferDecl *D);
   void finishCodeGen();
 
-  void setHLSLEntryAttributes(const FunctionDecl *FD, llvm::Function *Fn);
-
-  void emitEntryFunction(const FunctionDecl *FD, llvm::Function *Fn);
-  void setHLSLFunctionAttributes(llvm::Function *, const FunctionDecl *);
+  void emitFunctionProlog(const FunctionDecl *FD, llvm::Function *Fn);
 
 private:
+  void emitEntryFunction(const FunctionDecl *FD, llvm::Function *Fn);
+  void setHLSLEntryAttributes(const FunctionDecl *FD, llvm::Function *Fn);
   void addBufferResourceAnnotation(llvm::GlobalVariable *GV,
                                    llvm::hlsl::ResourceClass RC,
                                    llvm::hlsl::ResourceKind RK, bool IsROV,
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index f0345f3b191b8..3ef21aa9c5b2b 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -1194,9 +1194,10 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
   if (getLangOpts().OpenMP && CurCodeDecl)
     CGM.getOpenMPRuntime().emitFunctionProlog(*this, CurCodeDecl);
 
-  // Handle emitting HLSL entry functions.
-  if (D && D->hasAttr<HLSLShaderAttr>())
-    CGM.getHLSLRuntime().emitEntryFunction(FD, Fn);
+  // Emit HLSL specific initialization
+  if (getLangOpts().HLSL) {
+    CGM.getHLSLRuntime().emitFunctionProlog(FD, Fn);
+  }
 
   EmitFunctionProlog(*CurFnInfo, CurFn, Args);
 
diff --git a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
index 63a30b61440eb..07076f72405f3 100644
--- a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
+++ b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
@@ -2,7 +2,7 @@
 
 void fn(float x[2]) { }
 
-// CHECK-LABEL: define void {{.*}}call{{.*}}
+// CHECK-LABEL: define internal void {{.*}}call{{.*}}
 // CHECK: [[Arr:%.*]] = alloca [2 x float]
 // CHECK: [[Tmp:%.*]] = alloca [2 x float]
 // CHECK: call void @llvm.memset.p0.i32(ptr align 4 [[Arr]], i8 0, i32 8, i1 false)
@@ -20,7 +20,7 @@ struct Obj {
 
 void fn2(Obj O[4]) { }
 
-// CHECK-LABEL: define void {{.*}}call2{{.*}}
+// CHECK-LABEL: define internal void {{.*}}call2{{.*}}
 // CHECK: [[Arr:%.*]] = alloca [4 x %struct.Obj]
 // CHECK: [[Tmp:%.*]] = alloca [4 x %struct.Obj]
 // CHECK: call void @llvm.memset.p0.i32(ptr align 4 [[Arr]], i8 0, i32 32, i1 false)
@@ -34,7 +34,7 @@ void call2() {
 
 void fn3(float x[2][2]) { }
 
-// CHECK-LABEL: define void {{.*}}call3{{.*}}
+// CHECK-LABEL: define internal void {{.*}}call3{{.*}}
 // CHECK: [[Arr:%.*]] = alloca [2 x [2 x float]]
 // CHECK: [[Tmp:%.*]] = alloca [2 x [2 x float]]
 // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Arr]], ptr align 4 {{.*}}, i32 16, i1 false)
@@ -45,7 +45,7 @@ void call3() {
   fn3(Arr);
 }
 
-// CHECK-LABEL: define void {{.*}}call4{{.*}}(ptr
+// CHECK-LABEL: define internal void {{.*}}call4{{.*}}(ptr
 // CHECK-SAME: noundef byval([2 x [2 x float]]) align 4 [[Arr:%.*]])
 // CHECK: [[Tmp:%.*]] = alloca [2 x [2 x float]]
 // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp]], ptr align 4 [[Arr]], i32 16, i1 false)
@@ -58,7 +58,7 @@ void call4(float Arr[2][2]) {
 // Verify that each template instantiation codegens to a unique and correctly
 // mangled function name.
 
-// CHECK-LABEL: define void {{.*}}template_call{{.*}}(ptr
+// CHECK-LABEL: define internal void {{.*}}template_call{{.*}}(ptr
 
 // CHECK-SAME: noundef byval([2 x float]) align 4 [[FA2:%[0-9A-Z]+]],
 // CHECK-SAME: ptr noundef byval([4 x float]) align 4 [[FA4:%[0-9A-Z]+]],
@@ -85,7 +85,7 @@ void template_call(float FA2[2], float FA4[4], int IA3[3]) {
 
 
 // Verify that Array parameter element access correctly codegens.
-// CHECK-LABEL: define void {{.*}}element_access{{.*}}(ptr
+// CHECK-LABEL: define internal void {{.*}}element_access{{.*}}(ptr
 // CHECK-SAME: noundef byval([2 x float]) align 4 [[FA2:%[0-9A-Z]+]]
 
 // CHECK: [[Addr:%.*]] = getelementptr inbounds [2 x float], ptr [[FA2]], i32 0, i32 0
diff --git a/clang/test/CodeGenHLSL/builtins/abs.hlsl b/clang/test/CodeGenHLSL/builtins/abs.hlsl
index ad65cab2721a2..1a38d9b6f6f7b 100644
--- a/clang/test/CodeGenHLSL/builtins/abs.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/abs.hlsl
@@ -9,85 +9,85 @@
 using hlsl::abs;
 
 #ifdef __HLSL_ENABLE_16_BIT
-// NATIVE_HALF: define noundef i16 @
+// NATIVE_HALF: define internal noundef i16 @
 // NATIVE_HALF: call i16 @llvm.abs.i16(
 int16_t test_abs_int16_t(int16_t p0) { return abs(p0); }
-// NATIVE_HALF: define noundef <2 x i16> @
+// NATIVE_HALF: define internal noundef <2 x i16> @
 // NATIVE_HALF: call <2 x i16> @llvm.abs.v2i16(
 int16_t2 test_abs_int16_t2(int16_t2 p0) { return abs(p0); }
-// NATIVE_HALF: define noundef <3 x i16> @
+// NATIVE_HALF: define internal noundef <3 x i16> @
 // NATIVE_HALF: call <3 x i16> @llvm.abs.v3i16(
 int16_t3 test_abs_int16_t3(int16_t3 p0) { return abs(p0); }
-// NATIVE_HALF: define noundef <4 x i16> @
+// NATIVE_HALF: define internal noundef <4 x i16> @
 // NATIVE_HALF: call <4 x i16> @llvm.abs.v4i16(
 int16_t4 test_abs_int16_t4(int16_t4 p0) { return abs(p0); }
 #endif // __HLSL_ENABLE_16_BIT
 
-// NATIVE_HALF: define noundef half @
+// NATIVE_HALF: define internal noundef half @
 // NATIVE_HALF: call half @llvm.fabs.f16(
-// NO_HALF: define noundef float @"?test_abs_half@@YA$halff@$halff@@Z"(
+// NO_HALF: define internal noundef float @"?test_abs_half@@YA$halff@$halff@@Z"(
 // NO_HALF: call float @llvm.fabs.f32(float %0)
 half test_abs_half(half p0) { return abs(p0); }
-// NATIVE_HALF: define noundef <2 x half> @
+// NATIVE_HALF: define internal noundef <2 x half> @
 // NATIVE_HALF: call <2 x half> @llvm.fabs.v2f16(
-// NO_HALF: define noundef <2 x float> @"?test_abs_half2@@YAT?$__vector@$halff@$01@__clang@@T12@@Z"(
+// NO_HALF: define internal noundef <2 x float> @"?test_abs_half2@@YAT?$__vector@$halff@$01@__clang@@T12@@Z"(
 // NO_HALF: call <2 x float> @llvm.fabs.v2f32(
 half2 test_abs_half2(half2 p0) { return abs(p0); }
-// NATIVE_HALF: define noundef <3 x half> @
+// NATIVE_HALF: define internal noundef <3 x half> @
 // NATIVE_HALF: call <3 x half> @llvm.fabs.v3f16(
-// NO_HALF: define noundef <3 x float> @"?test_abs_half3@@YAT?$__vector@$halff@$02@__clang@@T12@@Z"(
+// NO_HALF: define internal noundef <3 x float> @"?test_abs_half3@@YAT?$__vector@$halff@$02@__clang@@T12@@Z"(
 // NO_HALF: call <3 x float> @llvm.fabs.v3f32(
 half3 test_abs_half3(half3 p0) { return abs(p0); }
-// NATIVE_HALF: define noundef <4 x half> @
+// NATIVE_HALF: define internal noundef <4 x half> @
 // NATIVE_HALF: call <4 x half> @llvm.fabs.v4f16(
-// NO_HALF: define noundef <4 x float> @"?test_abs_half4@@YAT?$__vector@$halff@$03@__clang@@T12@@Z"(
+// NO_HALF: define internal noundef <4 x float> @"?test_abs_half4@@YAT?$__vector@$halff@$03@__clang@@T12@@Z"(
 // NO_HALF: call <4 x float> @llvm.fabs.v4f32(
 half4 test_abs_half4(half4 p0) { return abs(p0); }
-// CHECK: define noundef i32 @
+// CHECK: define internal noundef i32 @
 // CHECK: call i32 @llvm.abs.i32(
 int test_abs_int(int p0) { return abs(p0); }
-// CHECK: define noundef <2 x i32> @
+// CHECK: define internal noundef <2 x i32> @
 // CHECK: call <2 x i32> @llvm.abs.v2i32(
 int2 test_abs_int2(int2 p0) { return abs(p0); }
-// CHECK: define noundef <3 x i32> @
+// CHECK: define internal noundef <3 x i32> @
 // CHECK: call <3 x i32> @llvm.abs.v3i32(
 int3 test_abs_int3(int3 p0) { return abs(p0); }
-// CHECK: define noundef <4 x i32> @
+// CHECK: define internal noundef <4 x i32> @
 // CHECK: call <4 x i32> @llvm.abs.v4i32(
 int4 test_abs_int4(int4 p0) { return abs(p0); }
-// CHECK: define noundef float @
+// CHECK: define internal noundef float @
 // CHECK: call float @llvm.fabs.f32(
 float test_abs_float(float p0) { return abs(p0); }
-// CHECK: define noundef <2 x float> @
+// CHECK: define internal noundef <2 x float> @
 // CHECK: call <2 x float> @llvm.fabs.v2f32(
 float2 test_abs_float2(float2 p0) { return abs(p0); }
-// CHECK: define noundef <3 x float> @
+// CHECK: define internal noundef <3 x float> @
 // CHECK: call <3 x float> @llvm.fabs.v3f32(
 float3 test_abs_float3(float3 p0) { return abs(p0); }
-// CHECK: define noundef <4 x float> @
+// CHECK: define internal noundef <4 x float> @
 // CHECK: call <4 x float> @llvm.fabs.v4f32(
 float4 test_abs_float4(float4 p0) { return abs(p0); }
-// CHECK: define noundef i64 @
+// CHECK: define internal noundef i64 @
 // CHECK: call i64 @llvm.abs.i64(
 int64_t test_abs_int64_t(int64_t p0) { return abs(p0); }
-// CHECK: define noundef <2 x i64> @
+// CHECK: define internal noundef <2 x i64> @
 // CHECK: call <2 x i64> @llvm.abs.v2i64(
 int64_t2 test_abs_int64_t2(int64_t2 p0) { return abs(p0); }
-// CHECK: define noundef <3 x i64> @
+// CHECK: define internal noundef <3 x i64> @
 // CHECK: call <3 x i64> @llvm.abs.v3i64(
 int64_t3 test_abs_int64_t3(int64_t3 p0) { return abs(p0); }
-// CHECK: define noundef <4 x i64> @
+// CHECK: define internal noundef <4 x i64> @
 // CHECK: call <4 x i64> @llvm.abs.v4i64(
 int64_t4 test_abs_int64_t4(int64_t4 p0) { return abs(p0); }
-// CHECK: define noundef double @
+// CHECK: define internal noundef double @
 // CHECK: call double @llvm.fabs.f64(
 double test_abs_double(double p0) { return abs(p0); }
-// CHECK: define noundef <2 x double> @
+// CHECK: define internal noundef <2 x double> @
 // CHECK: call <2 x double> @llvm.fabs.v2f64(
 double2 test_abs_double2(double2 p0) { return abs(p0); }
-// CHECK: define noundef <3 x double> @
+// CHECK: define internal noundef <3 x double> @
 // CHECK: call <3 x double> @llvm.fabs.v3f64(
 double3 test_abs_double3(double3 p0) { return abs(p0); }
-// CHECK: define noundef <4 x double> @
+// CHECK: define internal noundef <4 x double> @
 // CHECK: call <4 x double> @llvm.fabs.v4f64(
 double4 test_abs_double4(double4 p0) { return abs(p0); }
diff --git a/clang/test/CodeGenHLSL/builtins/all.hlsl b/clang/test/CodeGenHLSL/builtins/all.hlsl
index b48daa287480f..8437199a7da52 100644
--- a/clang/test/CodeGenHLSL/builtins/all.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/all.hlsl
@@ -14,59 +14,59 @@
 // RUN:   -o - | FileCheck %s --check-prefixes=CHECK,DXIL_NO_HALF,DXIL_CHECK
 
 #ifdef __HLSL_ENABLE_16_BIT
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_int16_t(int16_t p0) { return all(p0); }
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v2i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v2i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_int16_t2(int16_t2 p0) { return all(p0); }
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v3i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v3i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_int16_t3(int16_t3 p0) { return all(p0); }
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v4i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v4i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_int16_t4(int16_t4 p0) { return all(p0); }
 
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_uint16_t(uint16_t p0) { return all(p0); }
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v2i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v2i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_uint16_t2(uint16_t2 p0) { return all(p0); }
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v3i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v3i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_uint16_t3(uint16_t3 p0) { return all(p0); }
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v4i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v4i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_uint16_t4(uint16_t4 p0) { return all(p0); }
 #endif // __HLSL_ENABLE_16_BIT
 
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.f16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.f16
 // DXIL_NO_HALF: %hlsl.all = call i1 @llvm.dx.all.f32
@@ -74,8 +74,8 @@ bool test_all_uint16_t4(uint16_t4 p0) { return all(p0); }
 // CHECK: ret i1 %hlsl.all
 bool test_all_half(half p0) { return all(p0); }
 
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v2f16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v2f16
 // DXIL_NO_HALF: %hlsl.all = call i1 @llvm.dx.all.v2f32
@@ -83,8 +83,8 @@ bool test_all_half(half p0) { return all(p0); }
 // CHECK: ret i1 %hlsl.all
 bool test_all_half2(half2 p0) { return all(p0); }
 
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v3f16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v3f16
 // DXIL_NO_HALF: %hlsl.all = call i1 @llvm.dx.all.v3f32
@@ -92,8 +92,8 @@ bool test_all_half2(half2 p0) { return all(p0); }
 // CHECK: ret i1 %hlsl.all
 bool test_all_half3(half3 p0) { return all(p0); }
 
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v4f16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v4f16
 // DXIL_NO_HALF: %hlsl.all = call i1 @llvm.dx.all.v4f32
@@ -101,176 +101,176 @@ bool test_all_half3(half3 p0) { return all(p0); }
 // CHECK: ret i1 %hlsl.all
 bool test_all_half4(half4 p0) { return all(p0); }
 
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_CHECK: %hlsl.all = call i1 @llvm.dx.all.f32
 // SPIR_CHECK: %hlsl.all = call i1 @llvm.spv.all.f32
 // CHECK: ret i1 %hlsl.all
 bool test_all_float(float p0) { return all(p0); }
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_CHECK: %hlsl.all = call i1 @llvm.dx.all.v2f32
 // SPIR_CHECK: %hlsl.all = call i1 @llvm.spv.all.v2f32
 // CHECK: ret i1 %hlsl.all
 bool test_all_float2(float2 p0) { return all(p0); }
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_CHECK: %hlsl.all = call i1 @llvm.dx.all.v3f32
 // SPIR_CHECK: %hlsl.all = call i1 @llvm.spv.all.v3f32
 // CHECK: ret i1 %hlsl.all
 bool test_all_float3(float3 p0) { return all(p0); }
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_CHECK: %hlsl.all = call i1 @llvm.dx.all.v4f32
 // SPIR_CHECK: %hlsl.all = call i1 @llvm.spv.all.v4f32
 // CHECK: ret i1 %hlsl.all
 bool test_all_float4(float4 p0) { return all(p0); }
 
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_fun...
[truncated]

llvmbot commented Jun 7, 2024

@llvm/pr-subscribers-hlsl

Author: Helena Kotas (hekota)

 
-  void setHLSLEntryAttributes(const FunctionDecl *FD, llvm::Function *Fn);
-
-  void emitEntryFunction(const FunctionDecl *FD, llvm::Function *Fn);
-  void setHLSLFunctionAttributes(llvm::Function *, const FunctionDecl *);
+  void emitFunctionProlog(const FunctionDecl *FD, llvm::Function *Fn);
 
 private:
+  void emitEntryFunction(const FunctionDecl *FD, llvm::Function *Fn);
+  void setHLSLEntryAttributes(const FunctionDecl *FD, llvm::Function *Fn);
   void addBufferResourceAnnotation(llvm::GlobalVariable *GV,
                                    llvm::hlsl::ResourceClass RC,
                                    llvm::hlsl::ResourceKind RK, bool IsROV,
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index f0345f3b191b8..3ef21aa9c5b2b 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -1194,9 +1194,10 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
   if (getLangOpts().OpenMP && CurCodeDecl)
     CGM.getOpenMPRuntime().emitFunctionProlog(*this, CurCodeDecl);
 
-  // Handle emitting HLSL entry functions.
-  if (D && D->hasAttr<HLSLShaderAttr>())
-    CGM.getHLSLRuntime().emitEntryFunction(FD, Fn);
+  // Emit HLSL specific initialization
+  if (getLangOpts().HLSL) {
+    CGM.getHLSLRuntime().emitFunctionProlog(FD, Fn);
+  }
 
   EmitFunctionProlog(*CurFnInfo, CurFn, Args);
 
diff --git a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
index 63a30b61440eb..07076f72405f3 100644
--- a/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
+++ b/clang/test/CodeGenHLSL/ArrayTemporary.hlsl
@@ -2,7 +2,7 @@
 
 void fn(float x[2]) { }
 
-// CHECK-LABEL: define void {{.*}}call{{.*}}
+// CHECK-LABEL: define internal void {{.*}}call{{.*}}
 // CHECK: [[Arr:%.*]] = alloca [2 x float]
 // CHECK: [[Tmp:%.*]] = alloca [2 x float]
 // CHECK: call void @llvm.memset.p0.i32(ptr align 4 [[Arr]], i8 0, i32 8, i1 false)
@@ -20,7 +20,7 @@ struct Obj {
 
 void fn2(Obj O[4]) { }
 
-// CHECK-LABEL: define void {{.*}}call2{{.*}}
+// CHECK-LABEL: define internal void {{.*}}call2{{.*}}
 // CHECK: [[Arr:%.*]] = alloca [4 x %struct.Obj]
 // CHECK: [[Tmp:%.*]] = alloca [4 x %struct.Obj]
 // CHECK: call void @llvm.memset.p0.i32(ptr align 4 [[Arr]], i8 0, i32 32, i1 false)
@@ -34,7 +34,7 @@ void call2() {
 
 void fn3(float x[2][2]) { }
 
-// CHECK-LABEL: define void {{.*}}call3{{.*}}
+// CHECK-LABEL: define internal void {{.*}}call3{{.*}}
 // CHECK: [[Arr:%.*]] = alloca [2 x [2 x float]]
 // CHECK: [[Tmp:%.*]] = alloca [2 x [2 x float]]
 // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Arr]], ptr align 4 {{.*}}, i32 16, i1 false)
@@ -45,7 +45,7 @@ void call3() {
   fn3(Arr);
 }
 
-// CHECK-LABEL: define void {{.*}}call4{{.*}}(ptr
+// CHECK-LABEL: define internal void {{.*}}call4{{.*}}(ptr
 // CHECK-SAME: noundef byval([2 x [2 x float]]) align 4 [[Arr:%.*]])
 // CHECK: [[Tmp:%.*]] = alloca [2 x [2 x float]]
 // CHECK: call void @llvm.memcpy.p0.p0.i32(ptr align 4 [[Tmp]], ptr align 4 [[Arr]], i32 16, i1 false)
@@ -58,7 +58,7 @@ void call4(float Arr[2][2]) {
 // Verify that each template instantiation codegens to a unique and correctly
 // mangled function name.
 
-// CHECK-LABEL: define void {{.*}}template_call{{.*}}(ptr
+// CHECK-LABEL: define internal void {{.*}}template_call{{.*}}(ptr
 
 // CHECK-SAME: noundef byval([2 x float]) align 4 [[FA2:%[0-9A-Z]+]],
 // CHECK-SAME: ptr noundef byval([4 x float]) align 4 [[FA4:%[0-9A-Z]+]],
@@ -85,7 +85,7 @@ void template_call(float FA2[2], float FA4[4], int IA3[3]) {
 
 
 // Verify that Array parameter element access correctly codegens.
-// CHECK-LABEL: define void {{.*}}element_access{{.*}}(ptr
+// CHECK-LABEL: define internal void {{.*}}element_access{{.*}}(ptr
 // CHECK-SAME: noundef byval([2 x float]) align 4 [[FA2:%[0-9A-Z]+]]
 
 // CHECK: [[Addr:%.*]] = getelementptr inbounds [2 x float], ptr [[FA2]], i32 0, i32 0
diff --git a/clang/test/CodeGenHLSL/builtins/abs.hlsl b/clang/test/CodeGenHLSL/builtins/abs.hlsl
index ad65cab2721a2..1a38d9b6f6f7b 100644
--- a/clang/test/CodeGenHLSL/builtins/abs.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/abs.hlsl
@@ -9,85 +9,85 @@
 using hlsl::abs;
 
 #ifdef __HLSL_ENABLE_16_BIT
-// NATIVE_HALF: define noundef i16 @
+// NATIVE_HALF: define internal noundef i16 @
 // NATIVE_HALF: call i16 @llvm.abs.i16(
 int16_t test_abs_int16_t(int16_t p0) { return abs(p0); }
-// NATIVE_HALF: define noundef <2 x i16> @
+// NATIVE_HALF: define internal noundef <2 x i16> @
 // NATIVE_HALF: call <2 x i16> @llvm.abs.v2i16(
 int16_t2 test_abs_int16_t2(int16_t2 p0) { return abs(p0); }
-// NATIVE_HALF: define noundef <3 x i16> @
+// NATIVE_HALF: define internal noundef <3 x i16> @
 // NATIVE_HALF: call <3 x i16> @llvm.abs.v3i16(
 int16_t3 test_abs_int16_t3(int16_t3 p0) { return abs(p0); }
-// NATIVE_HALF: define noundef <4 x i16> @
+// NATIVE_HALF: define internal noundef <4 x i16> @
 // NATIVE_HALF: call <4 x i16> @llvm.abs.v4i16(
 int16_t4 test_abs_int16_t4(int16_t4 p0) { return abs(p0); }
 #endif // __HLSL_ENABLE_16_BIT
 
-// NATIVE_HALF: define noundef half @
+// NATIVE_HALF: define internal noundef half @
 // NATIVE_HALF: call half @llvm.fabs.f16(
-// NO_HALF: define noundef float @"?test_abs_half@@YA$halff@$halff@@Z"(
+// NO_HALF: define internal noundef float @"?test_abs_half@@YA$halff@$halff@@Z"(
 // NO_HALF: call float @llvm.fabs.f32(float %0)
 half test_abs_half(half p0) { return abs(p0); }
-// NATIVE_HALF: define noundef <2 x half> @
+// NATIVE_HALF: define internal noundef <2 x half> @
 // NATIVE_HALF: call <2 x half> @llvm.fabs.v2f16(
-// NO_HALF: define noundef <2 x float> @"?test_abs_half2@@YAT?$__vector@$halff@$01@__clang@@T12@@Z"(
+// NO_HALF: define internal noundef <2 x float> @"?test_abs_half2@@YAT?$__vector@$halff@$01@__clang@@T12@@Z"(
 // NO_HALF: call <2 x float> @llvm.fabs.v2f32(
 half2 test_abs_half2(half2 p0) { return abs(p0); }
-// NATIVE_HALF: define noundef <3 x half> @
+// NATIVE_HALF: define internal noundef <3 x half> @
 // NATIVE_HALF: call <3 x half> @llvm.fabs.v3f16(
-// NO_HALF: define noundef <3 x float> @"?test_abs_half3@@YAT?$__vector@$halff@$02@__clang@@T12@@Z"(
+// NO_HALF: define internal noundef <3 x float> @"?test_abs_half3@@YAT?$__vector@$halff@$02@__clang@@T12@@Z"(
 // NO_HALF: call <3 x float> @llvm.fabs.v3f32(
 half3 test_abs_half3(half3 p0) { return abs(p0); }
-// NATIVE_HALF: define noundef <4 x half> @
+// NATIVE_HALF: define internal noundef <4 x half> @
 // NATIVE_HALF: call <4 x half> @llvm.fabs.v4f16(
-// NO_HALF: define noundef <4 x float> @"?test_abs_half4@@YAT?$__vector@$halff@$03@__clang@@T12@@Z"(
+// NO_HALF: define internal noundef <4 x float> @"?test_abs_half4@@YAT?$__vector@$halff@$03@__clang@@T12@@Z"(
 // NO_HALF: call <4 x float> @llvm.fabs.v4f32(
 half4 test_abs_half4(half4 p0) { return abs(p0); }
-// CHECK: define noundef i32 @
+// CHECK: define internal noundef i32 @
 // CHECK: call i32 @llvm.abs.i32(
 int test_abs_int(int p0) { return abs(p0); }
-// CHECK: define noundef <2 x i32> @
+// CHECK: define internal noundef <2 x i32> @
 // CHECK: call <2 x i32> @llvm.abs.v2i32(
 int2 test_abs_int2(int2 p0) { return abs(p0); }
-// CHECK: define noundef <3 x i32> @
+// CHECK: define internal noundef <3 x i32> @
 // CHECK: call <3 x i32> @llvm.abs.v3i32(
 int3 test_abs_int3(int3 p0) { return abs(p0); }
-// CHECK: define noundef <4 x i32> @
+// CHECK: define internal noundef <4 x i32> @
 // CHECK: call <4 x i32> @llvm.abs.v4i32(
 int4 test_abs_int4(int4 p0) { return abs(p0); }
-// CHECK: define noundef float @
+// CHECK: define internal noundef float @
 // CHECK: call float @llvm.fabs.f32(
 float test_abs_float(float p0) { return abs(p0); }
-// CHECK: define noundef <2 x float> @
+// CHECK: define internal noundef <2 x float> @
 // CHECK: call <2 x float> @llvm.fabs.v2f32(
 float2 test_abs_float2(float2 p0) { return abs(p0); }
-// CHECK: define noundef <3 x float> @
+// CHECK: define internal noundef <3 x float> @
 // CHECK: call <3 x float> @llvm.fabs.v3f32(
 float3 test_abs_float3(float3 p0) { return abs(p0); }
-// CHECK: define noundef <4 x float> @
+// CHECK: define internal noundef <4 x float> @
 // CHECK: call <4 x float> @llvm.fabs.v4f32(
 float4 test_abs_float4(float4 p0) { return abs(p0); }
-// CHECK: define noundef i64 @
+// CHECK: define internal noundef i64 @
 // CHECK: call i64 @llvm.abs.i64(
 int64_t test_abs_int64_t(int64_t p0) { return abs(p0); }
-// CHECK: define noundef <2 x i64> @
+// CHECK: define internal noundef <2 x i64> @
 // CHECK: call <2 x i64> @llvm.abs.v2i64(
 int64_t2 test_abs_int64_t2(int64_t2 p0) { return abs(p0); }
-// CHECK: define noundef <3 x i64> @
+// CHECK: define internal noundef <3 x i64> @
 // CHECK: call <3 x i64> @llvm.abs.v3i64(
 int64_t3 test_abs_int64_t3(int64_t3 p0) { return abs(p0); }
-// CHECK: define noundef <4 x i64> @
+// CHECK: define internal noundef <4 x i64> @
 // CHECK: call <4 x i64> @llvm.abs.v4i64(
 int64_t4 test_abs_int64_t4(int64_t4 p0) { return abs(p0); }
-// CHECK: define noundef double @
+// CHECK: define internal noundef double @
 // CHECK: call double @llvm.fabs.f64(
 double test_abs_double(double p0) { return abs(p0); }
-// CHECK: define noundef <2 x double> @
+// CHECK: define internal noundef <2 x double> @
 // CHECK: call <2 x double> @llvm.fabs.v2f64(
 double2 test_abs_double2(double2 p0) { return abs(p0); }
-// CHECK: define noundef <3 x double> @
+// CHECK: define internal noundef <3 x double> @
 // CHECK: call <3 x double> @llvm.fabs.v3f64(
 double3 test_abs_double3(double3 p0) { return abs(p0); }
-// CHECK: define noundef <4 x double> @
+// CHECK: define internal noundef <4 x double> @
 // CHECK: call <4 x double> @llvm.fabs.v4f64(
 double4 test_abs_double4(double4 p0) { return abs(p0); }
diff --git a/clang/test/CodeGenHLSL/builtins/all.hlsl b/clang/test/CodeGenHLSL/builtins/all.hlsl
index b48daa287480f..8437199a7da52 100644
--- a/clang/test/CodeGenHLSL/builtins/all.hlsl
+++ b/clang/test/CodeGenHLSL/builtins/all.hlsl
@@ -14,59 +14,59 @@
 // RUN:   -o - | FileCheck %s --check-prefixes=CHECK,DXIL_NO_HALF,DXIL_CHECK
 
 #ifdef __HLSL_ENABLE_16_BIT
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_int16_t(int16_t p0) { return all(p0); }
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v2i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v2i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_int16_t2(int16_t2 p0) { return all(p0); }
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v3i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v3i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_int16_t3(int16_t3 p0) { return all(p0); }
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v4i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v4i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_int16_t4(int16_t4 p0) { return all(p0); }
 
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_uint16_t(uint16_t p0) { return all(p0); }
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v2i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v2i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_uint16_t2(uint16_t2 p0) { return all(p0); }
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v3i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v3i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_uint16_t3(uint16_t3 p0) { return all(p0); }
-// DXIL_NATIVE_HALF: define noundef i1 @
-// SPIR_NATIVE_HALF: define spir_func noundef i1 @
+// DXIL_NATIVE_HALF: define internal noundef i1 @
+// SPIR_NATIVE_HALF: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v4i16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v4i16
 // NATIVE_HALF: ret i1 %hlsl.all
 bool test_all_uint16_t4(uint16_t4 p0) { return all(p0); }
 #endif // __HLSL_ENABLE_16_BIT
 
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.f16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.f16
 // DXIL_NO_HALF: %hlsl.all = call i1 @llvm.dx.all.f32
@@ -74,8 +74,8 @@ bool test_all_uint16_t4(uint16_t4 p0) { return all(p0); }
 // CHECK: ret i1 %hlsl.all
 bool test_all_half(half p0) { return all(p0); }
 
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v2f16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v2f16
 // DXIL_NO_HALF: %hlsl.all = call i1 @llvm.dx.all.v2f32
@@ -83,8 +83,8 @@ bool test_all_half(half p0) { return all(p0); }
 // CHECK: ret i1 %hlsl.all
 bool test_all_half2(half2 p0) { return all(p0); }
 
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v3f16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v3f16
 // DXIL_NO_HALF: %hlsl.all = call i1 @llvm.dx.all.v3f32
@@ -92,8 +92,8 @@ bool test_all_half2(half2 p0) { return all(p0); }
 // CHECK: ret i1 %hlsl.all
 bool test_all_half3(half3 p0) { return all(p0); }
 
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_NATIVE_HALF: %hlsl.all = call i1 @llvm.dx.all.v4f16
 // SPIR_NATIVE_HALF: %hlsl.all = call i1 @llvm.spv.all.v4f16
 // DXIL_NO_HALF: %hlsl.all = call i1 @llvm.dx.all.v4f32
@@ -101,176 +101,176 @@ bool test_all_half3(half3 p0) { return all(p0); }
 // CHECK: ret i1 %hlsl.all
 bool test_all_half4(half4 p0) { return all(p0); }
 
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_CHECK: %hlsl.all = call i1 @llvm.dx.all.f32
 // SPIR_CHECK: %hlsl.all = call i1 @llvm.spv.all.f32
 // CHECK: ret i1 %hlsl.all
 bool test_all_float(float p0) { return all(p0); }
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_CHECK: %hlsl.all = call i1 @llvm.dx.all.v2f32
 // SPIR_CHECK: %hlsl.all = call i1 @llvm.spv.all.v2f32
 // CHECK: ret i1 %hlsl.all
 bool test_all_float2(float2 p0) { return all(p0); }
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_CHECK: %hlsl.all = call i1 @llvm.dx.all.v3f32
 // SPIR_CHECK: %hlsl.all = call i1 @llvm.spv.all.v3f32
 // CHECK: ret i1 %hlsl.all
 bool test_all_float3(float3 p0) { return all(p0); }
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_func noundef i1 @
+// DXIL_CHECK: define internal noundef i1 @
+// SPIR_CHECK: define internal spir_func noundef i1 @
 // DXIL_CHECK: %hlsl.all = call i1 @llvm.dx.all.v4f32
 // SPIR_CHECK: %hlsl.all = call i1 @llvm.spv.all.v4f32
 // CHECK: ret i1 %hlsl.all
 bool test_all_float4(float4 p0) { return all(p0); }
 
-// DXIL_CHECK: define noundef i1 @
-// SPIR_CHECK: define spir_fun...
[truncated]
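
The linkage rules from the PR description can be sketched as a small standalone model. This is only an illustration: `FunctionInfo`, `semaLinkage`, and `codegenLinkage` are hypothetical names and are not part of Clang. The `emitFunctionProlog` hunk shown above checks only `HLSLShaderAttr` and `isDefined()`, so the `export` case here is taken from the description (rule 2), not from the diff.

```cpp
#include <cassert>

// Toy model only: these types are hypothetical and are not Clang's API.
enum class Linkage { External, Internal };

struct FunctionInfo {
  bool isEntryPoint;  // shader entry point function (rule 1)
  bool isExported;    // marked with the `export` keyword (rule 2)
  bool hasDefinition; // a body exists in the translation unit (rule 3)
};

// What Sema reports under this PR: every global HLSL function stays
// external, because the cached linkage cannot change once a definition
// is found later in the translation unit.
constexpr Linkage semaLinkage(const FunctionInfo &) {
  return Linkage::External;
}

// Final linkage chosen during CodeGen: internal unless one of the three
// exceptions from the PR description applies.
constexpr Linkage codegenLinkage(const FunctionInfo &F) {
  return (F.isEntryPoint || F.isExported || !F.hasDefinition)
             ? Linkage::External
             : Linkage::Internal;
}

// A plain helper with a body: external in Sema, internal after CodeGen.
static_assert(semaLinkage({false, false, true}) == Linkage::External, "");
static_assert(codegenLinkage({false, false, true}) == Linkage::Internal, "");
```

For a defined helper the two functions disagree. That mismatch is what the updated `define internal` check lines in the tests reflect, and it is the gap between Sema and CodeGen that the review discussion below focuses on.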

@efriedma-quic
Collaborator

efriedma-quic commented Jun 9, 2024

I'm not sure messing with the linkage this way will interact well with various C++ features. In particular, there are various places that check the "language linkage", which you're not modifying: from the perspective of anything outside of codegen, these functions are just external. So other parts of the code won't be aware of the correct linkage. Off the top of my head, this affects mangling, the linkage of template functions, and various warnings.

I'm not sure how much we can do in terms of actually solving this, given the way clang is architected, but we should try to come up with some sort of plan going forward.


Entry point functions & ``static`` keyword
------------------------------------------
Marking a shader entry point function ``static`` will result in an error.
Contributor


Is there a test for this already?

Member Author


Yes:

// expected-warning@+1 {{'shader' attribute only applies to global functions}}
[shader("vertex")]
static void oops() {}

Contributor


Cool. Is it significant that these are warnings, while the rst file says errors?

@hekota hekota marked this pull request as draft June 18, 2024 19:29
@hekota hekota closed this Jul 30, 2024
Labels
clang:codegen
clang:frontend (Language frontend issues, e.g. anything involving "Sema")
clang (Clang issues not falling into any other category)
HLSL (HLSL Language Support)
Development

Successfully merging this pull request may close these issues.

[HLSL] Default linkage of HLSL function should be internal
5 participants