[llvm/llvm-project][Coroutines] Split buildCoroutineFrame #2

TylerNowicki · 2024-09-03T17:08:36Z

Split buildCoroutineFrame into code related to normalization and code
related to actually building the coroutine frame.
This will enable future specialization of buildCoroutineFrame for
different ABIs.

See RFC for more info: https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057

…llvm#94981) This extends default argument deduction to cover class templates as well, applying only to partial ordering, adding to the provisional wording introduced in llvm#89807. This solves some ambuguity introduced in P0522 regarding how template template parameters are partially ordered, and should reduce the negative impact of enabling `-frelaxed-template-template-args` by default. Given the following example: ```C++ template <class T1, class T2 = float> struct A; template <class T3> struct B; template <template <class T4> class TT1, class T5> struct B<TT1<T5>>; // #1 template <class T6, class T7> struct B<A<T6, T7>>; // #2 template struct B<A<int>>; ``` Prior to P0522, `#2` was picked. Afterwards, this became ambiguous. This patch restores the pre-P0522 behavior, `#2` is picked again.

… 'operator->' in the current instantiation (llvm#104458) Currently, clang erroneously rejects the following: ``` struct A { template<typename T> void f(); }; template<typename T> struct B { void g() { (*this)->template f<int>(); // error: no member named 'f' in 'B<T>' } A* operator->(); }; ``` This happens because `Sema::ActOnStartCXXMemberReference` does not adjust the `ObjectType` parameter when `ObjectType` is a dependent type (except when the type is a `PointerType` and the class member access is the `->` form). Since the (possibly adjusted) `ObjectType` parameter (`B<T>` in the above example) is passed to `Parser::ParseOptionalCXXScopeSpecifier`, we end up looking up `f` in `B` rather than `A`. This patch fixes the issue by identifying cases where the type of the object expression `T` is a dependent, non-pointer type and: - `T` is the current instantiation and lookup for `operator->` finds a member of the current instantiation, or - `T` has at least one dependent base case, and `operator->` is not found in the current instantiation and using `ASTContext::DependentTy` as the type of the object expression when the optional _nested-name-specifier_ is parsed. Fixes llvm#104268.

…llvm#107572) As reported in llvm#103714 (comment). CSA crashes on trying to bind value to symbolic region with `void *`. This happens when such region gets passed as inline asm input and engine tries to bind `UnknownVal` to that region. Fix it by changing type from void to char before calling `GetElementZeroRegion`

…lvm#107780) In the case of `--wasm64` we were setting the type of the init expression to be 64-bit but were only setting the low 32-bits of the value (by assigning to Int32). Fixes: emscripten-core/emscripten#22538

These packaged are imported by LLVMConfig.cmake and so we should be passing through the necessary variables from the parent build into the subbuilds. We use `CMAKE_CACHE_DEFAULT_ARGS` so subbuilds can override these variables if needed.

…in map (llvm#107735) Currently, `ValueIdToValueInfoMap` [1] stores `std::tuple<ValueInfo, GlobalValue::GUID /* original GUID */, GlobalValue::GUID /* real GUID*/ >`. This change updates the stored value type to `std::pair<ValueInfo, GlobalValue::GUID /* original GUID */>`, and reads real GUID from ValueInfo. When an entry is inserted into `ValueIdToValueInfoMap`, ValueInfo is created or inserted using real GUID [2]. ValueInfo keeps a pointer to GlobalValueMap [3], using either `GUID` or `{GUID, Name}` [4] when reading per-module summaries to create a combined summary. [1] owned by per module-summary bitcode reader https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L947-L950 [2] [first](https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L7130-L7133), [second](https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L7221-L7222), [third](https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L7622-L7623) [3] https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1427-L1431 [4] https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1631 and https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1621 --------- Co-authored-by: Kazu Hirata <[email protected]>

The function returns an instance of FunctionSummary populated by calculateCallGraphRoot regardless of whether Edges is empty or not.

Fixes llvm#107739

…ive (llvm#107586) The previous assert was not considering programs with semantic errors. Fixes llvm#107495 Fixes llvm#93437

Fixes llvm#54773

This is an extension of CUDA Fortran. The iso_c_binding intrinsic can accept a `TYPE(c_devptr)` as its first argument. This patch relax the semantic check to accept it and update the lowering to unwrap the cptr field from the c_devptr.

DXC prefers dimension-preserving conversions over precision-losing conversions. This means a double4 -> float4 conversion is preferred over a double4 -> double3 or double4 -> double conversion.

This patch fixes: llvm/lib/Target/ARM/MCTargetDesc/ARMBaseInfo.h:214:5: error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]

Implement support for HLSL intrinsic select. This would close issue llvm#75377

Partially addresses: llvm#89888

Relax the condition to ignore the case when count is 0. This fixes a bug on llvm@381e9d2. This was reported at https://discourse.llvm.org/t/coverage-from-multiple-test-executables/81024/.

Right now we're bailing out too early, and `-cuid` does not get set for the host-only compilations.

llvm#107862)

@kazutakahirata

…bal resolution in ELF" (llvm#107792) Fix the use-after-free bug and re-apply llvm#106193 * Without the fix, the string referenced by `objSym.Name` could be destroyed even if string saver keeps a copy of the referenced string. This caused use-after-free. * The fix ([latest commit](llvm@9776ed4)) updates `objSym.Name` to reference (via `StringRef`) the string saver's copy. Test: 1. For `lld/test/ELF/lto/asmundef.ll`, its test failure is reproducible with `-DLLVM_USE_SANITIZER=Address` and gone with the fix. 3. Run all tests by following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild#try-local-changes. * Without the fix, `ELF/lto/asmundef.ll` aborted the multi-stage test at `@@@BUILD_STEP stage2/asan_ubsan check@@@`, defined [here](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh#L30) * With the fix, the [multi-stage test](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh) pass stage2 {asan, ubsan, masan}. This is also the test used by https://lab.llvm.org/buildbot/#/builders/169 **Original commit message** `StringMap<T>` creates a [copy of the string](https://github.com/llvm/llvm-project/blob/d4c519e7b2ac21350ec08b23eda44bf4a2d3c974/llvm/include/llvm/ADT/StringMapEntry.h#L55-L58) for entry insertions and intentionally keep copies [since the implementation optimizes string memory usage](https://github.com/llvm/llvm-project/blob/d4c519e7b2ac21350ec08b23eda44bf4a2d3c974/llvm/include/llvm/ADT/StringMap.h#L124). On the other hand, linker keeps copies of symbol names [1] in `lld::elf::parseFiles` [2] before invoking `compileBitcodeFiles` [3]. This change proposes to optimize away string copies inside [LTO::GlobalResolutions](https://github.com/llvm/llvm-project/blob/24e791b4164986a1ca7776e3ae0292ef20d20c47/llvm/include/llvm/LTO/LTO.h#L409), which will make LTO indexing more memory efficient for ELF. There are similar opportunities for other (COFF, wasm, MachO) formats. The optimization takes place for lld (ELF) only. For the rest of use cases (gold plugin, `llvm-lto2`, etc), LTO owns a string saver to keep copies and use global resolution key for de-duplication. Together with @kazutakahirata's work to make `ComputeCrossModuleImport` more memory efficient, we see a ~20% peak memory usage reduction in a binary where peak memory usage needs to go down. Thanks to the optimization in llvm@329ba52, the max (as opposed to the sum) of `ComputeCrossModuleImport` or `GlobalResolution` shows up in peak memory usage. * Regarding correctness, the set of [resolved](https://github.com/llvm/llvm-project/blob/80c47ad3aec9d7f22e1b1bdc88960a91b66f89f1/llvm/lib/LTO/LTO.cpp#L739) [per-module symbols](https://github.com/llvm/llvm-project/blob/80c47ad3aec9d7f22e1b1bdc88960a91b66f89f1/llvm/include/llvm/LTO/LTO.h#L188-L191) is a subset of [llvm::lto::InputFile::Symbols](https://github.com/llvm/llvm-project/blob/80c47ad3aec9d7f22e1b1bdc88960a91b66f89f1/llvm/include/llvm/LTO/LTO.h#L120). And bitcode symbol parsing saves symbol name when iterating `obj->symbols` in `BitcodeFile::parse` already. This change updates `BitcodeFile::parseLazy` to keep copies of per-module undefined symbols. * Presumably the undefined symbols in a LTO unit (copied in this patch in linker unique saver) is a small set compared with the set of symbols in global-resolution (copied before this patch), making this a worthwhile trade-off. Benchmarking this change alone shows measurable memory savings across various benchmarks. [1] ELF https://github.com/llvm/llvm-project/blob/1cea5c2138bef3d8fec75508df6dbb858e6e3560/lld/ELF/InputFiles.cpp#L1748 [2] https://github.com/llvm/llvm-project/blob/ef7b18a53c0d186dcda1e322be6035407fdedb55/lld/ELF/Driver.cpp#L2863 [3] https://github.com/llvm/llvm-project/blob/ef7b18a53c0d186dcda1e322be6035407fdedb55/lld/ELF/Driver.cpp#L2995

This patch adds caching of file attributes during directory iteration on Windows. This improves the performance when working with files being iterated on in a directory.

When this file was first contributed - `28b01c59c93d ([hexagon] Add {hvx,}hexagon_{protos,circ_brev...}, 2021-06-30)` - I incorrectly included a QuIC copyright statement with "All rights reserved". I should have contributed this file with the `Apache+LLVM exception` license.

Remove unused imports from python files in the MLGO library.

…d from different modules (llvm#104512) Summary: Because AST loading code is lazy and happens in unpredictable order it could happen that function and lambda inside function can be loaded from different modules. In this case, captured DeclRefExpr won’t match the corresponding VarDecl inside function. In AST it looks like this: ``` FunctionDecl 0x555564f4aff0 <Conv.h:33:1, line:41:1> line:33:35 imported in ./thrift_cpp2_base.h hidden tryTo 'Expected<Tgt, const char *> ()' inline |-also in ./folly-conv.h `-CompoundStmt 0x555564f7cfc8 <col:43, line:41:1> |-DeclStmt 0x555564f7ced8 <line:34:3, col:17> | `-VarDecl 0x555564f7cef8 <col:3, col:16> col:7 imported in ./thrift_cpp2_base.h hidden referenced result 'Tgt' cinit | `-IntegerLiteral 0x555564f7d080 <col:16> 'int' 0 |-CallExpr 0x555564f7cea8 <line:39:3, col:76> '<dependent type>' | |-UnresolvedLookupExpr 0x555564f7bea0 <col:3, col:19> '<overloaded function type>' lvalue (no ADL) = 'then_' 0x555564f7bef0 | |-CXXTemporaryObjectExpr 0x555564f7bcb0 <col:25, col:45> 'Expected<bool, int>':'folly::Expected<bool, int>' 'void () noexcept' zeroing | `-LambdaExpr 0x555564f7bc88 <col:48, col:75> '(lambda at Conv.h:39:48)' | |-CXXRecordDecl 0x555564f76b88 <col:48> col:48 imported in ./folly-conv.h hidden implicit <undeserialized declarations> class definition | | |-also in ./thrift_cpp2_base.h | | `-DefinitionData lambda empty standard_layout trivially_copyable literal can_const_default_init | | |-DefaultConstructor defaulted_is_constexpr | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveConstructor exists simple trivial needs_implicit | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveAssignment | | `-Destructor simple irrelevant trivial constexpr needs_implicit | `-CompoundStmt 0x555564f7d1a8 <col:58, col:75> | `-ReturnStmt 0x555564f7d198 <col:60, col:67> | `-DeclRefExpr 0x555564f7d0a0 <col:67> 'Tgt' lvalue Var 0x555564f7d0c8 'result' 'Tgt' refers_to_enclosing_variable_or_capture `-ReturnStmt 0x555564f7bc78 <line:40:3, col:11> `-InitListExpr 0x555564f7bc38 <col:10, col:11> 'void' ``` This diff changes AST deserialization to load lambdas inside canonical function declaration earlier right after the function to make sure that their canonical decl is loaded from the same module. Test Plan: check-clang

llvm#108037)

…vm#90074) llvm#108037 (llvm#108047) The previous `attempt to fix [CGData][MachineOutliner] Global Outlining (llvm#90074) llvm#108037` was incomplete because the `ImmutableModuleSummaryIndexWrapperPass` is now optional for the MachineOutliner pass. With this fix, the test file `CodeGen/AArch64/O3-pipeline.ll` shows no changes compared to its state before `[CGData][MachineOutliner] Global Outlining (llvm#90074)`. Co-authored-by: Kyungwoo Lee <[email protected]>

The previous `Fix for Attempt to fix [CGData][MachineOutliner] Global Outlining (llvm#90074) llvm#108037 (llvm#108047)` somehow dropped this file.

Scalar FP calling convention has gotten more complicated with recent changes to Zfinx/Zdinx, proposed addition of a GPRF16 register class, and using customReg for f16/bf16 and other FP types small than XLen. The previous code tried to share a single getReg and getMem call for many different cases. This patch separates all the FP register handling to the top of the function with their own getReg calls. The only exception is f64 with XLen==32, when we are out of FPRs or not able to use FPRs due to ABI. The way I've structured this, we no longer need to correct the LocVT for FP back to ValVT before the call to getMem.

) Breaks bots, see llvm#105822. Reverts llvm#105822

…providers within lldb (llvm#102708) This PR adds a statistics provider cache, which allows an individual target to keep a rolling tally of it's total time and number of invocations for a given summary provider. This information is then available in statistics dump to help slow summary providers, and gleam more into insight into LLDB's time use.

Address the following issue: ``` ❯ ninja libc.test.src.__support.OSUtil.linux.vdso_test.__unit__ [91/127] Building CXX object libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o FAILED: libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o sccache /usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_20_0_0_git -D_DEBUG -I/home/schrodingerzy/Documents/llvm-project/libc -isystem /home/schrodingerzy/Documents/llvm-project/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g -std=gnu++17 -fpie -DLIBC_FULL_BUILD -ffreestanding -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -MD -MT libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -MF libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o.d -o libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -c /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp:21: In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/UnitTest/ErrnoSetterMatcher.h:13: In file included from /home/schrodingerzy/Documents/llvm-project/libc/src/__support/FPUtil/fpbits_str.h:12: In file included from /home/schrodingerzy/Documents/llvm-project/libc/src/__support/CPP/string.h:20: /home/schrodingerzy/Documents/llvm-project/build/libc/include/stdlib.h:13:10: fatal error: 'llvm-libc-types/locale_t.h' file not found 13 | #include "llvm-libc-types/locale_t.h" | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. [123/127] Building CXX object libc/test/UnitTest/CMakeFiles/LibcTest.unit.dir/LibcTestMain.cpp.o ninja: build stopped: subcommand failed. ```

…vm#107918) Sort the list of calls such that those with the same stack ids are also sorted by function. This allows processing of all matching calls (that can share a context node) in bulk as they are all adjacent. This has 2 benefits: 1. It reduces unnecessary work, specifically the handling to intersect the context ids with those along the graph edges for the stack ids, for calls that we know can share a node. 2. It simplifies detecting when we have matching stack ids but don't need to duplicate context ids. Specifically, we were previously still duplicating context ids whenever we saw another call with the same stack ids, but that isn't necessary if they will share a context node. With this change we now only duplicate context ids if we see some that not only have the same ids but also are in different functions. This change reduced the amount of context id duplication and provided reductions in both both peak memory (~8%) and time (~%5) for a large target.

llvm#107856) Similarly to operator<(), equality-comparing iterators from different ranges must really be forbidden. The preconditions for being able to do `it1 < it2` and `it1 != it2` (or `it1 == it2` for the matter) ought to be the same. Thus, there's little sense in keeping explicit base object comparison in operator==() whilst having this is a precondition in operator<() and operator-() (e.g. used for std::distance() and such).

The `@llvm.dx.typedBufferStore` intrinsic is lowered to `@dx.op.bufferStore`. Pull Request: llvm#104253

Hit Assertion failed: Num < NumOperands && "Invalid child # of SDNode!" Fix by checking opcode and value type before calling getOperand.

…#108066) This reverts commit b0d2411. Reverting because the original commit misses case of copysign from a constant.

The CoroSplit pass has it's own salvageDebugInfo implementation and it's DIExpressions do not get folded. Add a call to DIExpression::foldConstantMath in the CoroSplit pass to reduce the size of those DIExpressions. [The compile time tracker shows no significant increase in compile time either.](https://llvm-compile-time-tracker.com/compare.php?from=bdf02249e7f8f95177ff58c881caf219699acb98&to=e1c1c1759c06bc4c42f79eebdb0e3cd45219cef4&stat=instructions:u) rdar://134675402

This is newly used as of 0f52545. The bulk of the targets were added earlier in 9bb5556.

…v.s/d aliases. We were missing test coverage for fneg.d/fabs.d for Zdinx. When I added it revealed it only worked on RV64. The assembler was not creating a GPRPair register class on RV32 so the alias couldn't match. The disassembler was also not using GPRPair registers preventing the aliases from printing in disassembly too. I've fixed the assembler by adding new parsing methods in an attempt to get decent diagnostics. This is hard since the mnemonics are ambiguous between D and Zdinx. Tests have been adjusted for some differences in what errors are reported first.

…ers (llvm#105905) Refactoring `stackTrace` to perform frame look ups in a more on-demand fashion to improve overall performance. Additionally adding additional information to the `exceptionInfo` request to report exception stacks there instead of merging the exception stack into the stack trace. The `exceptionInfo` request is only called if a stop event occurs with `reason='exception'`, which should mitigate the performance of `SBThread::GetCurrentException` calls. Adding unit tests for exception handling and stack trace supporting.

Adds target codegen info class for DirectX. For now it always translates `__hlsl_resource_t` handle to `target("dx.TypedBuffer", i32, 1, 0, 1)` (`RWBuffer<int>`). More work is needed to determine the actual target exp type and parameters based on the resource handle attributes. Part 1/2 of llvm#95952

* Move code related to spilling into SpillUtils to help cleanup CoroFrame See RFC for more info: https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057

* Split buildCoroutineFrame into code related to normalization and code related to actually building the coroutine frame. * This will enable future specialization of buildCoroutineFrame for different ABIs.

When SPARC Asan testing is enabled by PR llvm#107405, many Linux/sparc64 tests just hang like ``` #0 0xf7ae8e90 in syscall () from /usr/lib32/libc.so.6 #1 0x701065e8 in __sanitizer::FutexWait(__sanitizer::atomic_uint32_t*, unsigned int) () at compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp:766 #2 0x70107c90 in Wait () at compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:35 #3 0x700f7cac in Lock () at compiler-rt/lib/asan/../sanitizer_common/sanitizer_mutex.h:196 #4 Lock () at compiler-rt/lib/asan/../sanitizer_common/sanitizer_thread_registry.h:98 #5 LockThreads () at compiler-rt/lib/asan/asan_thread.cpp:489 #6 0x700e9c8c in __asan::BeforeFork() () at compiler-rt/lib/asan/asan_posix.cpp:157 #7 0xf7ac83f4 in ?? () from /usr/lib32/libc.so.6 Backtrace stopped: previous frame identical to this frame (corrupt stack?) ``` It turns out that this happens in tests using `internal_fork` (e.g. invoking `llvm-symbolizer`): unlike most other Linux targets, which use `clone`, Linux/sparc64 has to use `__fork` instead. While `clone` doesn't trigger `pthread_atfork` handlers, `__fork` obviously does, causing the hang. To avoid this, this patch disables `InstallAtForkHandler` and lets the ASan tests run to completion. Tested on `sparc64-unknown-linux-gnu`.

…ap (llvm#108825) This attempts to improve user-experience when LLDB stops on a verbose_trap. Currently if a `__builtin_verbose_trap` triggers, we display the first frame above the call to the verbose_trap. So in the newly added test case, we would've previously stopped here: ``` (lldb) run Process 28095 launched: '/Users/michaelbuch/a.out' (arm64) Process 28095 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = Bounds error: out-of-bounds access frame #1: 0x0000000100003f5c a.out`std::__1::vector<int>::operator[](this=0x000000016fdfebef size=0, (null)=10) at verbose_trap.cpp:6:9 3 template <typename T> 4 struct vector { 5 void operator[](unsigned) { -> 6 __builtin_verbose_trap("Bounds error", "out-of-bounds access"); 7 } 8 }; ``` After this patch, we would stop in the first non-`std` frame: ``` (lldb) run Process 27843 launched: '/Users/michaelbuch/a.out' (arm64) Process 27843 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = Bounds error: out-of-bounds access frame #2: 0x0000000100003f44 a.out`g() at verbose_trap.cpp:14:5 11 12 void g() { 13 std::vector<int> v; -> 14 v[10]; 15 } 16 ``` rdar://134490328

…ext is not fully initialized (llvm#110481) As this comment around target initialization implies: ``` // This can be NULL if we don't know anything about the architecture or if // the target for an architecture isn't enabled in the llvm/clang that we // built ``` There are cases where we might fail to call `InitBuiltinTypes` when creating the backing `ASTContext` for a `TypeSystemClang`. If that happens, the builtins `QualType`s, e.g., `VoidPtrTy`/`IntTy`/etc., are not initialized and dereferencing them as we do in `GetBuiltinTypeForEncodingAndBitSize` (and other places) will lead to nullptr-dereferences. Example backtrace: ``` (lldb) run Assertion failed: (!isNull() && "Cannot retrieve a NULL type pointer"), function getCommonPtr, file Type.h, line 958. Process 2680 stopped * thread llvm#15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert frame #4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) + liblldb.20.0.0git.dylib`DWARFASTParserClang::ParseObjCMethod(lldb_private::ObjCLanguage::MethodName const&, lldb_private::plugin::dwarf::DWARFDIE const&, lldb_private::CompilerType, ParsedDWARFTypeAttributes , bool) (.cold.1): -> 0x10cdf3cdc <+0>: stp x29, x30, [sp, #-0x10]! 0x10cdf3ce0 <+4>: mov x29, sp 0x10cdf3ce4 <+8>: adrp x0, 545 0x10cdf3ce8 <+12>: add x0, x0, #0xa25 ; "ParseObjCMethod" Target 0: (lldb) stopped. (lldb) bt * thread llvm#15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert frame #0: 0x0000000180d08600 libsystem_kernel.dylib`__pthread_kill + 8 frame #1: 0x0000000180d40f50 libsystem_pthread.dylib`pthread_kill + 288 frame #2: 0x0000000180c4d908 libsystem_c.dylib`abort + 128 frame #3: 0x0000000180c4cc1c libsystem_c.dylib`__assert_rtn + 284 * frame #4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) + frame #5: 0x0000000109d30acc liblldb.20.0.0git.dylib`lldb_private::TypeSystemClang::GetBuiltinTypeForEncodingAndBitSize(lldb::Encoding, unsigned long) + 1188 frame #6: 0x0000000109aaaed4 liblldb.20.0.0git.dylib`DynamicLoaderMacOS::NotifyBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long long, unsigned long long) + 384 ``` This patch adds a one-time user-visible warning for when we fail to initialize the AST to indicate that initialization went wrong for the given target. Additionally, we add checks for whether one of the `ASTContext` `QualType`s is invalid before dereferencing any builtin types. The warning would look as follows: ``` (lldb) target create "a.out" Current executable set to 'a.out' (arm64). (lldb) b main warning: Failed to initialize builtin ASTContext types for target 'some-unknown-triple'. Printing variables may behave unexpectedly. Breakpoint 1: where = a.out`main + 8 at stepping.cpp:5:14, address = 0x0000000100003f90 ``` rdar://134869779

Fixes llvm#102703. https://godbolt.org/z/nfj8xsb1Y The following pattern: ``` %2 = and i32 %0, 254 %3 = icmp eq i32 %2, 0 ``` is optimised by instcombine into: ```%3 = icmp ult i32 %0, 2``` However, post instcombine leads to worse aarch64 than the unoptimised version. Pre instcombine: ``` tst w0, #0xfe cset w0, eq ret ``` Post instcombine: ``` and w8, w0, #0xff cmp w8, #2 cset w0, lo ret ``` In the unoptimised version, SelectionDAG converts `SETCC (AND X 254) 0 EQ` into `CSEL 0 1 1 (ANDS X 254)`, which gets emitted as a `tst`. In the optimised version, SelectionDAG converts `SETCC (AND X 255) 2 ULT` into `CSEL 0 1 2 (SUBS (AND X 255) 2)`, which gets emitted as an `and`/`cmp`. This PR adds an optimisation to `AArch64ISelLowering`, converting `SETCC (AND X Y) Z ULT` into `SETCC (AND X (Y & ~(Z - 1))) 0 EQ` when `Z` is a power of two. This makes SelectionDAG/Codegen produce the same optimised code for both examples.

…1409)" This reverts commit a89e016. This is being reverted because it broke the test: Unwind/trap_frame_sym_ctx.test /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input CHECK: frame #2: {{.*}}`main

TylerNowicki mentioned this pull request Sep 3, 2024

[Coroutines] Move the SuspendCrossingInfo analysis helper into its own header/source llvm/llvm-project#106306

Merged

TylerNowicki force-pushed the users/tylernowicki/coro-refactor2 branch from 4ccc804 to ddd817e Compare September 3, 2024 21:18

TylerNowicki force-pushed the users/tylernowicki/coro-refactor3 branch from e33a727 to b81ec81 Compare September 3, 2024 21:24

TylerNowicki force-pushed the users/tylernowicki/coro-refactor2 branch from ddd817e to ac80795 Compare September 4, 2024 16:07

TylerNowicki force-pushed the users/tylernowicki/coro-refactor3 branch from b81ec81 to d71a039 Compare September 4, 2024 16:18

TylerNowicki force-pushed the users/tylernowicki/coro-refactor2 branch 2 times, most recently from a615f91 to 20a99fb Compare September 9, 2024 16:12

pskrgag and others added 21 commits September 9, 2024 18:12

[gn build] Port ea2da57

95753ff

[LTO] Simplify calculateCallGraphRoot (NFC) (llvm#107765)

c36c462

The function returns an instance of FunctionSummary populated by calculateCallGraphRoot regardless of whether Edges is empty or not.

[ARM] Add a default unreachable case to AddrModeToString. NFC

95831f0

Fixes llvm#107739

[flang][OpenMP] Don't abort when default is used on an invalid direct…

7f90479

…ive (llvm#107586) The previous assert was not considering programs with semantic errors. Fixes llvm#107495 Fixes llvm#93437

[llvm-ml] Fix RIP-relative addressing for ptr operands (llvm#107618)

7543d09

Fixes llvm#54773

Fix implicit conversion rank ordering (llvm#106811)

6cc0138

DXC prefers dimension-preserving conversions over precision-losing conversions. This means a double4 -> float4 conversion is preferred over a double4 -> double3 or double4 -> double conversion.

[ARM] Fix a warning

34e3007

This patch fixes: llvm/lib/Target/ARM/MCTargetDesc/ARMBaseInfo.h:214:5: error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]

[HLSL] Implement support for HLSL intrinsic - select (llvm#107129)

0f349b7

Implement support for HLSL intrinsic select. This would close issue llvm#75377

[flang][Driver] Support -Xlinker in flang (llvm#107472)

5f74671

Partially addresses: llvm#89888

[Coverage] Ignore unused functions if the count is 0. (llvm#107661)

6850410

Relax the condition to ignore the case when count is 0. This fixes a bug on llvm@381e9d2. This was reported at https://discourse.llvm.org/t/coverage-from-multiple-test-executables/81024/.

[CUDA/HIP] propagate -cuid to a host-only compilation. (llvm#107483)

4a501a4

Right now we're bailing out too early, and `-cuid` does not get set for the host-only compilations.

[libc][bazel] Update bazel overlay for math functions and their tests. (

2773719

llvm#107862)

[libc++] Cache file attributes during directory iteration (llvm#93316)

b1b9b7b

This patch adds caching of file attributes during directory iteration on Windows. This improves the performance when working with files being iterated on in a directory.

[MLGO] Remove unused imports

02fff93

Remove unused imports from python files in the MLGO library.

dmpolukhin and others added 21 commits September 10, 2024 16:15

Attempt to fix [CGData][MachineOutliner] Global Outlining (llvm#90074) (

bf68403

llvm#108037)

Fix for llvm/test/CodeGen/RISCV/O3-pipeline.ll (llvm#108050)

2cfdcfb

The previous `Fix for Attempt to fix [CGData][MachineOutliner] Global Outlining (llvm#90074) llvm#108037 (llvm#108047)` somehow dropped this file.

Revert "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (llvm#108054

c7a7767

) Breaks bots, see llvm#105822. Reverts llvm#105822

[DirectX] Lower @llvm.dx.typedBufferStore to DXIL ops

90e8411

The `@llvm.dx.typedBufferStore` intrinsic is lowered to `@dx.op.bufferStore`. Pull Request: llvm#104253

[WebAssembly] Add a colon to type error message (llvm#107980)

c8ed2b8

[PowerPC] Fix assert exposed by PR 95931 in LowerBITCAST (llvm#108062)

22067a8

Hit Assertion failed: Num < NumOperands && "Invalid child # of SDNode!" Fix by checking opcode and value type before calling getOperand.

Revert "[NVPTX] Support copysign PTX instruction (llvm#107800)" (llvm…

02c943a

…#108066) This reverts commit b0d2411. Reverting because the original commit misses case of copysign from a constant.

[bazel] Add CGData targets/deps (llvm#108070)

feeb6aa

This is newly used as of 0f52545. The bulk of the targets were added earlier in 9bb5556.

[Coroutines] Move spill related methods to a Spill utils (llvm#107884)

f4e2d7b

* Move code related to spilling into SpillUtils to help cleanup CoroFrame See RFC for more info: https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057

[Coroutines] Split buildCoroutineFrame

34559ad

* Split buildCoroutineFrame into code related to normalization and code related to actually building the coroutine frame. * This will enable future specialization of buildCoroutineFrame for different ABIs.

TylerNowicki force-pushed the users/tylernowicki/coro-refactor3 branch from d71a039 to 34559ad Compare September 10, 2024 19:47

TylerNowicki deleted the branch users/tylernowicki/coro-refactor2 September 10, 2024 19:48

TylerNowicki closed this Sep 10, 2024

TylerNowicki deleted the users/tylernowicki/coro-refactor3 branch September 11, 2024 14:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[llvm/llvm-project][Coroutines] Split buildCoroutineFrame #2

[llvm/llvm-project][Coroutines] Split buildCoroutineFrame #2

TylerNowicki commented Sep 3, 2024 •

edited

Loading

[llvm/llvm-project][Coroutines] Split buildCoroutineFrame #2

[llvm/llvm-project][Coroutines] Split buildCoroutineFrame #2

Conversation

TylerNowicki commented Sep 3, 2024 • edited Loading

TylerNowicki commented Sep 3, 2024 •

edited

Loading