Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm/llvm-project][Coroutines] Split buildCoroutineFrame #2

Conversation

TylerNowicki
Copy link
Owner

@TylerNowicki TylerNowicki commented Sep 3, 2024

  • Split buildCoroutineFrame into code related to normalization and code
    related to actually building the coroutine frame.
  • This will enable future specialization of buildCoroutineFrame for
    different ABIs.

See RFC for more info: https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057

@TylerNowicki TylerNowicki force-pushed the users/tylernowicki/coro-refactor2 branch from 4ccc804 to ddd817e Compare September 3, 2024 21:18
@TylerNowicki TylerNowicki force-pushed the users/tylernowicki/coro-refactor3 branch from e33a727 to b81ec81 Compare September 3, 2024 21:24
@TylerNowicki TylerNowicki force-pushed the users/tylernowicki/coro-refactor2 branch from ddd817e to ac80795 Compare September 4, 2024 16:07
@TylerNowicki TylerNowicki force-pushed the users/tylernowicki/coro-refactor3 branch from b81ec81 to d71a039 Compare September 4, 2024 16:18
TylerNowicki pushed a commit that referenced this pull request Sep 9, 2024
…llvm#94981)

This extends default argument deduction to cover class templates as
well, applying only to partial ordering, adding to the provisional
wording introduced in llvm#89807.

This solves some ambuguity introduced in P0522 regarding how template
template parameters are partially ordered, and should reduce the
negative impact of enabling `-frelaxed-template-template-args` by
default.

Given the following example:
```C++
template <class T1, class T2 = float> struct A;
template <class T3> struct B;

template <template <class T4> class TT1, class T5> struct B<TT1<T5>>;   // #1
template <class T6, class T7>                      struct B<A<T6, T7>>; // #2

template struct B<A<int>>;
```
Prior to P0522, `#2` was picked. Afterwards, this became ambiguous. This
patch restores the pre-P0522 behavior, `#2` is picked again.
… 'operator->' in the current instantiation (llvm#104458)

Currently, clang erroneously rejects the following:
```
struct A
{
    template<typename T>
    void f();
};

template<typename T>
struct B
{
    void g()
    {
        (*this)->template f<int>(); // error: no member named 'f' in 'B<T>'
    }

    A* operator->();
};
```

This happens because `Sema::ActOnStartCXXMemberReference` does not adjust the `ObjectType` parameter when `ObjectType` is a dependent type (except when the type is a `PointerType` and the class member access is the `->` form). Since the (possibly adjusted) `ObjectType` parameter (`B<T>` in the above example) is passed to `Parser::ParseOptionalCXXScopeSpecifier`, we end up looking up `f` in `B` rather than `A`. 

This patch fixes the issue by identifying cases where the type of the object expression `T` is a dependent, non-pointer type and:
- `T` is the current instantiation and lookup for `operator->` finds a member of the current instantiation, or
- `T` has at least one dependent base case, and `operator->` is not found in the current instantiation

and using `ASTContext::DependentTy` as the type of the object expression when the optional _nested-name-specifier_ is parsed.

Fixes llvm#104268.
@TylerNowicki TylerNowicki force-pushed the users/tylernowicki/coro-refactor2 branch 2 times, most recently from a615f91 to 20a99fb Compare September 9, 2024 16:12
pskrgag and others added 21 commits September 9, 2024 18:12
…llvm#107572)

As reported in
llvm#103714 (comment).
CSA crashes on trying to bind value to symbolic region with `void *`.
This happens when such region gets passed as inline asm input and engine
tries to bind `UnknownVal` to that region.

Fix it by changing type from void to char before calling
`GetElementZeroRegion`
…lvm#107780)

In the case of `--wasm64` we were setting the type of the init expression
to be 64-bit but were only setting the low 32-bits of the value (by
assigning to Int32).

Fixes: emscripten-core/emscripten#22538
These packaged are imported by LLVMConfig.cmake and so we should be
passing through the necessary variables from the parent build into the
subbuilds.

We use `CMAKE_CACHE_DEFAULT_ARGS` so subbuilds can override these
variables if needed.
…in map (llvm#107735)

Currently, `ValueIdToValueInfoMap` [1] stores `std::tuple<ValueInfo,
GlobalValue::GUID /* original GUID */, GlobalValue::GUID /* real GUID*/
>`. This change updates the stored value type to `std::pair<ValueInfo,
GlobalValue::GUID /* original GUID */>`, and reads real GUID from
ValueInfo.

When an entry is inserted into `ValueIdToValueInfoMap`, ValueInfo is
created or inserted using real GUID [2]. ValueInfo keeps a pointer to
GlobalValueMap [3], using either `GUID` or `{GUID, Name}` [4] when
reading per-module summaries to create a combined summary.

[1] owned by per module-summary bitcode reader
https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L947-L950
[2]
[first](https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L7130-L7133),
[second](https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L7221-L7222),
[third](https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L7622-L7623)
[3]
https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1427-L1431
[4]
https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1631
and
https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1621

---------

Co-authored-by: Kazu Hirata <[email protected]>
The function returns an instance of FunctionSummary populated by
calculateCallGraphRoot regardless of whether Edges is empty or not.
…ive (llvm#107586)

The previous assert was not considering programs with semantic errors.

Fixes llvm#107495
Fixes llvm#93437
This is an extension of CUDA Fortran. The iso_c_binding intrinsic can
accept a `TYPE(c_devptr)` as its first argument. This patch relax the
semantic check to accept it and update the lowering to unwrap the cptr
field from the c_devptr.
DXC prefers dimension-preserving conversions over precision-losing
conversions. This means a double4 -> float4 conversion is preferred over
a double4 -> double3 or double4 -> double conversion.
This patch fixes:

  llvm/lib/Target/ARM/MCTargetDesc/ARMBaseInfo.h:214:5: error: default
  label in switch which covers all enumeration values
  [-Werror,-Wcovered-switch-default]
Implement support for HLSL intrinsic select.
This would close issue llvm#75377
Right now we're bailing out too early, and `-cuid` does not get set for
the host-only compilations.
…bal resolution in ELF" (llvm#107792)

Fix the use-after-free bug and re-apply
llvm#106193
* Without the fix, the string referenced by `objSym.Name` could be
destroyed even if string saver keeps a copy of the referenced string.
This caused use-after-free.
* The fix ([latest
commit](llvm@9776ed4))
updates `objSym.Name` to reference (via `StringRef`) the string saver's
copy.

Test:
1. For `lld/test/ELF/lto/asmundef.ll`, its test failure is reproducible
with `-DLLVM_USE_SANITIZER=Address` and gone with the fix.
3. Run all tests by following
https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild#try-local-changes.
* Without the fix, `ELF/lto/asmundef.ll` aborted the multi-stage test at
`@@@BUILD_STEP stage2/asan_ubsan check@@@`, defined
[here](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh#L30)
* With the fix, the [multi-stage
test](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh)
pass stage2 {asan, ubsan, masan}. This is also the test used by
https://lab.llvm.org/buildbot/#/builders/169


**Original commit message**

`StringMap<T>` creates a [copy of the
string](https://github.com/llvm/llvm-project/blob/d4c519e7b2ac21350ec08b23eda44bf4a2d3c974/llvm/include/llvm/ADT/StringMapEntry.h#L55-L58)
for entry insertions and intentionally keep copies [since the
implementation optimizes string memory
usage](https://github.com/llvm/llvm-project/blob/d4c519e7b2ac21350ec08b23eda44bf4a2d3c974/llvm/include/llvm/ADT/StringMap.h#L124).
On the other hand, linker keeps copies of symbol names [1] in
`lld::elf::parseFiles` [2] before invoking `compileBitcodeFiles` [3].

This change proposes to optimize away string copies inside
[LTO::GlobalResolutions](https://github.com/llvm/llvm-project/blob/24e791b4164986a1ca7776e3ae0292ef20d20c47/llvm/include/llvm/LTO/LTO.h#L409),
which will make LTO indexing more memory efficient for ELF. There are
similar opportunities for other (COFF, wasm, MachO) formats.

The optimization takes place for lld (ELF) only. For the rest of use
cases (gold plugin, `llvm-lto2`, etc), LTO owns a string saver to keep
copies and use global resolution key for de-duplication.

Together with @kazutakahirata's work to make `ComputeCrossModuleImport`
more memory efficient, we see a ~20% peak memory usage reduction in a
binary where peak memory usage needs to go down. Thanks to the
optimization in
llvm@329ba52,
the max (as opposed to the sum) of `ComputeCrossModuleImport` or
`GlobalResolution` shows up in peak memory usage.
* Regarding correctness, the set of
[resolved](https://github.com/llvm/llvm-project/blob/80c47ad3aec9d7f22e1b1bdc88960a91b66f89f1/llvm/lib/LTO/LTO.cpp#L739)
[per-module
symbols](https://github.com/llvm/llvm-project/blob/80c47ad3aec9d7f22e1b1bdc88960a91b66f89f1/llvm/include/llvm/LTO/LTO.h#L188-L191)
is a subset of
[llvm::lto::InputFile::Symbols](https://github.com/llvm/llvm-project/blob/80c47ad3aec9d7f22e1b1bdc88960a91b66f89f1/llvm/include/llvm/LTO/LTO.h#L120).
And bitcode symbol parsing saves symbol name when iterating
`obj->symbols` in `BitcodeFile::parse` already. This change updates
`BitcodeFile::parseLazy` to keep copies of per-module undefined symbols.
* Presumably the undefined symbols in a LTO unit (copied in this patch
in linker unique saver) is a small set compared with the set of symbols
in global-resolution (copied before this patch), making this a
worthwhile trade-off. Benchmarking this change alone shows measurable
memory savings across various benchmarks.

[1] ELF
https://github.com/llvm/llvm-project/blob/1cea5c2138bef3d8fec75508df6dbb858e6e3560/lld/ELF/InputFiles.cpp#L1748
[2]
https://github.com/llvm/llvm-project/blob/ef7b18a53c0d186dcda1e322be6035407fdedb55/lld/ELF/Driver.cpp#L2863
[3]
https://github.com/llvm/llvm-project/blob/ef7b18a53c0d186dcda1e322be6035407fdedb55/lld/ELF/Driver.cpp#L2995
This patch adds caching of file attributes during directory iteration
on Windows. This improves the performance when working with files being
iterated on in a directory.
When this file was first contributed - `28b01c59c93d ([hexagon] Add
{hvx,}hexagon_{protos,circ_brev...}, 2021-06-30)` - I incorrectly
included a QuIC copyright statement with "All rights reserved". I should
have contributed this file with the `Apache+LLVM exception` license.
Remove unused imports from python files in the MLGO library.
dmpolukhin and others added 21 commits September 10, 2024 16:15
…d from different modules (llvm#104512)

Summary:
Because AST loading code is lazy and happens in unpredictable order it
could happen that function and lambda inside function can be loaded from
different modules. In this case, captured DeclRefExpr won’t match the
corresponding VarDecl inside function. In AST it looks like this:
```
FunctionDecl 0x555564f4aff0 <Conv.h:33:1, line:41:1> line:33:35 imported in ./thrift_cpp2_base.h hidden tryTo 'Expected<Tgt, const char *> ()' inline
|-also in ./folly-conv.h
`-CompoundStmt 0x555564f7cfc8 <col:43, line:41:1>
  |-DeclStmt 0x555564f7ced8 <line:34:3, col:17>
  | `-VarDecl 0x555564f7cef8 <col:3, col:16> col:7 imported in ./thrift_cpp2_base.h hidden referenced result 'Tgt' cinit
  |   `-IntegerLiteral 0x555564f7d080 <col:16> 'int' 0
  |-CallExpr 0x555564f7cea8 <line:39:3, col:76> '<dependent type>'
  | |-UnresolvedLookupExpr 0x555564f7bea0 <col:3, col:19> '<overloaded function type>' lvalue (no ADL) = 'then_' 0x555564f7bef0
  | |-CXXTemporaryObjectExpr 0x555564f7bcb0 <col:25, col:45> 'Expected<bool, int>':'folly::Expected<bool, int>' 'void () noexcept' zeroing
  | `-LambdaExpr 0x555564f7bc88 <col:48, col:75> '(lambda at Conv.h:39:48)'
  |   |-CXXRecordDecl 0x555564f76b88 <col:48> col:48 imported in ./folly-conv.h hidden implicit <undeserialized declarations> class definition
  |   | |-also in ./thrift_cpp2_base.h
  |   | `-DefinitionData lambda empty standard_layout trivially_copyable literal can_const_default_init
  |   |   |-DefaultConstructor defaulted_is_constexpr
  |   |   |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
  |   |   |-MoveConstructor exists simple trivial needs_implicit
  |   |   |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
  |   |   |-MoveAssignment
  |   |   `-Destructor simple irrelevant trivial constexpr needs_implicit
  |   `-CompoundStmt 0x555564f7d1a8 <col:58, col:75>
  |     `-ReturnStmt 0x555564f7d198 <col:60, col:67>
  |       `-DeclRefExpr 0x555564f7d0a0 <col:67> 'Tgt' lvalue Var 0x555564f7d0c8 'result' 'Tgt' refers_to_enclosing_variable_or_capture
  `-ReturnStmt 0x555564f7bc78 <line:40:3, col:11>
    `-InitListExpr 0x555564f7bc38 <col:10, col:11> 'void'
```
This diff changes AST deserialization to load lambdas inside canonical
function declaration earlier right after the function to make sure that
their canonical decl is loaded from the same module.

Test Plan: check-clang
…vm#90074) llvm#108037 (llvm#108047)

The previous `attempt to fix [CGData][MachineOutliner] Global Outlining
(llvm#90074) llvm#108037` was incomplete because the
`ImmutableModuleSummaryIndexWrapperPass` is now optional for the
MachineOutliner pass.

With this fix, the test file `CodeGen/AArch64/O3-pipeline.ll` shows no
changes compared to its state before `[CGData][MachineOutliner] Global
Outlining (llvm#90074)`.

Co-authored-by: Kyungwoo Lee <[email protected]>
The previous `Fix for Attempt to fix [CGData][MachineOutliner] Global
Outlining (llvm#90074) llvm#108037 (llvm#108047)` somehow dropped this file.
Scalar FP calling convention has gotten more complicated with recent
changes to Zfinx/Zdinx, proposed addition of a GPRF16 register class,
and using customReg for f16/bf16 and other FP types small than XLen.

The previous code tried to share a single getReg and getMem call for
many different cases. This patch separates all the FP register handling
to the top of the function with their own getReg calls. The only
exception is f64 with XLen==32, when we are out of FPRs or not able to
use FPRs due to ABI.

The way I've structured this, we no longer need to correct the LocVT for
FP back to ValVT before the call to getMem.
…providers within lldb (llvm#102708)

This PR adds a statistics provider cache, which allows an individual
target to keep a rolling tally of it's total time and number of
invocations for a given summary provider. This information is then
available in statistics dump to help slow summary providers, and gleam
more into insight into LLDB's time use.
Address the following issue:
```
❯ ninja libc.test.src.__support.OSUtil.linux.vdso_test.__unit__
[91/127] Building CXX object libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o
FAILED: libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o 
sccache /usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_20_0_0_git -D_DEBUG -I/home/schrodingerzy/Documents/llvm-project/libc -isystem /home/schrodingerzy/Documents/llvm-project/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g -std=gnu++17 -fpie -DLIBC_FULL_BUILD -ffreestanding -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -MD -MT libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -MF libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o.d -o libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -c /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp
In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp:21:
In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/UnitTest/ErrnoSetterMatcher.h:13:
In file included from /home/schrodingerzy/Documents/llvm-project/libc/src/__support/FPUtil/fpbits_str.h:12:
In file included from /home/schrodingerzy/Documents/llvm-project/libc/src/__support/CPP/string.h:20:
/home/schrodingerzy/Documents/llvm-project/build/libc/include/stdlib.h:13:10: fatal error: 'llvm-libc-types/locale_t.h' file not found
   13 | #include "llvm-libc-types/locale_t.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
[123/127] Building CXX object libc/test/UnitTest/CMakeFiles/LibcTest.unit.dir/LibcTestMain.cpp.o
ninja: build stopped: subcommand failed.
```
…vm#107918)

Sort the list of calls such that those with the same stack ids are also
sorted by function. This allows processing of all matching calls (that
can share a context node) in bulk as they are all adjacent.

This has 2 benefits:
1. It reduces unnecessary work, specifically the handling to intersect
   the context ids with those along the graph edges for the stack ids,
   for calls that we know can share a node.
2. It simplifies detecting when we have matching stack ids but don't
   need to duplicate context ids. Specifically, we were previously
   still duplicating context ids whenever we saw another call with the
   same stack ids, but that isn't necessary if they will share a context
   node. With this change we now only duplicate context ids if we see
   some that not only have the same ids but also are in different
   functions.

This change reduced the amount of context id duplication and provided
reductions in both both peak memory (~8%) and time (~%5) for a large
target.
llvm#107856)

Similarly to operator<(), equality-comparing iterators from different
ranges must really be forbidden. The preconditions for being able to do
`it1 < it2` and `it1 != it2` (or `it1 == it2` for the matter) ought to
be the same. Thus, there's little sense in keeping explicit base object
comparison in operator==() whilst having this is a precondition in
operator<() and operator-() (e.g. used for std::distance() and such).
The `@llvm.dx.typedBufferStore` intrinsic is lowered to `@dx.op.bufferStore`.

Pull Request: llvm#104253
Hit Assertion failed: Num < NumOperands && "Invalid child # of SDNode!" 
Fix by checking opcode and value type before calling getOperand.
…#108066)

This reverts commit b0d2411.

Reverting because the original commit misses case of copysign from a
constant.
The CoroSplit pass has it's own salvageDebugInfo implementation and it's
DIExpressions do not get folded. Add a call to
DIExpression::foldConstantMath in the CoroSplit pass to reduce the size
of those DIExpressions.

[The compile time tracker shows no significant increase in compile time
either.](https://llvm-compile-time-tracker.com/compare.php?from=bdf02249e7f8f95177ff58c881caf219699acb98&to=e1c1c1759c06bc4c42f79eebdb0e3cd45219cef4&stat=instructions:u)

rdar://134675402
This is newly used as of 0f52545.

The bulk of the targets were added earlier in
9bb5556.
…v.s/d aliases.

We were missing test coverage for fneg.d/fabs.d for Zdinx. When I
added it revealed it only worked on RV64. The assembler was not
creating a GPRPair register class on RV32 so the alias couldn't match.
The disassembler was also not using GPRPair registers preventing the
aliases from printing in disassembly too.

I've fixed the assembler by adding new parsing methods in an attempt
to get decent diagnostics. This is hard since the mnemonics are
ambiguous between D and Zdinx. Tests have been adjusted for some
differences in what errors are reported first.
…ers (llvm#105905)

Refactoring `stackTrace` to perform frame look ups in a more on-demand
fashion to improve overall performance.

Additionally adding additional information to the `exceptionInfo`
request to report exception stacks there instead of merging the
exception stack into the stack trace. The `exceptionInfo` request is
only called if a stop event occurs with `reason='exception'`, which
should mitigate the performance of `SBThread::GetCurrentException`
calls.

Adding unit tests for exception handling and stack trace supporting.
Adds target codegen info class for DirectX. For now it always translates
`__hlsl_resource_t` handle to `target("dx.TypedBuffer", i32, 1, 0, 1)`
(`RWBuffer<int>`). More work is needed to determine the actual target
exp type and parameters based on the resource handle attributes.

Part 1/2 of llvm#95952
* Move code related to spilling into SpillUtils to help cleanup
CoroFrame

See RFC for more info:
https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
* Split buildCoroutineFrame into code related to normalization and code
  related to actually building the coroutine frame.
* This will enable future specialization of buildCoroutineFrame for
  different ABIs.
@TylerNowicki TylerNowicki force-pushed the users/tylernowicki/coro-refactor3 branch from d71a039 to 34559ad Compare September 10, 2024 19:47
@TylerNowicki TylerNowicki deleted the branch users/tylernowicki/coro-refactor2 September 10, 2024 19:48
@TylerNowicki TylerNowicki deleted the users/tylernowicki/coro-refactor3 branch September 11, 2024 14:38
TylerNowicki pushed a commit that referenced this pull request Sep 17, 2024
When SPARC Asan testing is enabled by PR llvm#107405, many Linux/sparc64
tests just hang like
```
#0  0xf7ae8e90 in syscall () from /usr/lib32/libc.so.6
#1  0x701065e8 in __sanitizer::FutexWait(__sanitizer::atomic_uint32_t*, unsigned int) ()
    at compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp:766
#2  0x70107c90 in Wait ()
    at compiler-rt/lib/sanitizer_common/sanitizer_mutex.cpp:35
#3  0x700f7cac in Lock ()
    at compiler-rt/lib/asan/../sanitizer_common/sanitizer_mutex.h:196
#4  Lock ()
    at compiler-rt/lib/asan/../sanitizer_common/sanitizer_thread_registry.h:98
#5  LockThreads ()
    at compiler-rt/lib/asan/asan_thread.cpp:489
#6  0x700e9c8c in __asan::BeforeFork() ()
    at compiler-rt/lib/asan/asan_posix.cpp:157
#7  0xf7ac83f4 in ?? () from /usr/lib32/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
```
It turns out that this happens in tests using `internal_fork` (e.g.
invoking `llvm-symbolizer`): unlike most other Linux targets, which use
`clone`, Linux/sparc64 has to use `__fork` instead. While `clone`
doesn't trigger `pthread_atfork` handlers, `__fork` obviously does,
causing the hang.

To avoid this, this patch disables `InstallAtForkHandler` and lets the
ASan tests run to completion.

Tested on `sparc64-unknown-linux-gnu`.
TylerNowicki pushed a commit that referenced this pull request Sep 19, 2024
…ap (llvm#108825)

This attempts to improve user-experience when LLDB stops on a
verbose_trap. Currently if a `__builtin_verbose_trap` triggers, we
display the first frame above the call to the verbose_trap. So in the
newly added test case, we would've previously stopped here:
```
(lldb) run
Process 28095 launched: '/Users/michaelbuch/a.out' (arm64)
Process 28095 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = Bounds error: out-of-bounds access
    frame #1: 0x0000000100003f5c a.out`std::__1::vector<int>::operator[](this=0x000000016fdfebef size=0, (null)=10) at verbose_trap.cpp:6:9
   3    template <typename T>
   4    struct vector {
   5        void operator[](unsigned) {
-> 6            __builtin_verbose_trap("Bounds error", "out-of-bounds access");
   7        }
   8    };
```

After this patch, we would stop in the first non-`std` frame:
```
(lldb) run
Process 27843 launched: '/Users/michaelbuch/a.out' (arm64)
Process 27843 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = Bounds error: out-of-bounds access
    frame #2: 0x0000000100003f44 a.out`g() at verbose_trap.cpp:14:5
   11  
   12   void g() {
   13       std::vector<int> v;
-> 14       v[10];
   15   }
   16  
```

rdar://134490328
TylerNowicki pushed a commit that referenced this pull request Oct 3, 2024
…ext is not fully initialized (llvm#110481)

As this comment around target initialization implies:
```
  // This can be NULL if we don't know anything about the architecture or if
  // the target for an architecture isn't enabled in the llvm/clang that we
  // built
```

There are cases where we might fail to call `InitBuiltinTypes` when
creating the backing `ASTContext` for a `TypeSystemClang`. If that
happens, the builtins `QualType`s, e.g., `VoidPtrTy`/`IntTy`/etc., are
not initialized and dereferencing them as we do in
`GetBuiltinTypeForEncodingAndBitSize` (and other places) will lead to
nullptr-dereferences. Example backtrace:
```
(lldb) run
Assertion failed: (!isNull() && "Cannot retrieve a NULL type pointer"), function getCommonPtr, file Type.h, line 958.
Process 2680 stopped
* thread llvm#15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert
    frame #4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) + 
liblldb.20.0.0git.dylib`DWARFASTParserClang::ParseObjCMethod(lldb_private::ObjCLanguage::MethodName const&, lldb_private::plugin::dwarf::DWARFDIE const&, lldb_private::CompilerType, ParsedDWARFTypeAttributes
, bool) (.cold.1):
->  0x10cdf3cdc <+0>:  stp    x29, x30, [sp, #-0x10]!
    0x10cdf3ce0 <+4>:  mov    x29, sp
    0x10cdf3ce4 <+8>:  adrp   x0, 545
    0x10cdf3ce8 <+12>: add    x0, x0, #0xa25 ; "ParseObjCMethod"
Target 0: (lldb) stopped.
(lldb) bt
* thread llvm#15, name = '<lldb.process.internal-state(pid=2712)>', stop reason = hit program assert
    frame #0: 0x0000000180d08600 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000180d40f50 libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x0000000180c4d908 libsystem_c.dylib`abort + 128
    frame #3: 0x0000000180c4cc1c libsystem_c.dylib`__assert_rtn + 284
  * frame #4: 0x000000010cdf3cdc liblldb.20.0.0git.dylib`DWARFASTParserClang::ExtractIntFromFormValue(lldb_private::CompilerType const&, lldb_private::plugin::dwarf::DWARFFormValue const&) const (.cold.1) + 
    frame #5: 0x0000000109d30acc liblldb.20.0.0git.dylib`lldb_private::TypeSystemClang::GetBuiltinTypeForEncodingAndBitSize(lldb::Encoding, unsigned long) + 1188
    frame #6: 0x0000000109aaaed4 liblldb.20.0.0git.dylib`DynamicLoaderMacOS::NotifyBreakpointHit(void*, lldb_private::StoppointCallbackContext*, unsigned long long, unsigned long long) + 384
```

This patch adds a one-time user-visible warning for when we fail to
initialize the AST to indicate that initialization went wrong for the
given target. Additionally, we add checks for whether one of the
`ASTContext` `QualType`s is invalid before dereferencing any builtin
types.

The warning would look as follows:
```
(lldb) target create "a.out"
Current executable set to 'a.out' (arm64).
(lldb) b main
warning: Failed to initialize builtin ASTContext types for target 'some-unknown-triple'. Printing variables may behave unexpectedly.
Breakpoint 1: where = a.out`main + 8 at stepping.cpp:5:14, address = 0x0000000100003f90
```

rdar://134869779
TylerNowicki pushed a commit that referenced this pull request Oct 8, 2024
Fixes llvm#102703.

https://godbolt.org/z/nfj8xsb1Y

The following pattern:

```
%2 = and i32 %0, 254
%3 = icmp eq i32 %2, 0
```
is optimised by instcombine into:

```%3 = icmp ult i32 %0, 2```

However, post instcombine leads to worse aarch64 than the unoptimised version.

Pre instcombine:
```
        tst     w0, #0xfe
        cset    w0, eq
        ret
```
Post instcombine:
```
        and     w8, w0, #0xff
        cmp     w8, #2
        cset    w0, lo
        ret
```


In the unoptimised version, SelectionDAG converts `SETCC (AND X 254) 0 EQ` into `CSEL 0 1 1 (ANDS X 254)`, which gets emitted as a `tst`.

In the optimised version, SelectionDAG converts `SETCC (AND X 255) 2 ULT` into `CSEL 0 1 2 (SUBS (AND X 255) 2)`, which gets emitted as an `and`/`cmp`.

This PR adds an optimisation to `AArch64ISelLowering`, converting `SETCC (AND X Y) Z ULT` into `SETCC (AND X (Y & ~(Z - 1))) 0 EQ` when `Z` is a power of two. This makes SelectionDAG/Codegen produce the same optimised code for both examples.
TylerNowicki pushed a commit that referenced this pull request Oct 15, 2024
…1409)"

This reverts commit a89e016.

This is being reverted because it broke the test:

Unwind/trap_frame_sym_ctx.test

/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input
 CHECK: frame #2: {{.*}}`main
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.