[llvm/llvm-project][Coroutines] Split buildCoroutineFrame #2

Commits on Sep 9, 2024

  1. [Clang][Sema] Use the correct lookup context when building overloaded 'operator->' in the current instantiation (llvm#104458)
    
    Currently, clang erroneously rejects the following:
    ```
    struct A
    {
        template<typename T>
        void f();
    };
    
    template<typename T>
    struct B
    {
        void g()
        {
            (*this)->template f<int>(); // error: no member named 'f' in 'B<T>'
        }
    
        A* operator->();
    };
    ```
    
    This happens because `Sema::ActOnStartCXXMemberReference` does not adjust the `ObjectType` parameter when `ObjectType` is a dependent type (except when the type is a `PointerType` and the class member access is the `->` form). Since the (possibly adjusted) `ObjectType` parameter (`B<T>` in the above example) is passed to `Parser::ParseOptionalCXXScopeSpecifier`, we end up looking up `f` in `B` rather than `A`. 
    
    This patch fixes the issue by identifying cases where the type of the object expression `T` is a dependent, non-pointer type and:
    - `T` is the current instantiation and lookup for `operator->` finds a member of the current instantiation, or
    - `T` has at least one dependent base class, and `operator->` is not found in the current instantiation
    
    and using `ASTContext::DependentTy` as the type of the object expression when the optional _nested-name-specifier_ is parsed.
    
    Fixes llvm#104268.
    sdkrystian authored Sep 9, 2024
    3cdb30e
  2. [analyzer] fix crash on binding to symbolic region with void * type (llvm#107572)
    
    As reported in
    llvm#103714 (comment),
    CSA crashes when trying to bind a value to a symbolic region of type `void *`.
    This happens when such a region is passed as an inline asm input and the engine
    tries to bind `UnknownVal` to that region.
    
    Fix it by changing the type from void to char before calling
    `GetElementZeroRegion`.
    pskrgag authored Sep 9, 2024
    db6051d
  3. [gn build] Port ea2da57

    llvmgnsyncbot committed Sep 9, 2024
    95753ff
  4. [lld][WebAssembly] Fix use of uninitialized stack data with --wasm64 (llvm#107780)
    
    In the case of `--wasm64` we were setting the type of the init expression
    to be 64-bit but were only setting the low 32-bits of the value (by
    assigning to Int32).
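
    A minimal sketch of the failure mode described above, assuming a union-based value similar in spirit to what the message describes. `InitExpr` and `makeConst` are hypothetical names for illustration, not lld's actual types. The point is that the member written must match the declared width, otherwise the upper 32 bits keep whatever was on the stack.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Hypothetical stand-in for an init expression; not lld's real type.
    struct InitExpr {
      bool Is64;
      union {
        int32_t Int32;
        int64_t Int64;
      } Value;
    };

    // Write the union member that matches the declared width. Writing only
    // Int32 while declaring a 64-bit type would leave the high 32 bits of
    // the value uninitialized.
    InitExpr makeConst(bool Is64, int64_t V) {
      InitExpr E;
      E.Is64 = Is64;
      if (Is64)
        E.Value.Int64 = V;
      else
        E.Value.Int32 = static_cast<int32_t>(V);
      return E;
    }
    ```

    A caller reads back `Value.Int64` only when `Is64` is set, and `Value.Int32` otherwise.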
    
    Fixes: emscripten-core/emscripten#22538
    sbc100 authored Sep 9, 2024
    5c8fd1e
  5. [CMake] Passthrough variables for packages to subbuilds (llvm#107611)

    These packages are imported by LLVMConfig.cmake and so we should be
    passing through the necessary variables from the parent build into the
    subbuilds.
    
    We use `CMAKE_CACHE_DEFAULT_ARGS` so subbuilds can override these
    variables if needed.
    petrhosek authored Sep 9, 2024
    60f052e
  6. [NFCI][BitcodeReader] Read real GUID from VI as opposed to storing it in map (llvm#107735)
    
    Currently, `ValueIdToValueInfoMap` [1] stores `std::tuple<ValueInfo,
    GlobalValue::GUID /* original GUID */, GlobalValue::GUID /* real GUID*/
    >`. This change updates the stored value type to `std::pair<ValueInfo,
    GlobalValue::GUID /* original GUID */>`, and reads real GUID from
    ValueInfo.
    
    When an entry is inserted into `ValueIdToValueInfoMap`, ValueInfo is
    created or inserted using real GUID [2]. ValueInfo keeps a pointer to
    GlobalValueMap [3], using either `GUID` or `{GUID, Name}` [4] when
    reading per-module summaries to create a combined summary.
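
    The data-structure change can be sketched as follows. This is a simplified model under stated assumptions: `ValueInfo`, `SummaryMap`, `ValueIdMap`, and `record` here are hypothetical stand-ins, not the BitcodeReader's real definitions. It only shows why the third tuple element is redundant once the value info can report the GUID it was created with.

    ```cpp
    #include <cassert>
    #include <cstdint>
    #include <map>
    #include <utility>

    using GUID = uint64_t;

    // Simplified summary index keyed by the real GUID (payload is a dummy int).
    using SummaryMap = std::map<GUID, int>;

    // Hypothetical stand-in for ValueInfo: it points at a summary-map entry,
    // so the real GUID can be read back from the entry's key.
    struct ValueInfo {
      const SummaryMap::value_type *Entry;
      GUID getGUID() const { return Entry->first; }
    };

    // After the change: only the original GUID is stored next to the ValueInfo.
    using ValueIdMap = std::map<unsigned, std::pair<ValueInfo, GUID>>;

    void record(SummaryMap &Index, ValueIdMap &Ids, unsigned ValueId,
                GUID RealGUID, GUID OriginalGUID) {
      // Create or find the entry using the real GUID.
      auto It = Index.try_emplace(RealGUID, 0).first;
      Ids[ValueId] = std::make_pair(ValueInfo{&*It}, OriginalGUID);
    }
    ```

    Looking up a value id then yields the real GUID through `first.getGUID()` and the original GUID through `second`.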
    
    [1] owned by per module-summary bitcode reader
    https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L947-L950
    [2]
    [first](https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L7130-L7133),
    [second](https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L7221-L7222),
    [third](https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L7622-L7623)
    [3]
    https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1427-L1431
    [4]
    https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1631
    and
    https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1621
    
    ---------
    
    Co-authored-by: Kazu Hirata
    minglotus-6 and kazutakahirata authored Sep 9, 2024
    7d37172
  7. [LTO] Simplify calculateCallGraphRoot (NFC) (llvm#107765)

    The function returns an instance of FunctionSummary populated by
    calculateCallGraphRoot regardless of whether Edges is empty or not.
    kazutakahirata authored Sep 9, 2024
    c36c462
  8. 95831f0
  9. [flang][OpenMP] Don't abort when default is used on an invalid directive (llvm#107586)
    
    The previous assert was not considering programs with semantic errors.
    
    Fixes llvm#107495
    Fixes llvm#93437
    luporl authored Sep 9, 2024
    7f90479
  10. 7543d09
  11. [flang][cuda] Support c_devptr in c_f_pointer intrinsic (llvm#107470)

    This is an extension of CUDA Fortran. The iso_c_binding intrinsic can
    accept a `TYPE(c_devptr)` as its first argument. This patch relaxes the
    semantic check to accept it and updates the lowering to unwrap the cptr
    field from the c_devptr.
    clementval authored Sep 9, 2024
    cd8229b
  12. Fix implicit conversion rank ordering (llvm#106811)

    DXC prefers dimension-preserving conversions over precision-losing
    conversions. This means a double4 -> float4 conversion is preferred over
    a double4 -> double3 or double4 -> double conversion.
    llvm-beanz authored Sep 9, 2024
    6cc0138
  13. [ARM] Fix a warning

    This patch fixes:
    
      llvm/lib/Target/ARM/MCTargetDesc/ARMBaseInfo.h:214:5: error: default
      label in switch which covers all enumeration values
      [-Werror,-Wcovered-switch-default]
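
    For context, this diagnostic fires when a `switch` over an enumeration lists every enumerator and still has a `default:` label. A minimal sketch of the warning-free shape, using a hypothetical `Cond` enum and `condName` function rather than the actual ARM code:

    ```cpp
    #include <cassert>
    #include <cstring>

    // Hypothetical enumeration used only for illustration.
    enum class Cond { EQ, NE, LT };

    // Every enumerator has its own case, so there is no `default:` label.
    // Adding one is what -Wcovered-switch-default reports.
    const char *condName(Cond C) {
      switch (C) {
      case Cond::EQ:
        return "eq";
      case Cond::NE:
        return "ne";
      case Cond::LT:
        return "lt";
      }
      // Reached only if C holds a value outside the enumerators.
      return "invalid";
    }
    ```

    Leaving out the `default:` also lets the compiler flag the switch when a new enumerator is added later.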
    kazutakahirata committed Sep 9, 2024
    34e3007
  14. [HLSL] Implement support for HLSL intrinsic - select (llvm#107129)

    Implement support for HLSL intrinsic select.
    This would close issue llvm#75377
    spall authored Sep 9, 2024
    0f349b7
  15. 5f74671
  16. [Coverage] Ignore unused functions if the count is 0. (llvm#107661)

    Relax the condition to ignore the case when count is 0. 
    
    This fixes a bug on
    llvm@381e9d2.
    This was reported at
    https://discourse.llvm.org/t/coverage-from-multiple-test-executables/81024/.
    ZequanWu authored Sep 9, 2024
    6850410
  17. [CUDA/HIP] propagate -cuid to a host-only compilation. (llvm#107483)

    Right now we're bailing out too early, and `-cuid` does not get set for
    the host-only compilations.
    Artem-B authored Sep 9, 2024
    4a501a4
  18. 2773719
  19. Re-apply "[NFCI][LTO][lld] Optimize away symbol copies within LTO global resolution in ELF" (llvm#107792)
    
    Fix the use-after-free bug and re-apply
    llvm#106193
    * Without the fix, the string referenced by `objSym.Name` could be
    destroyed even if string saver keeps a copy of the referenced string.
    This caused use-after-free.
    * The fix ([latest
    commit](llvm@9776ed4))
    updates `objSym.Name` to reference (via `StringRef`) the string saver's
    copy.
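
    The shape of the fix can be sketched with standard-library types. This is an illustration under assumptions, not the lld code: `Saver`, `Sym`, and `retainName` are hypothetical, and `std::string_view` stands in for `StringRef`. The idea is that the symbol name is re-pointed at storage owned by the saver, so it no longer depends on the lifetime of the original string.

    ```cpp
    #include <cassert>
    #include <deque>
    #include <string>
    #include <string_view>

    // Hypothetical saver: owns copies of strings and hands out stable views.
    class Saver {
      // std::deque keeps references to existing elements valid on emplace_back.
      std::deque<std::string> Storage;

    public:
      std::string_view save(std::string_view S) {
        Storage.emplace_back(S);
        return Storage.back();
      }
    };

    // Hypothetical symbol record holding a non-owning name.
    struct Sym {
      std::string_view Name;
    };

    // Re-point the name at the saver's copy so the view outlives the
    // buffer it originally referred to.
    void retainName(Sym &S, Saver &Sv) { S.Name = Sv.save(S.Name); }
    ```

    Without the call to `retainName`, reading `S.Name` after the original buffer is destroyed would be a use-after-free, even though the saver holds an identical copy.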
    
    Test:
    1. For `lld/test/ELF/lto/asmundef.ll`, its test failure is reproducible
    with `-DLLVM_USE_SANITIZER=Address` and gone with the fix.
    2. Run all tests by following
    https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild#try-local-changes.
    * Without the fix, `ELF/lto/asmundef.ll` aborted the multi-stage test at
    `@@@BUILD_STEP stage2/asan_ubsan check@@@`, defined
    [here](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh#L30)
    * With the fix, the [multi-stage
    test](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh)
    passes stage2 {asan, ubsan, masan}. This is also the test used by
    https://lab.llvm.org/buildbot/#/builders/169
    
    
    **Original commit message**
    
    `StringMap<T>` creates a [copy of the
    string](https://github.com/llvm/llvm-project/blob/d4c519e7b2ac21350ec08b23eda44bf4a2d3c974/llvm/include/llvm/ADT/StringMapEntry.h#L55-L58)
    for entry insertions and intentionally keeps copies [since the
    implementation optimizes string memory
    usage](https://github.com/llvm/llvm-project/blob/d4c519e7b2ac21350ec08b23eda44bf4a2d3c974/llvm/include/llvm/ADT/StringMap.h#L124).
    On the other hand, the linker keeps copies of symbol names [1] in
    `lld::elf::parseFiles` [2] before invoking `compileBitcodeFiles` [3].
    
    This change proposes to optimize away string copies inside
    [LTO::GlobalResolutions](https://github.com/llvm/llvm-project/blob/24e791b4164986a1ca7776e3ae0292ef20d20c47/llvm/include/llvm/LTO/LTO.h#L409),
    which will make LTO indexing more memory efficient for ELF. There are
    similar opportunities for other (COFF, wasm, MachO) formats.
    
    The optimization takes place for lld (ELF) only. For the remaining use
    cases (gold plugin, `llvm-lto2`, etc.), LTO owns a string saver to keep
    copies and uses the global resolution key for de-duplication.
    
    Together with @kazutakahirata's work to make `ComputeCrossModuleImport`
    more memory efficient, we see a ~20% peak memory usage reduction in a
    binary where peak memory usage needs to go down. Thanks to the
    optimization in
    llvm@329ba52,
    the max (as opposed to the sum) of `ComputeCrossModuleImport` or
    `GlobalResolution` shows up in peak memory usage.
    * Regarding correctness, the set of
    [resolved](https://github.com/llvm/llvm-project/blob/80c47ad3aec9d7f22e1b1bdc88960a91b66f89f1/llvm/lib/LTO/LTO.cpp#L739)
    [per-module
    symbols](https://github.com/llvm/llvm-project/blob/80c47ad3aec9d7f22e1b1bdc88960a91b66f89f1/llvm/include/llvm/LTO/LTO.h#L188-L191)
    is a subset of
    [llvm::lto::InputFile::Symbols](https://github.com/llvm/llvm-project/blob/80c47ad3aec9d7f22e1b1bdc88960a91b66f89f1/llvm/include/llvm/LTO/LTO.h#L120).
    And bitcode symbol parsing already saves the symbol name when iterating
    `obj->symbols` in `BitcodeFile::parse`. This change updates
    `BitcodeFile::parseLazy` to keep copies of per-module undefined symbols.
    * Presumably the undefined symbols in an LTO unit (copied in this patch
    in the linker's unique saver) are a small set compared with the set of symbols
    in global-resolution (copied before this patch), making this a
    worthwhile trade-off. Benchmarking this change alone shows measurable
    memory savings across various benchmarks.
    
    [1] ELF
    https://github.com/llvm/llvm-project/blob/1cea5c2138bef3d8fec75508df6dbb858e6e3560/lld/ELF/InputFiles.cpp#L1748
    [2]
    https://github.com/llvm/llvm-project/blob/ef7b18a53c0d186dcda1e322be6035407fdedb55/lld/ELF/Driver.cpp#L2863
    [3]
    https://github.com/llvm/llvm-project/blob/ef7b18a53c0d186dcda1e322be6035407fdedb55/lld/ELF/Driver.cpp#L2995
    minglotus-6 authored Sep 9, 2024
    09b231c
  20. [libc++] Cache file attributes during directory iteration (llvm#93316)

    This patch adds caching of file attributes during directory iteration
    on Windows. This improves the performance when working with files being
    iterated on in a directory.
    ed-sat authored Sep 9, 2024
    b1b9b7b
  21. [clang, hexagon] Update copyright, license text (llvm#107161)

    When this file was first contributed - `28b01c59c93d ([hexagon] Add
    {hvx,}hexagon_{protos,circ_brev...}, 2021-06-30)` - I incorrectly
    included a QuIC copyright statement with "All rights reserved". I should
    have contributed this file with the `Apache+LLVM exception` license.
    androm3da authored Sep 9, 2024
    048e46a
  22. [MLGO] Remove unused imports

    Remove unused imports from python files in the MLGO library.
    boomanaiden154 committed Sep 9, 2024
    02fff93
  23. Revert "[Coverage] Ignore unused functions if the count is 0." (llvm#107901)
    
    Reverts llvm#107661
    
    Breaks llvm-project/llvm/unittests/ProfileData/CoverageMappingTest.cpp
    ZequanWu authored Sep 9, 2024
    a7c26aa
  24. [MLGO] Fix logging verbosity in scripts (llvm#107818)

    This patch fixes issues related to logging verbosity in the MLGO python
    scripts. This was an oversight when converting from absl.logging to the
    python logging API as absl natively supports a --verbosity flag to set
    the desired logging level. This patch adds a flag to support similar
    functionality in Python's logging library and additionally updates
    docstrings where relevant to point to the new values.
    boomanaiden154 authored Sep 9, 2024
    99ea357
  25. [NFC][TableGen] DirectiveEmitter code cleanup (llvm#107775)

    Eliminate unnecessary llvm:: prefix as this code is in llvm namespace. 
    Use ArrayRef<> instead of std::vector references when appropriate. 
    Use .empty() instead of .size() == 0.
    jurahul authored Sep 9, 2024
    78c1009
  26. [SystemZ][z/OS] Enable lit testing for z/OS (llvm#107631)

    This patch fixes various errors to enable llvm-lit to run on z/OS
    abhina-sree authored Sep 9, 2024
    eec1ee8
  27. 6776d65
  28. ab82f83
  29. Revert "[Clang][Sema] Use the correct lookup context when building overloaded 'operator->' in the current instantiation (llvm#104458)"
    
    This reverts commit 3cdb30e.
    
    Breaks clang bootstrap.
    nikic committed Sep 9, 2024
    3681d85
  30. 98815f7
  31. [z/OS] Set the default arch for z/OS to be arch10 (llvm#89854)

    The default arch level on z/OS is arch10. Update the code so z/OS has
    arch10 without changing the default for zLinux.
    perry-ca authored Sep 9, 2024
    e62bf7c
  32. b3d2d50
  33. [TableGen] Migrate CodeGenHWModes to use const RecordKeeper (llvm#107851)
    
    Migrate CodeGenHWModes to use const RecordKeeper and const Record
    pointers.
    
    This is part of an effort to improve const correctness in TableGen
    backends:
    
    https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
    jurahul authored Sep 9, 2024
    985600d
  34. [DirectX] Lower @llvm.dx.typedBufferLoad to DXIL ops

    The `@llvm.dx.typedBufferLoad` intrinsic is lowered to `@dx.op.bufferLoad`.
    There's some complexity here in translating to scalarized IR, which I've
    abstracted out into a function that should be useful for samples, gathers, and
    CBuffer loads.
    
    I've also updated the DXILResources.rst docs to match what I'm doing here and
    the proposal in llvm/wg-hlsl#59. I've removed the content about stores and raw
    buffers for now with the expectation that it will be added along with the work.
    
    Note that this change includes a bit of a hack in how it deals with
    `getOverloadKind` for the `dx.ResRet` types - we need to adjust how we deal
    with operation overloads to generate a table directly rather than proxy through
    the OverloadKind enum, but that's left for a later change here.
    
    Part of llvm#91367
    
    Pull Request: llvm#104252
    bogner authored Sep 9, 2024
    3f22756
  35. [VPlan] Consistently use VTC for vector trip count in vplan-printing.ll.

    The inconsistency surfaced in
    llvm#95305. Split off to reduce
    the diff.
    fhahn committed Sep 9, 2024
    3403438
  36. Reland [asan][windows] Eliminate the static asan runtime on windows (llvm#107899)
    
    This reapplies 8fa66c6 ([asan][windows]
    Eliminate the static asan runtime on windows) for a second time.
    
    That PR bounced off the tests because it caused failures in the other
    sanitizer runtimes. These have been fixed by building only interception,
    sanitizer_common, and asan with /MD, and continuing to build the rest of
    the runtimes with /MT. This does mean that any usage of the static
    ubsan/fuzzer/etc. runtimes mixes different runtime
    library linkages in the same app. The interception, sanitizer_common,
    and asan runtimes are designed for this; however, it does result in some
    linker warnings.
    
    Additionally, it turns out when building in release-mode with
    LLVM_ENABLE_PDBs the build system forced /OPT:ICF. This totally breaks
    asan's "new" method of doing "weak" functions on windows, and so
    /OPT:NOICF was explicitly added to asan's link flags.
    
    ---------
    
    Co-authored-by: Amy Wishnousky
    barcharcraz and amyw-msft authored Sep 9, 2024
    53a81d4
  37. [SandboxIR] Add missing VectorType functions (llvm#107650)

    Fills in many missing functions from VectorType
    Sterling-Augustine authored Sep 9, 2024
    6f8d278
  38. [scudo] Add fragmentation info for each memory group (llvm#107475)

    This information helps with tuning the heuristic of selecting memory
    groups to release the unused pages.
    ChiaHungDuan authored Sep 9, 2024
    d9a9960
  39. [LTO] Fix a use-after-free in legacy LTO C APIs (llvm#107896)

    Fix a bug where `lto_runtime_lib_symbols_list` returns the address
    of a local variable that is freed when it goes out of scope. This
    is a regression from llvm#98512, which rewrote the runtime libcall function
    lists into a SmallVector.
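
    The bug class can be sketched as follows. The names `buildSymbolList` and `symbolList` are hypothetical, `std::vector` stands in for `SmallVector`, and the static-storage approach shown is one possible remedy rather than a description of what this commit actually does.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <cstring>
    #include <vector>

    // Hypothetical helper that returns the list by value.
    static std::vector<const char *> buildSymbolList() {
      return {"memcpy", "memset"};
    }

    // Buggy shape (shown as a comment only): the vector is destroyed when the
    // function returns, so the pointer handed to the caller dangles.
    //
    //   const char *const *list(size_t *N) {
    //     auto V = buildSymbolList();
    //     *N = V.size();
    //     return V.data();
    //   }

    // One possible remedy: keep the storage in a function-local static,
    // which lives until program exit.
    const char *const *symbolList(std::size_t *Size) {
      static const std::vector<const char *> List = buildSymbolList();
      *Size = List.size();
      return List.data();
    }
    ```

    A C caller can then index the returned array for as long as the process runs.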
    
    rdar://135559037
    cachemeifyoucan authored Sep 9, 2024
    66e9078
  40. [SPIRV] Add sign intrinsic part 1 (llvm#101987)

    partially fixes llvm#70078
    
    ### Changes
    - Added `int_spv_sign` intrinsic in `IntrinsicsSPIRV.td`
    - Added lowering and map to `int_spv_sign` in
    `SPIRVInstructionSelector.cpp`
    - Added SPIR-V backend test case in
    `llvm/test/CodeGen/SPIRV/hlsl-intrinsics/sign.ll`
    
    ### Related PRs
    - llvm#101988
    - llvm#101989
    tgymnich authored Sep 9, 2024
    a9a5a18
  41. [TableGen] Change CGIOperandList::OperandInfo::Rec to const pointer (llvm#107858)
    
    Change CGIOperandList::OperandInfo::Rec and CGIOperandList::TheDef to
    const pointer.
    
    This is part of an effort to improve const correctness in TableGen
    backends:
    
    https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
    jurahul authored Sep 9, 2024
    bdf0224
  42. [SandboxVec] Implement Pass class (llvm#107617)

    This patch implements the Pass base class and the FunctionPass sub-class
    that operate on Sandbox IR.
    vporpo authored Sep 9, 2024
    f12e10b
  43. [NVPTX] Restrict combining to properly aligned v16i8 vectors. (llvm#107919)
    
    Fixes generation of invalid loads leading to misaligned access errors.
    The bug got exposed by SLP vectorizer change ec360d6 which allowed SLP
    to produce `v16i8` vectors.
    
    Also updated the tests to use automatic check generator.
    Artem-B authored Sep 9, 2024
    26b786a
  44. d148a1a
  45. [X86] Handle shifts + and in LowerSELECTWithCmpZero

    Shifts are handled the same as `sub`, where rhs == 0 is the identity.
    `and` is the inverted case where:
        `SELECT (AND(X,1) == 0), (AND Y, Z), Y`
            -> `(AND Y, (OR NEG(AND(X, 1)), Z))`
    With -1 as the identity.
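
    The `and` identity above can be checked on plain integers. This is a small sketch with hypothetical function names, modelling the DAG nodes as 32-bit unsigned arithmetic: `NEG(AND(X, 1))` is 0 when the low bit is clear and all ones (the identity for `and`) when it is set.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // SELECT (AND(X,1) == 0), (AND Y, Z), Y
    uint32_t selectForm(uint32_t X, uint32_t Y, uint32_t Z) {
      return ((X & 1u) == 0) ? (Y & Z) : Y;
    }

    // AND Y, (OR NEG(AND(X,1)), Z)
    uint32_t branchlessForm(uint32_t X, uint32_t Y, uint32_t Z) {
      uint32_t Mask = 0u - (X & 1u); // 0 if the bit is clear, all ones if set
      return Y & (Mask | Z);
    }

    // Compare both forms over a handful of sample operands.
    bool formsAgree() {
      const uint32_t Samples[] = {0u, 1u, 0x5Au, 0xFFFFFFFFu};
      for (uint32_t X = 0; X < 4; ++X)
        for (uint32_t Y : Samples)
          for (uint32_t Z : Samples)
            if (selectForm(X, Y, Z) != branchlessForm(X, Y, Z))
              return false;
      return true;
    }
    ```

    When the bit is clear the mask is 0 and the result is `Y & Z`; when it is set the mask is all ones and the result is `Y`.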
    
    Closes llvm#107910
    goldsteinn committed Sep 9, 2024
    88bd507
  46. [PAC] Make __is_function_overridden pauth-aware on ELF platforms (llvm#107498)
    
    Apparently, there are two almost identical implementations: one for
    MachO and another one for ELF. The ELF bits somehow slipped while
    llvm#84573 was reviewed.
    
    The particular implementation is identical to MachO case.
    asl authored Sep 9, 2024
    33c1325
  47. [SandboxIR] Implement UndefValue (llvm#107628)

    This patch implements sandboxir::UndefValue mirroring llvm::UndefValue.
    vporpo authored Sep 9, 2024
    ae02211

Commits on Sep 10, 2024

  1. 81ef8e2
  2. [NVPTX] Support copysign PTX instruction (llvm#107800)

    Lower `fcopysign` SDNodes into `copysign` PTX instructions where
    possible. See [PTX ISA: 9.7.3.2. Floating Point Instructions: copysign](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions-copysign).
    AlexMaclean authored Sep 10, 2024
    b0d2411
  3. [ctx_prof] Insert the ctx prof flattener after the module inliner (llvm#107499)
    
    This patch enables experimenting with the contextual profile. ICP is currently disabled in this case - will reenable it subsequently. Also subsequently the inline cost model / decision making would be updated to be context-aware. Right now, this just achieves "complete use" of the profile, in that it's ingested, maintained, and sunk to a flat profile when not needed anymore.
    
    Issue [llvm#89287](llvm#89287)
    mtrofin authored Sep 10, 2024
    3b22618
  4. [mlir][linalg][NFC] Drop redundant rankReductionStrategy (llvm#107875)

    This patch drops the redundant rankReductionStrategy in
    `populateFoldUnitExtentDimsViaSlicesPatterns` and fixes comment typos.
    CoTinker authored Sep 10, 2024
    f3b4e47
  5. [LoongArch][ISel] Check the number of sign bits in PatGprGpr_32 (llvm#107432)
    
    After llvm#92205, LoongArch ISel
    selects `div.w` for `trunc i64 (sdiv i64 3202030857, (sext i32 X to
    i64)) to i32`. This is incorrect since `3202030857` is not a signed 32-bit
    constant. It produces a wrong result when `X == 2`:
    https://alive2.llvm.org/ce/z/pzfGZZ
    
    This patch adds additional `sexti32` checks to operands of
    `PatGprGpr_32`.
    Alive2 proof: https://alive2.llvm.org/ce/z/AkH5Mp
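
    The miscompile can be reproduced with ordinary integer arithmetic. In this sketch, `wideDivide` follows the IR pattern quoted above, while `narrowDivide` models a 32-bit signed division that only sees the low 32 bits of the constant. Both function names are hypothetical, and the model of `div.w` is an assumption made for illustration.

    ```cpp
    #include <cassert>
    #include <cstdint>

    // trunc i64 (sdiv i64 3202030857, (sext i32 X to i64)) to i32
    // X must be non-zero.
    int32_t wideDivide(int32_t X) {
      int64_t Q = INT64_C(3202030857) / static_cast<int64_t>(X);
      // Keep the low 32 bits, as a trunc would.
      return static_cast<int32_t>(static_cast<uint32_t>(Q));
    }

    // The same division done on 32-bit operands: the constant's low 32 bits
    // read as a signed value are 3202030857 - 2^32 = -1092936439.
    int32_t narrowDivide(int32_t X) {
      const int32_t Low32 =
          static_cast<int32_t>(INT64_C(3202030857) - INT64_C(4294967296));
      return Low32 / X;
    }
    ```

    For `X == 2` the two functions disagree, because the constant does not fit in a signed 32-bit value. Operands that are already sign-extended 32-bit values do not hit this problem, which is what the added `sexti32` checks are described as enforcing.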
    
    Fix llvm#107414.
    dtcxzyw authored Sep 10, 2024
    a111f91
  6. [NFC][TableGen] Simplify DirectiveEmitter using range for loops (llvm#107909)
    
    Make constructors that take const Record * implicit, allowing us to
    simplify some range based loops to use that class instance as the loop
    variable.
    
    Change remaining constructor calls to use () instead of {} to construct
    objects.
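
    A rough sketch of the loop simplification described above, using hypothetical `Record` and `Directive` types rather than the real TableGen classes. Because the wrapper's constructor is not `explicit`, a range-based loop over record pointers can declare the wrapper directly as its loop variable.

    ```cpp
    #include <cassert>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for a TableGen record.
    struct Record {
      std::string Name;
    };

    // Wrapper with an implicit (non-explicit) constructor from const Record *.
    class Directive {
      const Record *Def;

    public:
      Directive(const Record *R) : Def(R) {}
      const std::string &getName() const { return Def->Name; }
    };

    // Join the names of all records, separated by commas.
    std::string joinNames(const std::vector<const Record *> &Records) {
      std::string Out;
      // Each pointer converts to a Directive when the loop variable is
      // initialised, so no separate construction statement is needed.
      for (const Directive D : Records) {
        if (!Out.empty())
          Out += ",";
        Out += D.getName();
      }
      return Out;
    }
    ```

    With an `explicit` constructor the loop would have to iterate over `const Record *` and build the wrapper in a separate statement inside the body.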
    jurahul authored Sep 10, 2024
    f7479b5
  7. e64a1c0
  8. [LoongArch] Codegen for concat_vectors with LASX

    Fixes: llvm#107355
    
    Reviewed By: SixWeining
    
    Pull Request: llvm#107523
    wangleiat authored Sep 10, 2024
    1ca411c
  9. [bazel][libc][NFC] Add missing layering deps (llvm#107947)

    After 2773719
    
    e.g.
    
    ```
    external/llvm-project/libc/test/src/math/smoke/NextTowardTest.h:12:10: error: module llvm-project//libc/test/src/math/smoke:nexttowardf_test does not depend on a module exporting 'src/__support/CPP/bit.h'
    ```
    rupprecht authored Sep 10, 2024
    7a8e9df
  10. [LLVM][Coroutines] Switch CoroAnnotationElidePass to a FunctionPass (llvm#107897)
    
    After landing llvm#99285 we found
    that the call graph update was causing the following crash when
    expensive checks are turned on
    ```
    llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:982: LazyCallGraph::SCC &updateCGAndAnalysisManagerForPass(LazyCallGraph &, LazyCallGraph::SCC &, LazyCallGraph::Node &, CGSCCAnalysisManager &, CGSCCUpdateResult &, FunctionAnalysisManager &, bool): Assertion `(RC == &TargetRC || RC->isAncestorOf(TargetRC)) && "New call edge is not trivial!"' failed.
    ```
    I have to admit I believe that the call graph update process I did for
    that patch could be wrong.
    
    After reading the code in `CGSCCToFunctionPassAdaptor`, I am convinced
    that `CoroAnnotationElidePass` can be a FunctionPass and rely on the
    adaptor to update the call graph for us, so long as we properly
    invalidate the caller's analyses.
    
    After this patch,
    `llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll` no longer
    fails under expensive checks.
    yuxuanchen1997 authored Sep 10, 2024
    761bf33
  11. [Fuzzer] Passthrough zlib CMake paths into the test (llvm#107926)

    We shouldn't assume that we're using a system zlib installation.
    petrhosek authored Sep 10, 2024
    eb0e4b1
  12. [ValueTracking] Infer is-power-of-2 from assumptions. (llvm#107745)

    This patch tries to infer is-power-of-2 from assumptions. I don't see
    that this kind of assumption exists in my dataset.
    Related issue: rust-lang/rust#129795
    
    Close llvm#58996.
    dtcxzyw authored Sep 10, 2024
    Commit ffcff4a
  13. [clang] fix half && bfloat16 convert node expr codegen (llvm#89051)

    Data type conversion between fp16 and bf16 will generate fptrunc and
    fpextend nodes, but they are actually bitcast nodes.
    JinjinLi868 authored Sep 10, 2024
    Commit 56905da
  14. [clang][HLSL] Add sign intrinsic part 3 (llvm#101989)

    partially fixes llvm#70078
    
    ### Changes
    - Implemented `sign` clang builtin
    - Linked `sign` clang builtin with `hlsl_intrinsics.h`
    - Added sema checks for `sign` to `CheckHLSLBuiltinFunctionCall` in
    `SemaChecking.cpp`
    - Add codegen for `sign` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp`
    - Add codegen tests to `clang/test/CodeGenHLSL/builtins/sign.hlsl`
    - Add sema tests to `clang/test/SemaHLSL/BuiltIns/sign-errors.hlsl`
    
    ### Related PRs
    - llvm#101987
    - llvm#101988
    
    ### Discussion
    - Should there be a `usign` intrinsic that handles the unsigned cases?
    tgymnich authored Sep 10, 2024
    Commit dce5039
  15. Commit 02ab435
  16. [ORC] Remove EDU from dependants list of dependencies before destroying.

    Dependant lists hold raw pointers back to EDUs that depend on them. We need to
    remove these entries before destroying the EDU or we'll be left with a dangling
    reference that can result in use-after-free bugs.
    
    No testcase: This has only been observed in multi-threaded setups that
    reproduce the issue inconsistently.
    
    rdar://135403614
    lhames committed Sep 10, 2024
    Commit 7034ec4
  17. Commit 094e6b8
  18. [LLDB][Minidump] Support minidumps where there are multiple exception streams (llvm#97470)
    
    Currently, LLDB assumes all minidumps will have unique sections. This is
    intuitive because almost all of the minidump sections are themselves
    lists. Exceptions including Signals are unique in that they are all
    individual sections with their own directory.
    
    This means LLDB fails to load minidumps with multiple exceptions due to
    them not being unique. This behavior is erroneous and this PR introduces
    support for an arbitrary number of exception streams. Additionally, stop
    info was calculated only for a single thread before, and now we properly
    support mapping exceptions to threads.
    
    ~~This PR is starting in DRAFT because implementing testing is still
    required.~~
    Jlalond authored Sep 10, 2024
    Commit 4926835
  19. [clang][bytecode] Fix local destructor order (llvm#107951)

    Add appropriate scopes and use reverse-order iteration in
    LocalScope::emitDestructors().
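
    As a minimal illustration of the ordering this fix restores (plain C++, not
    the bytecode interpreter itself; `Tracer` and `destructionOrder` are
    hypothetical names), locals in one scope are destroyed in reverse order of
    construction:

    ```cpp
    #include <cassert>
    #include <string>

    // Hypothetical helper: appends its tag to a shared log when destroyed.
    class Tracer {
    public:
      Tracer(std::string &Log, char Tag) : Log(Log), Tag(Tag) {}
      ~Tracer() { Log.push_back(Tag); }

    private:
      std::string &Log;
      char Tag;
    };

    // Constructs three locals in the order a, b, c inside one scope and
    // returns the order in which their destructors ran.
    std::string destructionOrder() {
      std::string Log;
      {
        Tracer A(Log, 'a');
        Tracer B(Log, 'b');
        Tracer C(Log, 'c');
      } // C is destroyed first, then B, then A.
      return Log;
    }
    ```

    Here `destructionOrder()` returns "cba", which is the order a constant
    evaluator has to reproduce when it emits destructors for a scope.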
    tbaederr authored Sep 10, 2024
    Commit 3928ede
  20. [ORC-RT] Replace FnTag arg of WrapperFunction::call with generic dispatch arg.
    
    This decouples function argument serialization / deserialization from the
    function call dispatch mechanism. This will eventually allow us to replace the
    existing __orc_rt_jit_dispatch function with a system that supports pre-linking
    parts of the ORC runtime into the executor.
    lhames committed Sep 10, 2024
    Commit 462251b
  21. [ORC-RT] Fix typo in 462251b.

    lhames committed Sep 10, 2024
    Commit 9b67c99
  22. [RISCV] Constrain passthru regclass in vmerge -> vmv peephole

    In llvm#107827 we now set true's passthru to the false operand if it was
    undef. We need to remember to also constrain the regclass in case true
    is a masked pseudo which needs its passthrus to be in VR[M*]NoV0
    lukel97 committed Sep 10, 2024
    Commit b71d88c
  23. Revert "[RISCV] Update V0Defs after moving Src in peepholes (llvm#107359)"
    
    This fixes llvm#107950 and adds a test case for it. The issue was due to
    us incorrectly assuming that we stored a V0Defs entry for every single
    instruction.
    
    We actually only store them for instructions that use V0, so when we
    updated the V0Def after moving we sometimes ended up copying nullptr
    over from an instruction that doesn't use V0 and clearing the V0Def
    entry inadvertently.
    
    Because we don't have V0Defs on instructions that don't use V0, the
    FIXME was never actually needed in the first place since the
    bookkeeping wasn't out of sync to begin with.
    
    That commit also mentioned that a future unmasked to masked pseudo
    peephole might need unmasked pseudos to have V0Defs entries, but after
    working on this locally it turns out we don't.
    
    This reverts commit ce36480.
    lukel97 committed Sep 10, 2024
    Commit 7ba6768
  24. [libc++][string] Remove potential non-trailing 0-length array (llvm#105865)
    
    Using 0-length arrays violates the standard, especially when the array is
    not at the end of a structure (where it is not covered by the GNU flexible
    array member extension). Compilers generally accept it, but it's better to
    have a conforming implementation.
    
    ---------
    
    Co-authored-by: Louis Dionne
    serge-sans-paille and ldionne authored Sep 10, 2024
    Commit ed0da00
  25. Commit 06c3311
  26. [GlobalIsel] Update MIR gallery (llvm#107903)

    add more patterns
    clarify wip_match_opcode usage
    tschuett authored Sep 10, 2024
    Commit bece0d7
  27. [llvm][Support] Determine the max thread length on Haiku (llvm#107801)

    Haiku has pthread_setname_np() / pthread_getname_np().
    brad0 authored Sep 10, 2024
    Commit 1c334de
  28. Revert "[llvm-ml] Fix RIP-relative addressing for ptr operands (llvm#107618)"
    
    This reverts commit 7543d09.
    
    This change caused failed asserts when building the openmp assembly
    sources, reproducible with:
    
        $ llvm-ml -m64 -D_M_AMD64 -c -Fo out.obj openmp/runtime/src/z_Windows_NT-586_asm.asm
        llvm-ml: ../lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:624: void {anonymous}::X86MCCodeEmitter::emitMemModRMByte(const llvm::MCInst&, unsigned int, unsigned int, uint64_t, {anonymous}::PrefixKind, uint64_t, llvm::SmallVectorImpl<char>&, llvm::SmallVectorImpl<llvm::MCFixup>&, const llvm::MCSubtargetInfo&, bool) const: Assertion `IndexReg.getReg() == 0 && !ForceSIB && "Invalid rip-relative address"' failed.
    
    The assert can also be triggered with one lone instruction:
    
        lea rdx, QWORD PTR [rax*8+16]
    mstorsjo committed Sep 10, 2024
    Commit 1581183
  29. [MLIR] Make resolveCallable customizable in CallOpInterface (llvm#100361)
    
    Allow customization of the `resolveCallable` method in the
    `CallOpInterface`. This change allows for operations implementing this
    interface to provide their own logic for resolving callables.
    
    - Introduce the `resolveCallable` method, which does not include the
    optional symbol table parameter. This method replaces the previously
    existing extra class declaration `resolveCallable`.
    
    - Introduce the `resolveCallableInTable` method, which incorporates the
    symbol table parameter. This method replaces the previous extra class
    declaration `resolveCallable` that used the optional symbol table
    parameter.
    xlauko authored Sep 10, 2024
    Commit 958f59d
  30. [MLIR][NVVM] Add support for nvvm.breakpoint Op (llvm#107193)

    This commit adds support for the `nvvm.breakpoint` Op, which lowers to the
    PTX brkpt instruction. It also adds the respective tests in `nvvmir.mlir`.
    schwarzschild-radius authored Sep 10, 2024
    Commit 831236e
  31. Revert "[ORC-RT] Replace FnTag arg of WrapperFunction::call with generic dispatch arg."
    
    This reverts commit 462251b.
    This reverts commit 9b67c99.
    
    Build fails for compiler-rt/lib/orc/tests/unit/wrapper_function_utils_test.cpp
    
    https://buildkite.com/llvm-project/upstream-bazel/builds/109731#0191da59-6710-4420-92ef-aa6e0355cb2c
    metaflow committed Sep 10, 2024
    Commit 53d35c4
  32. Revert "[MLIR] Make resolveCallable customizable in `CallOpInterface`" (llvm#107984)
    
    Reverts llvm#100361
    
    This commit caused some linker errors. (Missing `MLIRCallInterfaces`
    dependency.)
    matthias-springer authored Sep 10, 2024
    Commit 7574042
  33. [mlir][SME] Update E2E test to show optional loop optimisation (NFC) (llvm#107585)
    
    Adds loop hoisting to the ARM SME E2E tests so that the tile load can be
    hoisted, which offers a significant speedup.
    
    Discussed here:
    https://discourse.llvm.org/t/mlir-for-arm-sme-reducing-tile-data-transfers/80065/2
    nujaa authored Sep 10, 2024
    Commit 8aeb104
  34. Commit 7e07c1d
  35. [MLIR] Add f6E3M2FN type (llvm#105573)

    This PR adds `f6E3M2FN` type to mlir.
    
    `f6E3M2FN` type is proposed in [OpenCompute MX
    Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf).
    It defines a 6-bit floating point number with bit layout S1E3M2. Unlike
    IEEE-754 types, there are no infinity or NaN values.
    
    ```c
    f6E3M2FN
    - Exponent bias: 3
    - Maximum stored exponent value: 7 (binary 111)
    - Maximum unbiased exponent value: 7 - 3 = 4
    - Minimum stored exponent value: 1 (binary 001)
    - Minimum unbiased exponent value: 1 − 3 = −2
    - Has Positive and Negative zero
    - Doesn't have infinity
    - Doesn't have NaNs
    
    Additional details:
    - Zeros (+/-): S.000.00
    - Max normal number: S.111.11 = ±2^(4) x (1 + 0.75) = ±28
    - Min normal number: S.001.00 = ±2^(-2) = ±0.25
    - Max subnormal number: S.000.11 = ±2^(-2) x 0.75 = ±0.1875
    - Min subnormal number: S.000.01 = ±2^(-2) x 0.25 = ±0.0625
    ```
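
    The figures above can be checked with a small decoder. This is only a
    sketch derived from the listed layout (S1E3M2, bias 3, no infinities or
    NaNs); `decodeF6E3M2FN` is a hypothetical name and is not part of MLIR or
    APFloat:

    ```cpp
    #include <cassert>
    #include <cmath>
    #include <cstdint>

    // Decodes the low 6 bits of Bits as an f6E3M2FN value.
    float decodeF6E3M2FN(uint8_t Bits) {
      const int Sign = (Bits >> 5) & 0x1;
      const int Exp = (Bits >> 2) & 0x7;
      const int Man = Bits & 0x3;

      float Magnitude;
      if (Exp == 0) {
        // Zero and subnormals: 2^(1 - 3) * (Man / 4) = Man * 2^-4.
        Magnitude = std::ldexp(static_cast<float>(Man), -4);
      } else {
        // Normals: 2^(Exp - 3) * (1 + Man / 4) = (4 + Man) * 2^(Exp - 5).
        Magnitude = std::ldexp(static_cast<float>(4 + Man), Exp - 5);
      }
      return Sign ? -Magnitude : Magnitude;
    }
    ```

    With this sketch, the bit pattern 0.111.11 decodes to 28 and 0.000.01
    decodes to 0.0625, matching the maximum normal and minimum subnormal
    values listed above.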
    
    Related PRs:
    - [PR-94735](llvm#94735) [APFloat] Add APFloat support for FP6 data types
    - [PR-97118](llvm#97118) [MLIR] Add f8E4M3 type - was used as a template
      for this PR
    sergey-kozub authored Sep 10, 2024
    Commit 918222b
  36. Commit 083e25c
  37. [LoongArch] Eliminate the redundant sign extension of division (llvm#107971)
    
    If all incoming values of `div.d` are sign-extended and all users only
    use the lower 32 bits, then convert them to W versions.
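
    A rough C++ model of why the narrowing is safe, assuming both operands are
    sign-extended 32-bit values (`divViaWide` and `divNarrow` are hypothetical
    names; division by zero and INT32_MIN / -1 are excluded because they are
    undefined in C++):

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Wide form: sign-extend both operands to 64 bits, divide, and keep only
    // the low 32 bits of the quotient.
    // Precondition: B != 0, and not (A == INT32_MIN && B == -1).
    int32_t divViaWide(int32_t A, int32_t B) {
      const int64_t Quotient = static_cast<int64_t>(A) / static_cast<int64_t>(B);
      return static_cast<int32_t>(Quotient);
    }

    // Narrow form: a plain 32-bit signed division with the same precondition.
    int32_t divNarrow(int32_t A, int32_t B) { return A / B; }
    ```

    For inputs that satisfy the precondition both functions agree, so a user
    that reads only the lower 32 bits cannot tell the two forms apart.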
    
    Fixes: llvm#107946
    heiher authored Sep 10, 2024
    Commit 0f47e3a
  38. [VectorCombine] Add type shrinking and zext propagation for fixed-width vector types (llvm#104606)
    
    Check whether `binop(zext(value), other)` is possible and profitable to
    transform into `zext(binop(value, trunc(other)))`.
    When a CPU architecture has an illegal scalar type iX but the vector type
    <N * iX> is legal, scalar expressions may be extended to a legal type iY
    before vectorisation. This extension can underutilize vector lanes, since
    the narrower type would fit more lanes into one instruction.
    Vectorisers may not always recognize opportunities for type shrinking, and
    this patch aims to address that limitation.
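
    A scalar sketch of the rewrite, using 8-bit and 32-bit integers in place of
    vector lanes. The function names are hypothetical, and which binary
    operations the patch accepts is not restated here. For `and` the two forms
    always agree; for `or` they agree only when the high bits of `other` are
    zero, which is the kind of condition the "possible" check has to establish:

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Wide form: and(zext(Value), Other), computed in 32 bits.
    uint32_t andWide(uint8_t Value, uint32_t Other) {
      return static_cast<uint32_t>(Value) & Other;
    }

    // Narrow form: zext(and(Value, trunc(Other))), computed in 8 bits.
    uint32_t andNarrow(uint8_t Value, uint32_t Other) {
      const uint8_t Narrow =
          static_cast<uint8_t>(Value & static_cast<uint8_t>(Other));
      return static_cast<uint32_t>(Narrow);
    }

    // The same pair for `or`, shown for contrast.
    uint32_t orWide(uint8_t Value, uint32_t Other) {
      return static_cast<uint32_t>(Value) | Other;
    }

    uint32_t orNarrow(uint8_t Value, uint32_t Other) {
      const uint8_t Narrow =
          static_cast<uint8_t>(Value | static_cast<uint8_t>(Other));
      return static_cast<uint32_t>(Narrow);
    }
    ```

    For example, `orWide(0xAB, 0x100)` is 0x1AB while `orNarrow(0xAB, 0x100)`
    is 0xAB, so the `or` rewrite would be wrong for that operand.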
    igogo-x86 authored Sep 10, 2024
    Commit bf69484
  39. [llvm][Docs] Update guide to include pip install lit (llvm#106526)

    Also updates and clarifies which version would be installed.
    
    As per
    https://discourse.llvm.org/t/information-on-lit-is-outdated/76498.
    MichelleCDjunaidi authored Sep 10, 2024
    Commit edbe8fa
  40. Commit a99d666
  41. [VPlan] Add VPValue for VF, use it for VPWidenIntOrFpInductionRecipe. (llvm#95305)
    
    Similar to VFxUF, also add a VF VPValue to VPlan and use it to get the
    runtime VF in VPWidenIntOrFpInductionRecipe. Code for VF is only
    generated if there are users of VF, to avoid unnecessary test changes.
    
    PR: llvm#95305
    fhahn authored Sep 10, 2024
    Commit a794ee4
  42. [TOSA] tosa.negate operator lowering update (llvm#107924)

    This PR makes the tosa.negate op for integer types use the simplified
    calculation branch if the input_zp and output_zp values are also zero.
    
    Signed-off-by: Dmitriy Smirnov
    d-smirnov authored Sep 10, 2024
    Commit 2778d9d
  43. Re-apply "[ORC-RT] Replace FnTag arg of WrapperFunction::call..." with fixes.
    
    This reapplies commits 462251b and 9b67c99, which were reverted in
    53d35c4 due to bot failures for the wrapper_function_utils_test.cpp unit
    test.
    lhames committed Sep 10, 2024
    Commit 69f8923
  44. [AArch64] Lower __builtin_bswap16 to rev16 if bswap followed by any_extend (llvm#105375)
    
    GCC compiles the built-in function `__builtin_bswap16` to the ARM
    instruction rev16, which reverses the byte order of 16-bit data. On the
    other hand, Clang compiles the same built-in function to e.g.
    ```     
            rev     w8, w0
            lsr     w0, w8, #16
    ```
    i.e. it performs a byte reversal of a 32-bit register, (which moves the
    lower half, which contains the 16-bit data, to the upper half) and then
    right shifts the reversed 16-bit data back to the lower half of the
    register.
    We can improve Clang codegen by generating `rev16` instead of `rev` and
    `lsr`, like GCC.
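
    A bit-level model of the two sequences, written as plain C++ rather than
    AArch64 code (`rev32`, `rev16` and `bswap16ViaRevLsr` are hypothetical
    helper names that mimic the instructions):

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Models `rev`: reverse all four bytes of a 32-bit register.
    uint32_t rev32(uint32_t X) {
      return (X >> 24) | ((X >> 8) & 0x0000FF00u) | ((X << 8) & 0x00FF0000u) |
             (X << 24);
    }

    // Models `rev16`: swap the two bytes inside each 16-bit half.
    uint32_t rev16(uint32_t X) {
      return ((X & 0x00FF00FFu) << 8) | ((X & 0xFF00FF00u) >> 8);
    }

    // Models the two-instruction sequence: `rev` followed by `lsr #16`.
    uint32_t bswap16ViaRevLsr(uint32_t W0) { return rev32(W0) >> 16; }
    ```

    The low 16 bits of `rev16(x)` always equal `bswap16ViaRevLsr(x)`. The upper
    16 bits can differ, which is why the single-instruction form fits when the
    result is only any-extended and those bits are not observed.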
    adprasad-nvidia authored Sep 10, 2024
    Commit 23595d1
  45. [LLVM][AArch64] Refactor sve-b16b16 instruction definitions. (llvm#107265)
    
    Update the predicate protecting bfloat instructions to only reference
    FEAT_SVE_B16B16, which matches the specification.
    
    Rename and move instruction classes to match the names of the encoding
    groups the bfloat arithmetic instructions belong to.
    paulwalker-arm authored Sep 10, 2024
    Commit 516f08b
  46. [Flang][Lower] Introduce SymMapScope helper class (NFC) (llvm#107866)

    This patch creates a simple RAII wrapper class for `SymMap` to make it
    easier to use and prevent a missing matching `popScope()` for a
    `pushScope()` call on simple use cases.
    
    Some push-pop pairs are replaced with instances of the new class by this
    patch.
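
    The RAII pattern described here can be sketched as follows. `ToySymMap`
    and `ToySymMapScope` are hypothetical stand-ins written for illustration;
    they are not Flang's `SymMap` or the class added by this patch:

    ```cpp
    #include <cassert>

    // Toy symbol map that only tracks how many scopes are currently open.
    struct ToySymMap {
      int Depth = 0;
      void pushScope() { ++Depth; }
      void popScope() { --Depth; }
    };

    // RAII wrapper: pushes a scope on construction, pops it on destruction.
    class ToySymMapScope {
    public:
      explicit ToySymMapScope(ToySymMap &Map) : Map(Map) { Map.pushScope(); }
      ~ToySymMapScope() { Map.popScope(); }
      ToySymMapScope(const ToySymMapScope &) = delete;
      ToySymMapScope &operator=(const ToySymMapScope &) = delete;

    private:
      ToySymMap &Map;
    };

    // Opens nested scopes and reports the depth seen while two are alive.
    int depthInsideNestedScopes(ToySymMap &Map) {
      ToySymMapScope Outer(Map);
      { ToySymMapScope Temporary(Map); } // Popped again at the closing brace.
      ToySymMapScope Inner(Map);
      return Map.Depth; // Outer and Inner are popped after this is evaluated.
    }
    ```

    Every exit path out of the function pops exactly the scopes it pushed, so
    a matching `popScope()` cannot be forgotten.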
    skatrak authored Sep 10, 2024
    Commit 433ca3e
  47. [bazel] Port 69f8923

    d0k committed Sep 10, 2024
    Commit fffdd9e
  48. [lldb] Recurse through DW_AT_signature when looking for attributes (llvm#107241)
    
    This allows e.g. DWARFDIE::GetName() to return the name of the type when
    looking at its declaration (which contains only
    DW_AT_declaration+DW_AT_signature). This is similar to how we recurse
    through DW_AT_specification when looking for a function name. The LLVM DWARF
    parser gained the same functionality in llvm#99495.
    
    This fixes a bug where we would confuse a type like NS::Outer::Struct
    with NS::Struct (because NS::Outer (and its name) was in a type unit).
    labath authored Sep 10, 2024
    Commit 925b220
  49. [amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic (llvm#105822)

    This intrinsic is meant to be used in functions that have a "tail" that
    needs to be run with all the lanes enabled. The "tail" may contain
    complex control flow that makes it unsuitable for the use of the
    existing WWM intrinsics. Instead, we will pretend that the function
    starts with all the lanes enabled, then branches into the actual body of
    the function for the lanes that were meant to run it, and then finally
    all the lanes will rejoin and run the tail.
    
    As such, the intrinsic will return the EXEC mask for the body of the
    function, and is meant to be used only as part of a very limited pattern
    (for now only in amdgpu_cs_chain functions):
    
    ```
    entry:
      %func_exec = call i1 @llvm.amdgcn.init.whole.wave()
      br i1 %func_exec, label %func, label %tail
    
    func:
      ; ... stuff that should run with the actual EXEC mask
      br label %tail
    
    tail:
      ; ... stuff that runs with all the lanes enabled;
      ; can contain more than one basic block
    ```
    
    It's an error to use the result of this intrinsic for anything
    other than a branch (but unfortunately checking that in the verifier is
    non-trivial because SIAnnotateControlFlow will introduce an amdgcn.if
    between the intrinsic and the branch).
    
    The intrinsic is lowered to a SI_INIT_WHOLE_WAVE pseudo, which for now
    is expanded in si-wqm (which is where SI_INIT_EXEC is handled too);
    however the information that the function was conceptually started in
    whole wave mode is stored in the machine function info
    (hasInitWholeWave). This will be useful in prolog epilog insertion,
    where we can skip saving the inactive lanes for CSRs (since if the
    function started with all the lanes active, then there are no inactive
    lanes to preserve).
    rovka authored Sep 10, 2024
    Commit 44556e6
  50. [clang][bytecode][NFC] Fix CallBI function signature

    This doesn't modify the PC, so pass OpPC as a copy.
    tbaederr committed Sep 10, 2024
    Commit 4687017
  51. [lld][AArch64] Fix getImplicitAddend in big-endian mode. (llvm#107845)

    In AArch64, the endianness of instruction encodings is always little,
    whereas the endianness of data swaps between LE and BE modes. So
    getImplicitAddend must use the right one of read32() and read32le(), for
    data and code respectively. It was using read32() throughout, causing
    instructions to be read as big-endian in BE mode, getting the wrong
    addend.
    
    Fixed, and updated the existing test to check both endiannesses. The
    expected results for data must be byte-swapped, but the ones for code
    need no adjustment.
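
    A self-contained sketch of the distinction. The helpers below are local to
    this example and are not lld's own functions; `readData32` is a
    hypothetical stand-in for a read that follows the target's data
    endianness:

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Always little-endian: the right choice for AArch64 instruction words.
    uint32_t read32le(const uint8_t *P) {
      return static_cast<uint32_t>(P[0]) | (static_cast<uint32_t>(P[1]) << 8) |
             (static_cast<uint32_t>(P[2]) << 16) |
             (static_cast<uint32_t>(P[3]) << 24);
    }

    // Always big-endian.
    uint32_t read32be(const uint8_t *P) {
      return (static_cast<uint32_t>(P[0]) << 24) |
             (static_cast<uint32_t>(P[1]) << 16) |
             (static_cast<uint32_t>(P[2]) << 8) | static_cast<uint32_t>(P[3]);
    }

    // Follows the data endianness of the target: the right choice for data.
    uint32_t readData32(const uint8_t *P, bool IsBigEndian) {
      return IsBigEndian ? read32be(P) : read32le(P);
    }
    ```

    In BE mode the same four bytes give different values through the two
    paths, so reading an instruction word through the data path yields a
    byte-swapped encoding and therefore a wrong addend.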
    statham-arm authored Sep 10, 2024
    Commit daf2085
  52. Commit 6a56f15
  53. [AArch64] Prevent the AArch64LoadStoreOptimizer from reordering CFI instructions (llvm#101317)
    
    When the AArch64LoadStoreOptimizer pass merges an SP update with a
    load/store instruction and needs to adjust unwind information, either:
    * create the merged instruction at the location of the SP update
      (so no CFI instructions are moved), or
    * only move a CFI instruction if the move would not reorder it across
      other CFI instructions.
    
    If neither of the above is possible, don't perform the optimisation.
    momchil-velikov authored Sep 10, 2024
    Commit b0ffaa7
  54. Commit 306b08c
  55. [flang] Use LLVM dialect ops for stack save/restore in target-rewrite (llvm#107879)
    
    Mostly NFC. I was bothered by the declarations that were always made even
    if unused, and I think using LLVM Ops is nicer anyway with regard to
    side effects here.
    
    ```
    func.func private @llvm.stacksave.p0() -> !fir.ref<i8>
    func.func private @llvm.stackrestore.p0(!fir.ref<i8>)
    ```
    
    There are other places in lowering that are using the calls instead of
    the LLVM intrinsics, but I will deal with them another time (the issue
    there is mostly to get the proper address space for the llvm.ptr type).
    jeanPerier authored Sep 10, 2024
    Commit cb30169
  56. [libc++] Include the full set of libc++ transitive includes in the CSV files (llvm#107911)
    
    When we introduced the machinery for transitive includes validation, at
    some point we stopped including the full set of transitive includes in
    the CSV files and instead only tracked the set of public headers
    included *directly* by a top-level header.
    
    The reason for doing that was so that the CSV files containing
    "transitive" includes could be used to draw the dependency graph of
    libc++ headers. However, the downside was that it made the contents of
    the CSV files much harder to interpret.
    
    In particular, many changes that modify the CSV files do not in fact
    modify the effective set of transitive includes, which is confusing.
    This patch goes back to storing the full set of transitive includes in
    the CSV files and removes the ability to graph the libc++ includes
    directly from those CSV files, which we never actually used.
    ldionne authored Sep 10, 2024
    Commit 930915a
  57. Commit bda9474
  58. Commit 0ccc609
  59. [gn] attempt to port 53a81d4 (win/asan dynamic runtime)

    Based on the output of llvm/utils/gn/build/sync_source_lists_from_cmake.py
    and reading the diff, but not actually tested on Windows.
    nico committed Sep 10, 2024
    Commit 4a63d62
  60. Commit 4d55f0b
  61. Commit d1cad22
  62. [gn] port eb0e4b1

    nico committed Sep 10, 2024
    Commit e610a0e
  63. [NFC][AMDGPU][Driver] Move 'shouldSkipSanitizeOption' utility to AMDGPU. (llvm#107997)
    
    HIPAMDToolChain and AMDGPUOpenMPToolChain both depend on the
    "shouldSkipSanitizeOption" API to decide whether to sanitize device code.
    ampandey-1995 authored Sep 10, 2024
    Commit 5dd1c82
  64. Commit f58312e
  65. [flang][AMDGPU] Convert math ops to AMD GPU library calls instead of libm calls (llvm#99517)
    
    This patch invokes a pass when compiling for an AMDGPU target to lower
    math operations to AMD GPU library calls instead of libm calls.
    jsjodin authored Sep 10, 2024
    Commit 4290e34
  66. Commit 69828c4
  67. [SPIR-V] Expose an API call to initialize SPIRV target and translate input LLVM IR module to SPIR-V (llvm#107216)
    
    The goal of this PR is to facilitate integration of the SPIRV Backend into
    misc 3rd party tools and libraries by exposing an API call that translates
    an LLVM module to SPIR-V and writes the result into a string as binary
    SPIR-V output, providing diagnostics on failure and a means of configuring
    translation in the style of command line options.
    
    An example of a use case may be Khronos Translator that provides
    bidirectional translation LLVM IR <=> SPIR-V, where LLVM IR => SPIR-V
    step may be substituted by the call to SPIR-V Backend API, implemented
    by this PR.
    VyacheslavLevytskyy authored Sep 10, 2024
    Commit bca2b6d
  68. [libc++][test] LWG2593: Moved-from state of Allocators (llvm#107344)

    The resolution of LWG2593 didn't require the standard library
    implementation to change. It merely strengthened requirements on
    user-defined allocator types and allowed the implementation to make
    stronger assumptions. The status is tentatively set to Nothing To Do.
    
    However, `test_allocator` in libc++'s test suite needs to be fixed to
    conform to the strengthened requirements.
    
    Closes llvm#100220.
    frederick-vs-ja authored Sep 10, 2024
    Commit 46a76c3
  69. [CGData][MachineOutliner] Global Outlining (llvm#90074)

    This commit introduces support for outlining functions across modules
    using codegen data generated from previous codegen. The codegen data
    currently manages the outlined hash tree, which records outlining
    instances that occurred locally in the past.
        
    The machine outliner now operates in one of three modes:
    
    1. CGDataMode::None: This is the default outliner mode that uses the
    suffix tree to identify (local) outlining candidates within a module.
    This mode is also used by (full)LTO to maintain optimal behavior with
    the combined module.
    2. CGDataMode::Write (`-codegen-data-generate`): This mode is identical
    to the default mode, but it also publishes the stable hash sequences of
    instructions in the outlined functions into a local outlined hash tree.
    It then encodes this into the `__llvm_outline` section, which will be
    dead-stripped at link time.
    3. CGDataMode::Read (`-codegen-data-use-path={.cgdata}`): This mode
    reads a codegen data file (.cgdata) and initializes a global outlined
    hash tree. This tree is used to generate global outlining candidates.
    Note that the codegen data file has been post-processed with the raw
    `__llvm_outline` sections from all native objects using the
    `llvm-cgdata` tool (or a linker, `LLD`, or a new ThinLTO pipeline
    later).
    
    This depends on llvm#105398. After this PR, LLD (llvm#90166) and Clang
    (llvm#90304) will follow with client-side support.
    This is a patch for
    https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
    kyulee-com authored Sep 10, 2024
    Commit 0f52545
  70. [gn build] Port bca2b6d

    nico committed Sep 10, 2024
    Commit 7190368
  71. [gn build] Port f12e10b

    nico committed Sep 10, 2024
    Commit 2459679
  72. [flang][OpenMP] Implement copyin for pointers and allocatables. (llvm#107425)
    
    The copyin clause currently forbids pointer and allocatable variables,
    which are allowed by the OpenMP 1.1 and 3.0 specifications respectively.
    DavidTruby authored Sep 10, 2024
    Commit 53b5902
  73. [llvm-exegesis] Refactor getting register number from name to LLVMState (llvm#107895)
    
    This patch refactors the procedure of getting the register number from a
    register name to LLVMState rather than having individual users get the
    values themselves by getting a reference to the map from LLVMState. This
    is primarily intended to make some downstream usage in Gematria simpler,
    but also cleans up a little bit upstream by pulling the actual map
    searching out and just leaving error handling to the clients.
    
    The original getter is left in place to enable downstream migration in
    Gematria, particularly before it is imported internally at Google.
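
    A minimal sketch of the shape of this refactoring, not the actual
    llvm-exegesis code: the class name `State` and the method names
    `getRegisterNumber` and `getRegNameToNoMap` are hypothetical, and a plain
    `unsigned` stands in for the register type. The lookup moves into the
    owning object, and callers only handle the "not found" case.

    ```cpp
    #include <optional>
    #include <string>
    #include <unordered_map>
    #include <utility>

    // Hypothetical stand-in for the state object that owns the
    // register-name-to-number map.
    class State {
    public:
      explicit State(std::unordered_map<std::string, unsigned> Map)
          : RegNameToNo(std::move(Map)) {}

      // The map search lives here; callers only decide how to report a miss.
      std::optional<unsigned> getRegisterNumber(const std::string &Name) const {
        auto It = RegNameToNo.find(Name);
        if (It == RegNameToNo.end())
          return std::nullopt;
        return It->second;
      }

      // The original getter stays available so existing callers can migrate
      // at their own pace.
      const std::unordered_map<std::string, unsigned> &getRegNameToNoMap() const {
        return RegNameToNo;
      }

    private:
      std::unordered_map<std::string, unsigned> RegNameToNo;
    };
    ```

    A caller would then test the returned optional and emit its own error
    message when the name is unknown.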
    boomanaiden154 authored Sep 10, 2024
    5823ac0
  74. [gn] port 0f52545

    nico committed Sep 10, 2024
    dfd7284
  75. [libc][bazel] fix accidental rename

    metaflow committed Sep 10, 2024
    33f1235
  76. 13c14c6
  77. 8530329
  78. 19a2f17
  79. [Lex] Avoid repeated hash lookups (NFC) (llvm#107963)

    MacroAnnotations has three std::optional fields.
    
    Functions makeDeprecation, makeRestrictExpansion, and makeFinal
    construct an instance of MacroAnnotations with one field initialized
    with a non-default value (that is, some value other than
    std::nullopt).
    
    Functions addMacroDeprecationMsg, addRestrictExpansionMsg, and
    addFinalLoc either create a new map entry with one field initialized
    with a non-default value or replace one field of an existing map
    entry.
    
    We can do all this with a simple statement of the form:
    
      AnnotationInfos[II].FieldName = NonDefaultValue;
    
    which takes care of default initialization of the fields with
    std::nullopt when a requested map entry does not exist.
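
    The behavior relied on here can be sketched with standard containers.
    This is an illustration only: `Annotations`, `AnnotationMap`, and the
    helper names are hypothetical stand-ins, and the map is keyed by string
    rather than by the identifier type Clang uses.

    ```cpp
    #include <optional>
    #include <string>
    #include <unordered_map>
    #include <utility>

    // Hypothetical stand-in for MacroAnnotations: three optional fields that
    // all default to std::nullopt.
    struct Annotations {
      std::optional<std::string> DeprecationMsg;
      std::optional<std::string> RestrictExpansionMsg;
      std::optional<int> FinalLoc;
    };

    // Hypothetical stand-in for the AnnotationInfos map.
    using AnnotationMap = std::unordered_map<std::string, Annotations>;

    // operator[] value-initializes a missing entry, so its other fields start
    // out as std::nullopt; an existing entry keeps its other fields untouched.
    inline void addDeprecationMsg(AnnotationMap &Infos, const std::string &Name,
                                  std::string Msg) {
      Infos[Name].DeprecationMsg = std::move(Msg);
    }

    inline void addFinalLoc(AnnotationMap &Infos, const std::string &Name,
                            int Loc) {
      Infos[Name].FinalLoc = Loc;
    }
    ```

    One `operator[]` call replaces a separate find followed by an insert or
    update, which is where the repeated hash lookup goes away.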
    kazutakahirata authored Sep 10, 2024
    9710085
  80. [mlir] Reuse pack dest in tensor.pack decomposition (llvm#108025)

    In the `lowerPack` transform, there is a special case for lowering into
    a simple `tensor.pad` + `tensor.insert_slice`, but the destination
    becomes a newly created `tensor.empty`. This PR fixes the transform to
    reuse the original destination of the `tensor.pack`.
    Max191 authored Sep 10, 2024
    e982d7f
  81. [lldb][test] TestDbgInfoContentVectorFromStdModule.py: skip test on D…

    …arwin (llvm#108003)
    
    This started failing on the macOS CI after
    llvm#106885:
    
    ```
      lldb-api :: commands/expression/import-std-module/vector-dbg-info-content/TestDbgInfoContentVectorFromStdModule.py
    
    "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang"  -std=c++11 -g -O0 -isysroot "/Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk" -arch arm64  -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/../../../../..//include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/vector-dbg-info-content -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make -include /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/test_common.h  -fno-limit-debug-info    -nostdlib++ -nostdinc++ -cxx-isystem /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1  --driver-mode=g++ -MT main.o -MD -MP -MF main.d -c -o main.o /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/vector-dbg-info-content/main.cpp
    "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang"  main.o -g -O0 -isysroot "/Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk" -arch arm64  -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/../../../../..//include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/vector-dbg-info-content -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make -include /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/test_common.h  -fno-limit-debug-info     -L/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/lib -Wl,-rpath,/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/lib -lc++ --driver-mode=g++ -o "a.out"
    ld: warning: ignoring duplicate libraries: '-lc++'
    codesign --entitlements /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/entitlements-macos.plist -s - "a.out"
    "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/./bin/dsymutil"  -o "a.out.dSYM" "a.out"
    
    
    runCmd: settings set target.import-std-module true
    
    output: 
    
    runCmd: expr std::reverse(a.begin(), a.end())
    
    Assertion failed: (isa<InjectedClassNameType>(Decl->TypeForDecl)), function getInjectedClassNameType, file ASTContext.cpp, line 5057.
    PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
    Stack dump:
    0.	HandleCommand(command = "expr std::reverse(a.begin(), a.end())")
    1.	<eof> parser at end of file
    2.	/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1/__algorithm/reverse.h:54:1: instantiating function definition 'std::reverse<std::__wrap_iter<Foo *>>'
    3.	/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1/__algorithm/reverse.h:47:58: instantiating function definition 'std::__reverse<std::_ClassicAlgPolicy, std::__wrap_iter<Foo *>, std::__wrap_iter<Foo *>>'
    4.	/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1/__algorithm/reverse.h:40:1: instantiating function definition 'std::__reverse_impl<std::_ClassicAlgPolicy, std::__wrap_iter<Foo *>>'
    ```
    Michael137 authored Sep 10, 2024
    2bcab9b
  82. [Attributor] Keep track of reached returns in AAPointerInfo (llvm#107479

    )
    
    Instead of visiting call sites in Attributor::checkForAllUses, we now
    keep track of returns in AAPointerInfo and use the call site return
    information as required. This way, the user of
    AAPointerInfo(CallSite)Argument can determine if the call return should
    be visited. We do not collect them as "may accesses" in the
    AAPointerInfo(CallSite)Argument itself in case a return user is found.
    jdoerfert authored Sep 10, 2024
    56a0334
  83. [RFC][C++20][Modules] Fix crash when function and lambda inside loade…

    …d from different modules (llvm#104512)
    
    Summary:
    Because AST loading is lazy and happens in an unpredictable order, a
    function and a lambda inside that function can be loaded from
    different modules. In this case, the captured DeclRefExpr won’t match the
    corresponding VarDecl inside the function. In the AST it looks like this:
    ```
    FunctionDecl 0x555564f4aff0 <Conv.h:33:1, line:41:1> line:33:35 imported in ./thrift_cpp2_base.h hidden tryTo 'Expected<Tgt, const char *> ()' inline
    |-also in ./folly-conv.h
    `-CompoundStmt 0x555564f7cfc8 <col:43, line:41:1>
      |-DeclStmt 0x555564f7ced8 <line:34:3, col:17>
      | `-VarDecl 0x555564f7cef8 <col:3, col:16> col:7 imported in ./thrift_cpp2_base.h hidden referenced result 'Tgt' cinit
      |   `-IntegerLiteral 0x555564f7d080 <col:16> 'int' 0
      |-CallExpr 0x555564f7cea8 <line:39:3, col:76> '<dependent type>'
      | |-UnresolvedLookupExpr 0x555564f7bea0 <col:3, col:19> '<overloaded function type>' lvalue (no ADL) = 'then_' 0x555564f7bef0
      | |-CXXTemporaryObjectExpr 0x555564f7bcb0 <col:25, col:45> 'Expected<bool, int>':'folly::Expected<bool, int>' 'void () noexcept' zeroing
      | `-LambdaExpr 0x555564f7bc88 <col:48, col:75> '(lambda at Conv.h:39:48)'
      |   |-CXXRecordDecl 0x555564f76b88 <col:48> col:48 imported in ./folly-conv.h hidden implicit <undeserialized declarations> class definition
      |   | |-also in ./thrift_cpp2_base.h
      |   | `-DefinitionData lambda empty standard_layout trivially_copyable literal can_const_default_init
      |   |   |-DefaultConstructor defaulted_is_constexpr
      |   |   |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
      |   |   |-MoveConstructor exists simple trivial needs_implicit
      |   |   |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
      |   |   |-MoveAssignment
      |   |   `-Destructor simple irrelevant trivial constexpr needs_implicit
      |   `-CompoundStmt 0x555564f7d1a8 <col:58, col:75>
      |     `-ReturnStmt 0x555564f7d198 <col:60, col:67>
      |       `-DeclRefExpr 0x555564f7d0a0 <col:67> 'Tgt' lvalue Var 0x555564f7d0c8 'result' 'Tgt' refers_to_enclosing_variable_or_capture
      `-ReturnStmt 0x555564f7bc78 <line:40:3, col:11>
        `-InitListExpr 0x555564f7bc38 <col:10, col:11> 'void'
    ```
    This diff changes AST deserialization to load lambdas inside a canonical
    function declaration earlier, right after the function, to make sure that
    their canonical decl is loaded from the same module.
    
    Test Plan: check-clang
    dmpolukhin authored Sep 10, 2024
    d778689
  84. bf68403
  85. Fix for Attempt to fix [CGData][MachineOutliner] Global Outlining (ll…

    …vm#90074) llvm#108037 (llvm#108047)
    
    The previous `attempt to fix [CGData][MachineOutliner] Global Outlining
    (llvm#90074) llvm#108037` was incomplete because the
    `ImmutableModuleSummaryIndexWrapperPass` is now optional for the
    MachineOutliner pass.
    
    With this fix, the test file `CodeGen/AArch64/O3-pipeline.ll` shows no
    changes compared to its state before `[CGData][MachineOutliner] Global
    Outlining (llvm#90074)`.
    
    Co-authored-by: Kyungwoo Lee <[email protected]>
    kyulee-com and Kyungwoo Lee authored Sep 10, 2024
    ba2aa1d
  86. Fix for llvm/test/CodeGen/RISCV/O3-pipeline.ll (llvm#108050)

    The previous `Fix for Attempt to fix [CGData][MachineOutliner] Global
    Outlining (llvm#90074) llvm#108037 (llvm#108047)` somehow dropped this file.
    kyulee-com authored Sep 10, 2024
    2cfdcfb
  87. [RISCV] Separate more of scalar FP in CC_RISCV. NFC (llvm#107908)

    The scalar FP calling convention has become more complicated with recent
    changes to Zfinx/Zdinx, the proposed addition of a GPRF16 register class,
    and the use of customReg for f16/bf16 and other FP types smaller than XLen.
    
    The previous code tried to share a single getReg and getMem call for
    many different cases. This patch separates all the FP register handling
    to the top of the function with their own getReg calls. The only
    exception is f64 with XLen==32, when we are out of FPRs or not able to
    use FPRs due to ABI.
    
    The way I've structured this, we no longer need to correct the LocVT for
    FP back to ValVT before the call to getMem.
    topperc authored Sep 10, 2024
    14b4356
  88. c7a7767
  89. [LLDB][Data Formatters] Calculate average and total time for summary …

    …providers within lldb (llvm#102708)
    
    This PR adds a statistics provider cache, which allows an individual
    target to keep a rolling tally of its total time and number of
    invocations for a given summary provider. This information is then
    available in statistics dump to help identify slow summary providers and
    glean more insight into LLDB's time use.
    Jlalond authored Sep 10, 2024
    22144e2
  90. [libc] fix locale dependency for stdlib (llvm#108042)

    Address the following issue:
    ```
    ❯ ninja libc.test.src.__support.OSUtil.linux.vdso_test.__unit__
    [91/127] Building CXX object libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o
    FAILED: libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o 
    sccache /usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_20_0_0_git -D_DEBUG -I/home/schrodingerzy/Documents/llvm-project/libc -isystem /home/schrodingerzy/Documents/llvm-project/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g -std=gnu++17 -fpie -DLIBC_FULL_BUILD -ffreestanding -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -MD -MT libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -MF libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o.d -o libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -c /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp
    In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp:21:
    In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/UnitTest/ErrnoSetterMatcher.h:13:
    In file included from /home/schrodingerzy/Documents/llvm-project/libc/src/__support/FPUtil/fpbits_str.h:12:
    In file included from /home/schrodingerzy/Documents/llvm-project/libc/src/__support/CPP/string.h:20:
    /home/schrodingerzy/Documents/llvm-project/build/libc/include/stdlib.h:13:10: fatal error: 'llvm-libc-types/locale_t.h' file not found
       13 | #include "llvm-libc-types/locale_t.h"
          |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
    1 error generated.
    [123/127] Building CXX object libc/test/UnitTest/CMakeFiles/LibcTest.unit.dir/LibcTestMain.cpp.o
    ninja: build stopped: subcommand failed.
    ```
    SchrodingerZhu authored Sep 10, 2024
    ce9f987
  91. [MemProf] Streamline and avoid unnecessary context id duplication (ll…

    …vm#107918)
    
    Sort the list of calls such that those with the same stack ids are also
    sorted by function. This allows processing of all matching calls (that
    can share a context node) in bulk as they are all adjacent.
    
    This has 2 benefits:
    1. It reduces unnecessary work, specifically the handling to intersect
       the context ids with those along the graph edges for the stack ids,
       for calls that we know can share a node.
    2. It simplifies detecting when we have matching stack ids but don't
       need to duplicate context ids. Specifically, we were previously
       still duplicating context ids whenever we saw another call with the
       same stack ids, but that isn't necessary if they will share a context
       node. With this change we now only duplicate context ids if we see
       some that not only have the same ids but also are in different
       functions.
    
    This change reduced the amount of context id duplication and provided
    reductions in both peak memory (~8%) and time (~5%) for a large
    target.
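
    The ordering idea can be sketched as follows. This is a toy model, not
    the MemProf implementation: `CallRecord`, `sortCalls`, and
    `countDuplications` are hypothetical names, and the sort key is reduced
    to the two properties the message mentions (stack ids, then function).

    ```cpp
    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <string>
    #include <tuple>
    #include <vector>

    // Hypothetical record for a call: its stack ids and enclosing function.
    struct CallRecord {
      std::vector<uint64_t> StackIds;
      std::string Func;
    };

    // Sort so that calls with identical stack ids are adjacent and, within
    // such a run, ordered by function.
    inline void sortCalls(std::vector<CallRecord> &Calls) {
      std::stable_sort(Calls.begin(), Calls.end(),
                       [](const CallRecord &A, const CallRecord &B) {
                         return std::tie(A.StackIds, A.Func) <
                                std::tie(B.StackIds, B.Func);
                       });
    }

    // Count how often context ids would need duplicating under the rule in
    // the message: only when a call has the same stack ids as the previous
    // call but is in a different function.
    inline std::size_t countDuplications(const std::vector<CallRecord> &Sorted) {
      std::size_t N = 0;
      for (std::size_t I = 1; I < Sorted.size(); ++I)
        if (Sorted[I].StackIds == Sorted[I - 1].StackIds &&
            Sorted[I].Func != Sorted[I - 1].Func)
          ++N;
      return N;
    }
    ```

    Because matching calls end up adjacent, a single linear pass is enough
    to decide which of them can share a node.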
    teresajohnson authored Sep 10, 2024
    524a028
  92. [ADT] Require base equality in indexed_accessor_iterator::operator==() (

    llvm#107856)
    
    Similarly to operator<(), equality-comparing iterators from different
    ranges must really be forbidden. The preconditions for being able to do
    `it1 < it2` and `it1 != it2` (or `it1 == it2` for that matter) ought to
    be the same. Thus, there's little sense in keeping an explicit base object
    comparison in operator==() whilst having this as a precondition in
    operator<() and operator-() (e.g. used for std::distance() and such).
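
    A simplified sketch of the resulting contract, assuming a hypothetical
    `IndexedIter` class rather than the real `indexed_accessor_iterator`:
    every comparison asserts that both iterators share a base, and only the
    index decides the result.

    ```cpp
    #include <cassert>
    #include <cstddef>

    // Hypothetical, simplified iterator: a base object plus an index into it.
    template <typename BaseT> class IndexedIter {
    public:
      IndexedIter(const BaseT *Base, std::ptrdiff_t Index)
          : Base(Base), Index(Index) {}

      // Comparing iterators from different ranges is a precondition
      // violation, so the base is asserted instead of being part of the
      // comparison result.
      bool operator==(const IndexedIter &RHS) const {
        assert(Base == RHS.Base && "incompatible iterators");
        return Index == RHS.Index;
      }
      bool operator!=(const IndexedIter &RHS) const { return !(*this == RHS); }
      bool operator<(const IndexedIter &RHS) const {
        assert(Base == RHS.Base && "incompatible iterators");
        return Index < RHS.Index;
      }
      std::ptrdiff_t operator-(const IndexedIter &RHS) const {
        assert(Base == RHS.Base && "incompatible iterators");
        return Index - RHS.Index;
      }

    private:
      const BaseT *Base;
      std::ptrdiff_t Index;
    };
    ```

    With this shape, comparing iterators from two different bases trips the
    assertion in debug builds instead of quietly returning `false`.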
    andrey-golubev authored Sep 10, 2024
    7fb19cb
  93. [DirectX] Lower @llvm.dx.typedBufferStore to DXIL ops

    The `@llvm.dx.typedBufferStore` intrinsic is lowered to `@dx.op.bufferStore`.
    
    Pull Request: llvm#104253
    bogner authored Sep 10, 2024
    90e8411
  94. c8ed2b8
  95. [PowerPC] Fix assert exposed by PR 95931 in LowerBITCAST (llvm#108062)

    Hit Assertion failed: Num < NumOperands && "Invalid child # of SDNode!" 
    Fix by checking opcode and value type before calling getOperand.
    syzaara authored Sep 10, 2024
    22067a8
  96. Revert "[NVPTX] Support copysign PTX instruction (llvm#107800)" (llvm…

    …#108066)
    
    This reverts commit b0d2411.
    
    Reverting because the original commit misses the case of copysign from a
    constant.
    pranavk authored Sep 10, 2024
    02c943a
  97. Add DIExpression::foldConstantMath to CoroSplit (llvm#107933)

    The CoroSplit pass has its own salvageDebugInfo implementation and its
    DIExpressions do not get folded. Add a call to
    DIExpression::foldConstantMath in the CoroSplit pass to reduce the size
    of those DIExpressions.
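
    To illustrate why folding shrinks an expression, here is a toy model
    only. `Op` and `foldAdjacentAdds` are hypothetical and do not reflect
    LLVM's DIExpression encoding or the full set of patterns that
    DIExpression::foldConstantMath handles; the sketch merges only runs of
    adjacent constant additions.

    ```cpp
    #include <cstdint>
    #include <vector>

    // Toy expression op; not LLVM's DIExpression encoding.
    struct Op {
      enum Kind { PlusConst, Deref } K;
      uint64_t Arg; // Only meaningful for PlusConst.
    };

    // Merge each run of adjacent PlusConst ops into a single PlusConst whose
    // argument is the sum of the run. Other ops are copied through unchanged.
    inline std::vector<Op> foldAdjacentAdds(const std::vector<Op> &In) {
      std::vector<Op> Out;
      for (const Op &O : In) {
        if (O.K == Op::PlusConst && !Out.empty() &&
            Out.back().K == Op::PlusConst) {
          Out.back().Arg += O.Arg;
          continue;
        }
        Out.push_back(O);
      }
      return Out;
    }
    ```

    In this model, two consecutive offset additions collapse into one op,
    while an addition that follows a dereference stays separate.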
    
    [The compile time tracker shows no significant increase in compile time
    either.](https://llvm-compile-time-tracker.com/compare.php?from=bdf02249e7f8f95177ff58c881caf219699acb98&to=e1c1c1759c06bc4c42f79eebdb0e3cd45219cef4&stat=instructions:u)
    
    rdar://134675402
    rastogishubham authored Sep 10, 2024
    7a91af4
  98. [bazel] Add CGData targets/deps (llvm#108070)

    This is newly used as of 0f52545.
    
    The bulk of the targets were added earlier in
    9bb5556.
    rupprecht authored Sep 10, 2024
    feeb6aa
  99. [RISCV] Fix fneg.d/fabs.d aliasing handling for Zdinx. Add missing fm…

    …v.s/d aliases.
    
    We were missing test coverage for fneg.d/fabs.d for Zdinx. Adding it
    revealed that the aliases only worked on RV64. The assembler was not
    creating a GPRPair register class on RV32 so the alias couldn't match.
    The disassembler was also not using GPRPair registers preventing the
    aliases from printing in disassembly too.
    
    I've fixed the assembler by adding new parsing methods in an attempt
    to get decent diagnostics. This is hard since the mnemonics are
    ambiguous between D and Zdinx. Tests have been adjusted for some
    differences in what errors are reported first.
    topperc committed Sep 10, 2024
    5537ae8
  100. [lldb-dap] Improve stackTrace and exceptionInfo DAP request handl…

    …ers (llvm#105905)
    
    Refactor `stackTrace` to perform frame lookups in a more on-demand
    fashion to improve overall performance.
    
    Additionally, add more information to the `exceptionInfo`
    request to report exception stacks there instead of merging the
    exception stack into the stack trace. The `exceptionInfo` request is
    only called if a stop event occurs with `reason='exception'`, which
    should mitigate the performance cost of `SBThread::GetCurrentException`
    calls.
    
    Add unit tests for exception handling and stack trace support.
    ashgti authored Sep 10, 2024
    5b4100c
  101. [DirectX] Add DirectXTargetCodeGenInfo (llvm#104856)

    Adds target codegen info class for DirectX. For now it always translates
    `__hlsl_resource_t` handle to `target("dx.TypedBuffer", i32, 1, 0, 1)`
    (`RWBuffer<int>`). More work is needed to determine the actual target
    ext type and parameters based on the resource handle attributes.
    
    Part 1/2 of llvm#95952
    hekota authored Sep 10, 2024
    becb03f
  102. [Coroutines] Move spill related methods to a Spill utils (llvm#107884)

    * Move code related to spilling into SpillUtils to help clean up
    CoroFrame
    
    See RFC for more info:
    https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
    TylerNowicki authored Sep 10, 2024
    f4e2d7b
  103. [Coroutines] Split buildCoroutineFrame

    * Split buildCoroutineFrame into code related to normalization and code
      related to actually building the coroutine frame.
    * This will enable future specialization of buildCoroutineFrame for
      different ABIs.
    tnowicki committed Sep 10, 2024
    34559ad