[llvm/llvm-project][Coroutines] Split buildCoroutineFrame #2
Commits on Sep 9, 2024
-
[Clang][Sema] Use the correct lookup context when building overloaded 'operator->' in the current instantiation (llvm#104458)
Currently, clang erroneously rejects the following:
```
struct A {
  template<typename T>
  void f();
};

template<typename T>
struct B {
  void g() {
    (*this)->template f<int>(); // error: no member named 'f' in 'B<T>'
  }

  A* operator->();
};
```
This happens because `Sema::ActOnStartCXXMemberReference` does not adjust the `ObjectType` parameter when `ObjectType` is a dependent type (except when the type is a `PointerType` and the class member access is the `->` form). Since the (possibly adjusted) `ObjectType` parameter (`B<T>` in the above example) is passed to `Parser::ParseOptionalCXXScopeSpecifier`, we end up looking up `f` in `B` rather than `A`.
This patch fixes the issue by identifying cases where the type of the object expression `T` is a dependent, non-pointer type and:
- `T` is the current instantiation and lookup for `operator->` finds a member of the current instantiation, or
- `T` has at least one dependent base class, and `operator->` is not found in the current instantiation
and using `ASTContext::DependentTy` as the type of the object expression when the optional _nested-name-specifier_ is parsed.
Fixes llvm#104268.
Commit 3cdb30e
-
[analyzer] fix crash on binding to symbolic region with `void *` type (llvm#107572)
As reported in llvm#103714 (comment), CSA crashes when trying to bind a value to a symbolic region of type `void *`. This happens when such a region is passed as an inline asm input and the engine tries to bind `UnknownVal` to that region. Fix it by changing the type from void to char before calling `GetElementZeroRegion`.
Commit db6051d
-
Commit 95753ff
-
[lld][WebAssembly] Fix use of uninitialized stack data with --wasm64 (llvm#107780)
In the case of `--wasm64` we were setting the type of the init expression to be 64-bit but were only setting the low 32 bits of the value (by assigning to Int32). Fixes: emscripten-core/emscripten#22538
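The failure mode described here can be shown in miniature. The sketch below is not lld's code: `InitValue`, `setLow32Only`, `setFull64`, `readAs64`, and `staleSlot` are hypothetical names, and the `0xAB` fill only simulates leftover stack contents. It illustrates why writing a 32-bit field of a slot that is later read as 64-bit leaves the other half unspecified.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an init-expression value slot; not lld's actual type.
struct InitValue {
  unsigned char Bytes[8]; // raw storage shared by the 32-bit and 64-bit views
};

// Buggy pattern: only four bytes are written, so the other four keep
// whatever was already in the storage.
inline void setLow32Only(InitValue &V, int32_t X) {
  std::memcpy(V.Bytes, &X, sizeof(X));
}

// Fixed pattern: all eight bytes are written.
inline void setFull64(InitValue &V, int64_t X) {
  std::memcpy(V.Bytes, &X, sizeof(X));
}

// Reads the slot through its 64-bit view.
inline int64_t readAs64(const InitValue &V) {
  int64_t Out;
  std::memcpy(&Out, V.Bytes, sizeof(Out));
  return Out;
}

// Fills the slot with a recognizable pattern to simulate stale stack data.
inline InitValue staleSlot() {
  InitValue V;
  std::memset(V.Bytes, 0xAB, sizeof(V.Bytes));
  return V;
}
```

Calling `setLow32Only` on a slot returned by `staleSlot()` and then reading it with `readAs64` yields a value polluted by the stale bytes, while `setFull64` round-trips the value exactly.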
Commit 5c8fd1e
-
[CMake] Passthrough variables for packages to subbuilds (llvm#107611)
These packages are imported by LLVMConfig.cmake, so we should pass the necessary variables through from the parent build into the subbuilds. We use `CMAKE_CACHE_DEFAULT_ARGS` so subbuilds can override these variables if needed.
Commit 60f052e
-
[NFCI][BitcodeReader]Read real GUID from VI as opposed to storing it in map (llvm#107735)
Currently, `ValueIdToValueInfoMap` [1] stores `std::tuple<ValueInfo, GlobalValue::GUID /* original GUID */, GlobalValue::GUID /* real GUID*/ >`. This change updates the stored value type to `std::pair<ValueInfo, GlobalValue::GUID /* original GUID */>`, and reads real GUID from ValueInfo.
When an entry is inserted into `ValueIdToValueInfoMap`, ValueInfo is created or inserted using real GUID [2]. ValueInfo keeps a pointer to GlobalValueMap [3], using either `GUID` or `{GUID, Name}` [4] when reading per-module summaries to create a combined summary.
[1] owned by per module-summary bitcode reader https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L947-L950
[2] [first](https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L7130-L7133), [second](https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L7221-L7222), [third](https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/lib/Bitcode/Reader/BitcodeReader.cpp#L7622-L7623)
[3] https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1427-L1431
[4] https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1631 and https://github.com/llvm/llvm-project/blob/caebb4562ce634a22f7b13480b19cffc2a6a6730/llvm/include/llvm/IR/ModuleSummaryIndex.h#L1621
Co-authored-by: Kazu Hirata
Commit 7d37172
-
[LTO] Simplify calculateCallGraphRoot (NFC) (llvm#107765)
The function returns an instance of FunctionSummary populated by calculateCallGraphRoot regardless of whether Edges is empty or not.
Commit c36c462
-
Commit 95831f0
-
[flang][OpenMP] Don't abort when default is used on an invalid directive (llvm#107586)
The previous assert was not considering programs with semantic errors.
Fixes llvm#107495
Fixes llvm#93437
Commit 7f90479
-
Commit 7543d09
-
[flang][cuda] Support c_devptr in c_f_pointer intrinsic (llvm#107470)
This is an extension of CUDA Fortran. The iso_c_binding intrinsic can accept a `TYPE(c_devptr)` as its first argument. This patch relaxes the semantic check to accept it and updates the lowering to unwrap the cptr field from the c_devptr.
Commit cd8229b
-
Fix implicit conversion rank ordering (llvm#106811)
DXC prefers dimension-preserving conversions over precision-losing conversions. This means a double4 -> float4 conversion is preferred over a double4 -> double3 or double4 -> double conversion.
Commit 6cc0138
-
This patch fixes: llvm/lib/Target/ARM/MCTargetDesc/ARMBaseInfo.h:214:5: error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]
Commit 34e3007
-
[HLSL] Implement support for HLSL intrinsic - select (llvm#107129)
Implement support for HLSL intrinsic select. This would close issue llvm#75377
Commit 0f349b7
-
[flang][Driver] Support -Xlinker in flang (llvm#107472)
Partially addresses: llvm#89888
Commit 5f74671
-
[Coverage] Ignore unused functions if the count is 0. (llvm#107661)
Relax the condition to ignore the case when count is 0. This fixes a bug on llvm@381e9d2. This was reported at https://discourse.llvm.org/t/coverage-from-multiple-test-executables/81024/.
Commit 6850410
-
[CUDA/HIP] propagate -cuid to a host-only compilation. (llvm#107483)
Right now we're bailing out too early, and `-cuid` does not get set for the host-only compilations.
Commit 4a501a4
-
Commit 2773719
-
Re-apply "[NFCI][LTO][lld] Optimize away symbol copies within LTO global resolution in ELF" (llvm#107792)
Fix the use-after-free bug and re-apply llvm#106193
* Without the fix, the string referenced by `objSym.Name` could be destroyed even if the string saver keeps a copy of the referenced string. This caused a use-after-free.
* The fix ([latest commit](llvm@9776ed4)) updates `objSym.Name` to reference (via `StringRef`) the string saver's copy.
Test:
1. For `lld/test/ELF/lto/asmundef.ll`, its test failure is reproducible with `-DLLVM_USE_SANITIZER=Address` and gone with the fix.
2. Run all tests by following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild#try-local-changes.
   * Without the fix, `ELF/lto/asmundef.ll` aborted the multi-stage test at `@@@BUILD_STEP stage2/asan_ubsan check@@@`, defined [here](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh#L30)
   * With the fix, the [multi-stage test](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh) passes stage2 {asan, ubsan, masan}. This is also the test used by https://lab.llvm.org/buildbot/#/builders/169
**Original commit message**
`StringMap<T>` creates a [copy of the string](https://github.com/llvm/llvm-project/blob/d4c519e7b2ac21350ec08b23eda44bf4a2d3c974/llvm/include/llvm/ADT/StringMapEntry.h#L55-L58) for entry insertions and intentionally keeps copies [since the implementation optimizes string memory usage](https://github.com/llvm/llvm-project/blob/d4c519e7b2ac21350ec08b23eda44bf4a2d3c974/llvm/include/llvm/ADT/StringMap.h#L124). On the other hand, the linker keeps copies of symbol names [1] in `lld::elf::parseFiles` [2] before invoking `compileBitcodeFiles` [3].
This change proposes to optimize away string copies inside [LTO::GlobalResolutions](https://github.com/llvm/llvm-project/blob/24e791b4164986a1ca7776e3ae0292ef20d20c47/llvm/include/llvm/LTO/LTO.h#L409), which will make LTO indexing more memory efficient for ELF. There are similar opportunities for other (COFF, wasm, MachO) formats.
The optimization takes place for lld (ELF) only. For the rest of the use cases (gold plugin, `llvm-lto2`, etc), LTO owns a string saver to keep copies and uses the global resolution key for de-duplication.
Together with @kazutakahirata's work to make `ComputeCrossModuleImport` more memory efficient, we see a ~20% peak memory usage reduction in a binary where peak memory usage needs to go down. Thanks to the optimization in llvm@329ba52, the max (as opposed to the sum) of `ComputeCrossModuleImport` or `GlobalResolution` shows up in peak memory usage.
* Regarding correctness, the set of [resolved](https://github.com/llvm/llvm-project/blob/80c47ad3aec9d7f22e1b1bdc88960a91b66f89f1/llvm/lib/LTO/LTO.cpp#L739) [per-module symbols](https://github.com/llvm/llvm-project/blob/80c47ad3aec9d7f22e1b1bdc88960a91b66f89f1/llvm/include/llvm/LTO/LTO.h#L188-L191) is a subset of [llvm::lto::InputFile::Symbols](https://github.com/llvm/llvm-project/blob/80c47ad3aec9d7f22e1b1bdc88960a91b66f89f1/llvm/include/llvm/LTO/LTO.h#L120). And bitcode symbol parsing saves the symbol name when iterating `obj->symbols` in `BitcodeFile::parse` already. This change updates `BitcodeFile::parseLazy` to keep copies of per-module undefined symbols.
* Presumably the set of undefined symbols in an LTO unit (copied in this patch in the linker's unique saver) is small compared with the set of symbols in global-resolution (copied before this patch), making this a worthwhile trade-off. Benchmarking this change alone shows measurable memory savings across various benchmarks.
[1] ELF https://github.com/llvm/llvm-project/blob/1cea5c2138bef3d8fec75508df6dbb858e6e3560/lld/ELF/InputFiles.cpp#L1748
[2] https://github.com/llvm/llvm-project/blob/ef7b18a53c0d186dcda1e322be6035407fdedb55/lld/ELF/Driver.cpp#L2863
[3] https://github.com/llvm/llvm-project/blob/ef7b18a53c0d186dcda1e322be6035407fdedb55/lld/ELF/Driver.cpp#L2995
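The lifetime issue behind the use-after-free can be sketched with standard-library types. This is an illustration only: `Saver`, `Symbol`, and `internName` are hypothetical names, `std::string_view` stands in for `StringRef`, and the real lld and LTO data structures are more involved. The point is that a non-owning name must be re-pointed at the saver's copy, since merely having a copy in the saver does not keep the original buffer alive.

```cpp
#include <deque>
#include <string>
#include <string_view>

// Hypothetical stand-in for a string saver: owns copies and hands out views.
class Saver {
  // std::deque keeps references to existing elements valid on emplace_back.
  std::deque<std::string> Storage;

public:
  std::string_view save(std::string_view S) {
    Storage.emplace_back(S);
    return Storage.back();
  }
};

// Hypothetical symbol record holding a non-owning name.
struct Symbol {
  std::string_view Name;
};

// Re-points Sym.Name at the saver's copy so it no longer depends on the
// lifetime of the buffer the name was originally read from.
inline void internName(Symbol &Sym, Saver &S) {
  Sym.Name = S.save(Sym.Name);
}
```

After `internName`, the symbol's name stays valid for as long as the `Saver` lives, even if the buffer it was first read from has been destroyed.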
Commit 09b231c
-
[libc++] Cache file attributes during directory iteration (llvm#93316)
This patch adds caching of file attributes during directory iteration on Windows. This improves the performance when working with files being iterated on in a directory.
Commit b1b9b7b
-
[clang, hexagon] Update copyright, license text (llvm#107161)
When this file was first contributed - `28b01c59c93d ([hexagon] Add {hvx,}hexagon_{protos,circ_brev...}, 2021-06-30)` - I incorrectly included a QuIC copyright statement with "All rights reserved". I should have contributed this file with the `Apache+LLVM exception` license.
Commit 048e46a
-
Remove unused imports from python files in the MLGO library.
Commit 02fff93
-
Revert "[Coverage] Ignore unused functions if the count is 0." (llvm#107901)
Reverts llvm#107661
Breaks llvm-project/llvm/unittests/ProfileData/CoverageMappingTest.cpp
Commit a7c26aa
-
[MLGO] Fix logging verbosity in scripts (llvm#107818)
This patch fixes issues related to logging verbosity in the MLGO python scripts. This was an oversight when converting from absl.logging to the python logging API as absl natively supports a --verbosity flag to set the desired logging level. This patch adds a flag to support similar functionality in Python's logging library and additionally updates docstrings where relevant to point to the new values.
Commit 99ea357
-
[NFC][TableGen] DirectiveEmitter code cleanup (llvm#107775)
Eliminate unnecessary llvm:: prefix as this code is in llvm namespace. Use ArrayRef<> instead of std::vector references when appropriate. Use .empty() instead of .size() == 0.
Commit 78c1009
-
[SystemZ][z/OS] Enable lit testing for z/OS (llvm#107631)
This patch fixes various errors to enable llvm-lit to run on z/OS
Commit eec1ee8
-
Commit 6776d65
-
Commit ab82f83
-
Revert "[Clang][Sema] Use the correct lookup context when building overloaded 'operator->' in the current instantiation (llvm#104458)"
This reverts commit 3cdb30e. Breaks clang bootstrap.
Commit 3681d85
-
Commit 98815f7
-
[z/OS] Set the default arch for z/OS to be arch10 (llvm#89854)
The default arch level on z/OS is arch10. Update the code so z/OS has arch10 without changing the default for zLinux.
Commit e62bf7c
-
Commit b3d2d50
-
[TableGen] Migrate CodeGenHWModes to use const RecordKeeper (llvm#107851)
Migrate CodeGenHWModes to use const RecordKeeper and const Record pointers. This is part of an effort to improve const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
Commit 985600d
-
[DirectX] Lower `@llvm.dx.typedBufferLoad` to DXIL ops
The `@llvm.dx.typedBufferLoad` intrinsic is lowered to `@dx.op.bufferLoad`. There's some complexity here in translating to scalarized IR, which I've abstracted out into a function that should be useful for samples, gathers, and CBuffer loads.
I've also updated the DXILResources.rst docs to match what I'm doing here and the proposal in llvm/wg-hlsl#59. I've removed the content about stores and raw buffers for now with the expectation that it will be added along with the work.
Note that this change includes a bit of a hack in how it deals with `getOverloadKind` for the `dx.ResRet` types - we need to adjust how we deal with operation overloads to generate a table directly rather than proxy through the OverloadKind enum, but that's left for a later change here.
Part of llvm#91367
Pull Request: llvm#104252
Commit 3f22756
-
[VPlan] Consistently use VTC for vector trip count in vplan-printing.ll.
The inconsistency surfaced in llvm#95305. Split off to reduce the diff.
Commit 3403438
-
Reland [asan][windows] Eliminate the static asan runtime on windows (llvm#107899)
This reapplies 8fa66c6 ([asan][windows] Eliminate the static asan runtime on windows) for a second time.
That PR bounced off the tests because it caused failures in the other sanitizer runtimes. These have been fixed by only building interception, sanitizer_common, and asan with /MD, and continuing to build the rest of the runtimes with /MT. This does mean that any usage of the static ubsan/fuzzer/etc runtimes will mean you're mixing different runtime library linkages in the same app. The interception, sanitizer_common, and asan runtimes are designed for this; however, it does result in some linker warnings.
Additionally, it turns out that when building in release mode with LLVM_ENABLE_PDBs the build system forced /OPT:ICF. This totally breaks asan's "new" method of doing "weak" functions on windows, and so /OPT:NOICF was explicitly added to asan's link flags.
Co-authored-by: Amy Wishnousky
Commit 53a81d4
-
[SandboxIR] Add missing VectorType functions (llvm#107650)
Fills in many missing functions from VectorType
Commit 6f8d278
-
[scudo] Add fragmentation info for each memory group (llvm#107475)
This information helps with tuning the heuristic of selecting memory groups to release the unused pages.
Commit d9a9960
-
[LTO] Fix a use-after-free in legacy LTO C APIs (llvm#107896)
Fix a bug that `lto_runtime_lib_symbols_list` is returning the address of a local variable that will be freed when getting out of scope. This is a regression from llvm#98512 that rewrites the runtime libcall function lists into a SmallVector. rdar://135559037
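The bug class described here, returning a pointer into a local container, can be sketched as follows. This is not the actual patch: `buildSymbolList` and `symbolList` are hypothetical names, the symbol names are placeholders, and keeping the storage in a function-local static is only one of several ways to give the buffer a long enough lifetime.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper that builds the list; stands in for the real query.
inline std::vector<const char *> buildSymbolList() {
  return {"memcpy", "memset", "memmove"}; // placeholder names
}

// Buggy shape, left as a comment because calling it is undefined behavior:
//   const char *const *list(std::size_t *Size) {
//     std::vector<const char *> Local = buildSymbolList();
//     *Size = Local.size();
//     return Local.data(); // Local is destroyed when the function returns
//   }

// One possible fix: the storage outlives every call because it is static.
inline const char *const *symbolList(std::size_t *Size) {
  static const std::vector<const char *> Storage = buildSymbolList();
  *Size = Storage.size();
  return Storage.data();
}
```

A C API that hands out raw pointers has to document who owns the buffer; with the static storage above, callers must not free it and can rely on it until program exit.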
Commit 66e9078
-
[SPIRV] Add sign intrinsic part 1 (llvm#101987)
partially fixes llvm#70078
### Changes
- Added `int_spv_sign` intrinsic in `IntrinsicsSPIRV.td`
- Added lowering and map to `int_spv_sign` in `SPIRVInstructionSelector.cpp`
- Added SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/hlsl-intrinsics/sign.ll`
### Related PRs
- llvm#101988
- llvm#101989
Commit a9a5a18
-
[TableGen] Change CGIOperandList::OperandInfo::Rec to const pointer (llvm#107858)
Change CGIOperandList::OperandInfo::Rec and CGIOperandList::TheDef to const pointer. This is part of an effort to improve const correctness in TableGen backends: https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
Commit bdf0224
-
[SandboxVec] Implement Pass class (llvm#107617)
This patch implements the Pass base class and the FunctionPass sub-class that operate on Sandbox IR.
Commit f12e10b
-
[NVPTX] Restrict combining to properly aligned v16i8 vectors. (llvm#107919)
Fixes generation of invalid loads leading to misaligned access errors. The bug got exposed by SLP vectorizer change ec360d6, which allowed SLP to produce `v16i8` vectors. Also updated the tests to use the automatic check generator.
Commit 26b786a
-
Commit d148a1a
-
[X86] Handle shifts + and in `LowerSELECTWithCmpZero`
Shifts are the same as sub, where rhs == 0 is the identity. `and` is the inverted case, where: `SELECT (AND(X,1) == 0), (AND Y, Z), Y` -> `(AND Y, (OR NEG(AND(X, 1)), Z))`, with -1 as the identity. Closes llvm#107910
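The `and` rewrite can be checked with ordinary integer arithmetic. The sketch below mirrors the two expressions from the commit message on 32-bit unsigned values; `selectForm` and `maskedForm` are illustrative names, not functions in the X86 backend.

```cpp
#include <cstdint>

// Original form: SELECT (AND(X,1) == 0), (AND Y, Z), Y
inline uint32_t selectForm(uint32_t X, uint32_t Y, uint32_t Z) {
  return ((X & 1u) == 0) ? (Y & Z) : Y;
}

// Rewritten form: AND Y, (OR NEG(AND(X,1)), Z)
inline uint32_t maskedForm(uint32_t X, uint32_t Y, uint32_t Z) {
  // Mask is 0 when the low bit of X is clear and all ones (-1) when it is set.
  uint32_t Mask = 0u - (X & 1u);
  return Y & (Mask | Z);
}
```

When the low bit of `X` is set, the mask is all ones, which is the identity for `and`, so the result is `Y`. When it is clear, the mask contributes nothing and the result is `Y & Z`.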
Commit 88bd507
-
[PAC] Make __is_function_overridden pauth-aware on ELF platforms (llvm#107498)
Apparently, there are two almost identical implementations: one for MachO and another one for ELF. The ELF bits somehow slipped while llvm#84573 was reviewed. The particular implementation is identical to the MachO case.
Commit 33c1325
-
[SandboxIR] Implement UndefValue (llvm#107628)
This patch implements sandboxir::UndefValue mirroring llvm::UndefValue.
Commit ae02211
Commits on Sep 10, 2024
-
Commit 81ef8e2
-
[NVPTX] Support copysign PTX instruction (llvm#107800)
Lower `fcopysign` SDNodes into `copysign` PTX instructions where possible. See [PTX ISA: 9.7.3.2. Floating Point Instructions: copysign](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#floating-point-instructions-copysign).
Commit b0d2411
-
[ctx_prof] Insert the ctx prof flattener after the module inliner (llvm#107499)
This patch enables experimenting with the contextual profile. ICP is currently disabled in this case and will be re-enabled subsequently. The inline cost model / decision making will also be updated later to be context-aware. Right now, this just achieves "complete use" of the profile, in that it's ingested, maintained, and sunk to a flat profile when not needed anymore.
Issue llvm#89287
Commit 3b22618
-
[mlir][linalg][NFC] Drop redundant rankReductionStrategy (llvm#107875)
This patch drops the redundant rankReductionStrategy in `populateFoldUnitExtentDimsViaSlicesPatterns` and fixes comment typos.
Commit f3b4e47
-
[LoongArch][ISel] Check the number of sign bits in `PatGprGpr_32` (llvm#107432)
After llvm#92205, LoongArch ISel selects `div.w` for `trunc i64 (sdiv i64 3202030857, (sext i32 X to i64)) to i32`. This is incorrect since `3202030857` is not a signed 32-bit constant. It produces the wrong result when `X == 2`: https://alive2.llvm.org/ce/z/pzfGZZ
This patch adds additional `sexti32` checks to operands of `PatGprGpr_32`. Alive2 proof: https://alive2.llvm.org/ce/z/AkH5Mp
Fix llvm#107414.
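The miscompile can be reproduced with plain integer arithmetic. In the sketch below, `wideDivThenTrunc` models the IR (64-bit signed divide, then truncate), `narrowDiv` models what a 32-bit divide computes, and `fitsInSigned32` is the kind of precondition the added check enforces. All three names are illustrative, and the narrowing casts assume the usual two's-complement wraparound (guaranteed from C++20, implementation-defined but universal before that).

```cpp
#include <cstdint>

// What the IR computes: a 64-bit signed divide, then truncation to 32 bits.
inline int32_t wideDivThenTrunc(int64_t Dividend, int32_t X) {
  return static_cast<int32_t>(Dividend / static_cast<int64_t>(X));
}

// What a 32-bit divide computes: the dividend is narrowed to 32 bits first.
inline int32_t narrowDiv(int64_t Dividend, int32_t X) {
  return static_cast<int32_t>(Dividend) / X;
}

// True when the value survives a round trip through int32_t, i.e. it is
// already a sign-extended 32-bit quantity.
inline bool fitsInSigned32(int64_t V) {
  return V == static_cast<int64_t>(static_cast<int32_t>(V));
}
```

For the constant from the commit message, `fitsInSigned32(3202030857)` is false, and with `X == 2` the two division functions disagree. For operands that do fit in a signed 32-bit value, the narrow divide matches the wide one apart from the overflowing `INT32_MIN / -1` case.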
Commit a111f91
-
[NFC][TableGen] Simplify DirectiveEmitter using range for loops (llvm#107909)
Make constructors that take const Record * implicit, allowing us to simplify some range-based loops to use that class instance as the loop variable. Change remaining constructor calls to use () instead of {} to construct objects.
Commit f7479b5
-
Commit e64a1c0
-
[LoongArch] Codegen for concat_vectors with LASX
Fixes: llvm#107355 Reviewed By: SixWeining Pull Request: llvm#107523
Commit 1ca411c
-
[bazel][libc][NFC] Add missing layering deps (llvm#107947)
After 2773719, e.g.
```
external/llvm-project/libc/test/src/math/smoke/NextTowardTest.h:12:10: error: module llvm-project//libc/test/src/math/smoke:nexttowardf_test does not depend on a module exporting 'src/__support/CPP/bit.h'
```
Commit 7a8e9df
-
[LLVM][Coroutines] Switch CoroAnnotationElidePass to a FunctionPass (llvm#107897)
After landing llvm#99285 we found that the call graph update was causing the following crash when expensive checks are turned on:
```
llvm-project/llvm/lib/Analysis/CGSCCPassManager.cpp:982: LazyCallGraph::SCC &updateCGAndAnalysisManagerForPass(LazyCallGraph &, LazyCallGraph::SCC &, LazyCallGraph::Node &, CGSCCAnalysisManager &, CGSCCUpdateResult &, FunctionAnalysisManager &, bool): Assertion `(RC == &TargetRC || RC->isAncestorOf(TargetRC)) && "New call edge is not trivial!"' failed.
```
I have to admit I believe that the call graph update process I did for that patch could be wrong. After reading the code in `CGSCCToFunctionPassAdaptor`, I am convinced that `CoroAnnotationElidePass` can be a FunctionPass and rely on the adaptor to update the call graph for us, so long as we properly invalidate the caller's analyses.
After this patch, `llvm/test/Transforms/Coroutines/coro-transform-must-elide.ll` no longer fails under expensive checks.
Commit 761bf33
-
[Fuzzer] Passthrough zlib CMake paths into the test (llvm#107926)
We shouldn't assume that we're using the system zlib installation.
Commit eb0e4b1
-
[ValueTracking] Infer is-power-of-2 from assumptions. (llvm#107745)
This patch tries to infer is-power-of-2 from assumptions. I don't see that this kind of assumption exists in my dataset. Related issue: rust-lang/rust#129795 Close llvm#58996.
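The commit message does not say which assumption shapes are recognized. As a sketch under the assumption that an assumed fact looks like "the population count of x is 1", the code below shows why such a fact is equivalent to the usual power-of-two test. `popCount`, `hasSingleBit`, and `isPowerOfTwo` are hypothetical helpers, not ValueTracking APIs.

```cpp
#include <cstdint>

// Counts set bits with a portable loop (no compiler builtins assumed).
inline unsigned popCount(uint64_t V) {
  unsigned N = 0;
  for (; V != 0; V &= V - 1) // each step clears the lowest set bit
    ++N;
  return N;
}

// The fact an assumption of the form "popcount(x) == 1" would establish.
inline bool hasSingleBit(uint64_t V) { return popCount(V) == 1; }

// The classic power-of-two test; zero is excluded explicitly.
inline bool isPowerOfTwo(uint64_t V) { return V != 0 && (V & (V - 1)) == 0; }
```

Since the two predicates agree on every input, an analysis that knows the first holds can answer "is this a power of two?" without inspecting how the value was computed.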
Commit ffcff4a
-
[clang] fix half && bfloat16 convert node expr codegen (llvm#89051)
Data type conversion between fp16 and bf16 will generate fptrunc and fpextend nodes, but they are actually bitcast nodes.
Commit 56905da
-
[clang][HLSL] Add sign intrinsic part 3 (llvm#101989)
partially fixes llvm#70078
### Changes
- Implemented `sign` clang builtin
- Linked `sign` clang builtin with `hlsl_intrinsics.h`
- Added sema checks for `sign` to `CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- Add codegen for `sign` to `EmitHLSLBuiltinExpr` in `CGBuiltin.cpp`
- Add codegen tests to `clang/test/CodeGenHLSL/builtins/sign.hlsl`
- Add sema tests to `clang/test/SemaHLSL/BuiltIns/sign-errors.hlsl`
### Related PRs
- llvm#101987
- llvm#101988
### Discussion
- Should there be a `usign` intrinsic that handles the unsigned cases?
Commit dce5039
-
Commit 02ab435
-
[ORC] Remove EDU from dependants list of dependencies before destroying.
Dependant lists hold raw pointers back to EDUs that depend on them. We need to remove these entries before destroying the EDU or we'll be left with a dangling reference that can result in use-after-free bugs. No testcase: This has only been observed in multi-threaded setups that reproduce the issue inconsistently. rdar://135403614
Commit 7034ec4
-
Commit 094e6b8
-
[LLDB][Minidump] Support minidumps where there are multiple exception streams (llvm#97470)
Currently, LLDB assumes all minidumps will have unique sections. This is intuitive because almost all of the minidump sections are themselves lists. Exceptions including Signals are unique in that they are all individual sections with their own directory.
This means LLDB fails to load minidumps with multiple exceptions due to them not being unique. This behavior is erroneous and this PR introduces support for an arbitrary number of exception streams. Additionally, stop info was calculated only for a single thread before, and now we properly support mapping exceptions to threads.
~~This PR is starting in DRAFT because implementing testing is still required.~~
Commit 4926835
[clang][bytecode] Fix local destructor order (llvm#107951)
Add appropriate scopes and use reverse-order iteration in LocalScope::emitDestructors().
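For reference, the language rule that a constant interpreter has to reproduce can be observed in ordinary C++. The sketch below is not taken from the patch; `Tracer` and `destructionOrder` are hypothetical names used only to record the order in which local destructors run.

```cpp
#include <cassert>
#include <string>

// Appends its name to a shared log when destroyed.
struct Tracer {
  std::string &Log;
  char Name;
  Tracer(std::string &Log, char Name) : Log(Log), Name(Name) {}
  ~Tracer() { Log.push_back(Name); }
};

// Constructs four locals, one of them in a nested scope, and returns the
// order in which their destructors ran.
inline std::string destructionOrder() {
  std::string Log;
  {
    Tracer A(Log, 'a');
    Tracer B(Log, 'b');
    {
      Tracer C(Log, 'c'); // the inner scope ends first, so c is destroyed first
    }
    Tracer D(Log, 'd');
  } // the remaining locals are destroyed in reverse order: d, b, a
  return Log;
}
```

Iterating over a scope's locals front to back would produce `a, b, d` for the outer scope instead, which is why reverse-order iteration matters.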
Commit 3928ede
[ORC-RT] Replace FnTag arg of WrapperFunction::call with generic disp…
…atch arg. This decouples function argument serialization / deserialization from the function call dispatch mechanism. This will eventually allow us to replace the existing __orc_rt_jit_dispatch function with a system that supports pre-linking parts of the ORC runtime into the executor.
Commit 462251b
Commit 9b67c99
[RISCV] Constrain passthru regclass in vmerge -> vmv peephole
In llvm#107827 we now set true's passthru to the false operand if it was undef. We need to remember to also constrain the regclass in case true is a masked pseudo which needs its passthrus to be in VR[M*]NoV0
Commit b71d88c
Revert "[RISCV] Update V0Defs after moving Src in peepholes (llvm#107359)"
This fixes llvm#107950 and adds a test case for it. The issue was due to us incorrectly assuming that we stored a V0Defs entry for every single instruction. We actually only store them for instructions that use V0, so when we updated the V0Def after moving we sometimes ended up copying nullptr over from an instruction that doesn't use V0 and clearing the V0Def entry inadvertently. Because we don't have V0Defs on instructions that don't use V0, the FIXME was never actually needed in the first place since the bookkeeping wasn't out of sync to begin with. That commit also mentioned that a future unmasked to masked pseudo peephole might need unmasked pseudos to have V0Defs entries, but after working on this locally it turns out we don't. This reverts commit ce36480.
Commit 7ba6768
[libc++][string] Remove potential non-trailing 0-length array (llvm#1…
…05865) It is a violation of the standard to use 0 length arrays, especially when not at the end of a structure (not a FAM GNU extension). Compilers generally accept it, but it's probably better to have a conforming implementation. --------- Co-authored-by: Louis Dionne <[email protected]>
Commit ed0da00
Commit 06c3311
[GlobalIsel] Update MIR gallery (llvm#107903)
Add more patterns and clarify wip_match_opcode usage.
Commit bece0d7
[llvm][Support] Determine the max thread length on Haiku (llvm#107801)
Haiku has pthread_setname_np() / pthread_getname_np().
Commit 1c334de
Revert "[llvm-ml] Fix RIP-relative addressing for ptr operands (llvm#…
…107618)" This reverts commit 7543d09.

This change caused failed asserts when building the openmp assembly sources, reproducible with:

    $ llvm-ml -m64 -D_M_AMD64 -c -Fo out.obj openmp/runtime/src/z_Windows_NT-586_asm.asm
    llvm-ml: ../lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp:624: void {anonymous}::X86MCCodeEmitter::emitMemModRMByte(const llvm::MCInst&, unsigned int, unsigned int, uint64_t, {anonymous}::PrefixKind, uint64_t, llvm::SmallVectorImpl<char>&, llvm::SmallVectorImpl<llvm::MCFixup>&, const llvm::MCSubtargetInfo&, bool) const: Assertion `IndexReg.getReg() == 0 && !ForceSIB && "Invalid rip-relative address"' failed.

The assert can also be triggered with one lone instruction:

    lea rdx, QWORD PTR [rax*8+16]
Commit 1581183
[MLIR] Make `resolveCallable` customizable in `CallOpInterface` (llvm#100361)
Allow customization of the `resolveCallable` method in the `CallOpInterface`. This change allows for operations implementing this interface to provide their own logic for resolving callables.

- Introduce the `resolveCallable` method, which does not include the optional symbol table parameter. This method replaces the previously existing extra class declaration `resolveCallable`.
- Introduce the `resolveCallableInTable` method, which incorporates the symbol table parameter. This method replaces the previous extra class declaration `resolveCallable` that used the optional symbol table parameter.
Commit 958f59d
[MLIR][NVVM] Add support for nvvm.breakpoint Op (llvm#107193)
This commit adds support for `nvvm.breakpoint` Op which lowers to the PTX brkpt instruction. Also, added the respective tests in `nvvmir.mlir`
Commit 831236e
Revert "[ORC-RT] Replace FnTag arg of WrapperFunction::call with gene…
…ric dispatch arg." This reverts commit 462251b. This reverts commit 9b67c99. Build fails for compiler-rt/lib/orc/tests/unit/wrapper_function_utils_test.cpp https://buildkite.com/llvm-project/upstream-bazel/builds/109731#0191da59-6710-4420-92ef-aa6e0355cb2c
Commit 53d35c4
Revert "[MLIR] Make `resolveCallable` customizable in `CallOpInterface`" (llvm#107984)
Reverts llvm#100361. This commit caused some linker errors (missing `MLIRCallInterfaces` dependency).
Commit 7574042
[mlir][SME] Update E2E test to show optional loop optimisation (NFC) (l…
…lvm#107585) Introduces loop hoisting to the ArmSME E2E tests so that the tile load is hoisted out of the loop, which offers a significant speedup. Discussed here: https://discourse.llvm.org/t/mlir-for-arm-sme-reducing-tile-data-transfers/80065/2
Commit 8aeb104
[DAG] expandAVG - consistently use getShiftAmountConstant for constan…
…t shift amounts. NFC
Commit 7e07c1d
[MLIR] Add f6E3M2FN type (llvm#105573)
This PR adds `f6E3M2FN` type to mlir.

`f6E3M2FN` type is proposed in [OpenCompute MX Specification](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf). It defines a 6-bit floating point number with bit layout S1E3M2. Unlike IEEE-754 types, there are no infinity or NaN values.

```c
f6E3M2FN
- Exponent bias: 3
- Maximum stored exponent value: 7 (binary 111)
- Maximum unbiased exponent value: 7 - 3 = 4
- Minimum stored exponent value: 1 (binary 001)
- Minimum unbiased exponent value: 1 − 3 = −2
- Has Positive and Negative zero
- Doesn't have infinity
- Doesn't have NaNs

Additional details:
- Zeros (+/-): S.000.00
- Max normal number: S.111.11 = ±2^(4) x (1 + 0.75) = ±28
- Min normal number: S.001.00 = ±2^(-2) = ±0.25
- Max subnormal number: S.000.11 = ±2^(-2) x 0.75 = ±0.1875
- Min subnormal number: S.000.01 = ±2^(-2) x 0.25 = ±0.0625
```

Related PRs:
- [PR-94735](llvm#94735) [APFloat] Add APFloat support for FP6 data types
- [PR-97118](llvm#97118) [MLIR] Add f8E4M3 type - was used as a template for this PR
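The listed boundary values can be checked with a small decoder. This is a sketch written from the bit layout stated above (S1E3M2, bias 3, no infinity, no NaN); `decodeF6E3M2FN` is a hypothetical helper and is not the APFloat or MLIR implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Decodes the low 6 bits of Bits as an S1E3M2 value with exponent bias 3.
// Every bit pattern is a finite number: there is no infinity and no NaN.
inline float decodeF6E3M2FN(std::uint8_t Bits) {
  const int Sign = (Bits >> 5) & 0x1;
  const int Exp = (Bits >> 2) & 0x7;
  const int Mant = Bits & 0x3;
  const int Bias = 3;

  float Magnitude;
  if (Exp == 0) {
    // Zero or subnormal: no implicit leading 1, exponent fixed at 1 - Bias.
    Magnitude = std::ldexp(Mant / 4.0f, 1 - Bias);
  } else {
    // Normal: implicit leading 1, unbiased exponent Exp - Bias.
    Magnitude = std::ldexp(1.0f + Mant / 4.0f, Exp - Bias);
  }
  return Sign ? -Magnitude : Magnitude;
}
```

For example, the pattern `0.111.11` (0x1F) decodes to 2^4 x 1.75 = 28, and `0.000.01` (0x01) decodes to 2^(-2) x 0.25 = 0.0625, matching the table.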
Commit 918222b
[MLIR] [NFC] Use APFloat semantics to get floating type width (llvm#1…
…07372) As suggested in the comments of llvm#105573
Commit 083e25c
[LoongArch] Eliminate the redundant sign extension of division (llvm#…
…107971) If all incoming values of `div.d` are sign-extended and all users only use the lower 32 bits, then convert them to W versions. Fixes: llvm#107946
Commit 0f47e3a
[VectorCombine] Add type shrinking and zext propagation for fixed-wid…
…th vector types (llvm#104606) Check that `binop(zext(value), other)` is possible and profitable to transform into: `zext(binop(value, trunc(other)))`. When a CPU architecture has an illegal scalar type iX, but the vector type <N * iX> is legal, scalar expressions before vectorisation may be extended to a legal type iY. This extension could result in underutilization of vector lanes, as more lanes could be used in one instruction with the narrower type. Vectorisers may not always recognize opportunities for type shrinking, and this patch aims to address that limitation.
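The rewrite can be illustrated on scalars. The sketch below uses bitwise AND with an 8-bit value widened to 32 bits, a case where the narrow form always gives the same result because the zero-extended operand has no high bits set. The function names are hypothetical, and the legality and cost conditions the patch actually checks are not reproduced here; for other operations the rewrite may depend on what is known about the high bits of `other`.

```cpp
#include <cassert>
#include <cstdint>

// Wide form: binop(zext(value), other), computed in 32 bits.
inline std::uint32_t andWide(std::uint8_t Value, std::uint32_t Other) {
  return static_cast<std::uint32_t>(Value) & Other;
}

// Narrow form: zext(binop(value, trunc(other))), computed in 8 bits.
inline std::uint32_t andNarrow(std::uint8_t Value, std::uint32_t Other) {
  const std::uint8_t Truncated = static_cast<std::uint8_t>(Other);
  const std::uint8_t Narrow = static_cast<std::uint8_t>(Value & Truncated);
  return static_cast<std::uint32_t>(Narrow);
}

// Compares both forms for every 8-bit value against one 32-bit operand.
inline bool formsAgree(std::uint32_t Other) {
  for (unsigned V = 0; V <= 0xFF; ++V) {
    const std::uint8_t Value = static_cast<std::uint8_t>(V);
    if (andWide(Value, Other) != andNarrow(Value, Other))
      return false;
  }
  return true;
}
```

On a vector target, the narrow form operates on 8-bit lanes, so four times as many elements fit in a register of the same width as with 32-bit lanes.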
Commit bf69484
[llvm][Docs] Update guide to include `pip install lit` (llvm#106526)
Also updates and clarifies which version would be installed. As per https://discourse.llvm.org/t/information-on-lit-is-outdated/76498.
Commit edbe8fa
Commit a99d666
[VPlan] Add VPValue for VF, use it for VPWidenIntOrFpInductionRecipe. (…
…llvm#95305) Similar to VFxUF, also add a VF VPValue to VPlan and use it to get the runtime VF in VPWidenIntOrFpInductionRecipe. Code for VF is only generated if there are users of VF, to avoid unnecessary test changes. PR: llvm#95305
Commit a794ee4
[TOSA] tosa.negate operator lowering update (llvm#107924)
This PR makes the tosa.negate op for integer types use the simplified calculation branch if the input_zp and output_zp values are also zero. Signed-off-by: Dmitriy Smirnov <[email protected]>
Commit 2778d9d
Re-apply "[ORC-RT] Replace FnTag arg of WrapperFunction::call..." wit…
Commit 69f8923
[AArch64] Lower __builtin_bswap16 to rev16 if bswap followed by any_e…
…xtend (llvm#105375) GCC compiles the built-in function `__builtin_bswap16` to the ARM instruction rev16, which reverses the byte order of 16-bit data. On the other hand, Clang compiles the same built-in function to e.g.

```
rev w8, w0
lsr w0, w8, #16
```

i.e. it performs a byte reversal of a 32-bit register (which moves the lower half, which contains the 16-bit data, to the upper half) and then right shifts the reversed 16-bit data back to the lower half of the register. We can improve Clang codegen by generating `rev16` instead of `rev` and `lsr`, like GCC.
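The equivalence of the two sequences on the low 16 bits can be checked with plain integer arithmetic. The sketch below models only the resulting 16-bit value, not the contents of the upper register bits; the function names are hypothetical and the byte swaps are written out by hand rather than using compiler builtins.

```cpp
#include <cassert>
#include <cstdint>

// Reverses the four bytes of a 32-bit value (models `rev w8, w0`).
inline std::uint32_t byteSwap32(std::uint32_t X) {
  return (X >> 24) | ((X >> 8) & 0x0000FF00u) | ((X << 8) & 0x00FF0000u) |
         (X << 24);
}

// Reverses the two bytes of a 16-bit value (models the low half of `rev16`).
inline std::uint16_t byteSwap16(std::uint16_t X) {
  return static_cast<std::uint16_t>((X >> 8) | (X << 8));
}

// Models the two-instruction sequence: 32-bit byte reversal, then a logical
// shift right by 16 to bring the swapped halfword back to the low bits.
inline std::uint16_t swapViaRevLsr(std::uint16_t X) {
  return static_cast<std::uint16_t>(byteSwap32(X) >> 16);
}

// Compares both lowerings for every 16-bit input.
inline bool loweringsAgree() {
  for (std::uint32_t V = 0; V <= 0xFFFFu; ++V) {
    const std::uint16_t X = static_cast<std::uint16_t>(V);
    if (swapViaRevLsr(X) != byteSwap16(X))
      return false;
  }
  return true;
}
```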
Commit 23595d1
[LLVM][AArch64] Refactor sve-b16b16 instruction definitions. (llvm#10…
…7265) Update the predicate protecting bfloat instructions to only reference FEAT_SVE_B16B16, which matches the specification. Rename and move instruction classes to match the names of the encoding groups the bfloat arithmetic instructions belong to.
Commit 516f08b
[Flang][Lower] Introduce SymMapScope helper class (NFC) (llvm#107866)
This patch creates a simple RAII wrapper class for `SymMap` to make it easier to use and prevent a missing matching `popScope()` for a `pushScope()` call on simple use cases. Some push-pop pairs are replaced with instances of the new class by this patch.
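The general shape of such a wrapper is sketched below. `ToySymMap` and `ToySymMapScope` are hypothetical stand-ins written for this illustration; they are not the Flang `SymMap` or `SymMapScope` classes, and only the push/pop pairing is modelled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a symbol map with explicit scope management.
class ToySymMap {
public:
  void pushScope() { ++Depth; }
  void popScope() {
    assert(Depth > 0 && "popScope without matching pushScope");
    --Depth;
  }
  std::size_t depth() const { return Depth; }

private:
  std::size_t Depth = 0;
};

// RAII guard: pushes a scope on construction and pops it on destruction, so
// the pop cannot be forgotten on early returns.
class ToySymMapScope {
public:
  explicit ToySymMapScope(ToySymMap &Map) : Map(Map) { Map.pushScope(); }
  ~ToySymMapScope() { Map.popScope(); }

  // Copying a guard would pop the same scope twice.
  ToySymMapScope(const ToySymMapScope &) = delete;
  ToySymMapScope &operator=(const ToySymMapScope &) = delete;

private:
  ToySymMap &Map;
};

// Returns the scope depth observed inside a guarded region. The guard pops
// the scope when the function returns.
inline std::size_t depthInsideScope(ToySymMap &Map) {
  ToySymMapScope Scope(Map);
  return Map.depth();
}
```

Deleting the copy operations is a design choice of this sketch: it keeps each pushed scope paired with exactly one pop.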
Commit 433ca3e
Commit fffdd9e
[lldb] Recurse through DW_AT_signature when looking for attributes (l…
…lvm#107241) This allows e.g. DWARFDIE::GetName() to return the name of the type when looking at its declaration (which contains only DW_AT_declaration+DW_AT_signature). This is similar to how we recurse through DW_AT_specification when looking for a function name. The LLVM DWARF parser gained the same functionality through llvm#99495. This fixes a bug where we would confuse a type like NS::Outer::Struct with NS::Struct (because NS::Outer (and its name) was in a type unit).
Commit 925b220
[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic (llvm#105822)
This intrinsic is meant to be used in functions that have a "tail" that needs to be run with all the lanes enabled. The "tail" may contain complex control flow that makes it unsuitable for the use of the existing WWM intrinsics. Instead, we will pretend that the function starts with all the lanes enabled, then branches into the actual body of the function for the lanes that were meant to run it, and then finally all the lanes will rejoin and run the tail.

As such, the intrinsic will return the EXEC mask for the body of the function, and is meant to be used only as part of a very limited pattern (for now only in amdgpu_cs_chain functions):

```
entry:
  %func_exec = call i1 @llvm.amdgcn.init.whole.wave()
  br i1 %func_exec, label %func, label %tail

func:
  ; ... stuff that should run with the actual EXEC mask
  br label %tail

tail:
  ; ... stuff that runs with all the lanes enabled;
  ; can contain more than one basic block
```

It's an error to use the result of this intrinsic for anything other than a branch (but unfortunately checking that in the verifier is non-trivial because SIAnnotateControlFlow will introduce an amdgcn.if between the intrinsic and the branch).

The intrinsic is lowered to a SI_INIT_WHOLE_WAVE pseudo, which for now is expanded in si-wqm (which is where SI_INIT_EXEC is handled too); however the information that the function was conceptually started in whole wave mode is stored in the machine function info (hasInitWholeWave). This will be useful in prolog epilog insertion, where we can skip saving the inactive lanes for CSRs (since if the function started with all the lanes active, then there are no inactive lanes to preserve).
Commit 44556e6
[clang][bytecode][NFC] Fix CallBI function signature
This doesn't modify the PC, so pass OpPC as a copy.
Commit 4687017
[lld][AArch64] Fix getImplicitAddend in big-endian mode. (llvm#107845)
In AArch64, the endianness of instruction encodings is always little, whereas the endianness of data swaps between LE and BE modes. So getImplicitAddend must use the right one of read32() and read32le(), for data and code respectively. It was using read32() throughout, causing instructions to be read as big-endian in BE mode, getting the wrong addend. Fixed, and updated the existing test to check both endiannesses. The expected results for data must be byte-swapped, but the ones for code need no adjustment.
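The distinction between the two kinds of read can be sketched as follows. The helper names here are hypothetical and written for this illustration; they are not lld's `read32()` and `read32le()`, and the target endianness is passed as an explicit parameter, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Always little-endian: the right choice for AArch64 instruction words,
// whose encoding does not change with the data endianness.
inline std::uint32_t read32LE(const std::uint8_t *P) {
  return std::uint32_t(P[0]) | (std::uint32_t(P[1]) << 8) |
         (std::uint32_t(P[2]) << 16) | (std::uint32_t(P[3]) << 24);
}

// Always big-endian.
inline std::uint32_t read32BE(const std::uint8_t *P) {
  return std::uint32_t(P[3]) | (std::uint32_t(P[2]) << 8) |
         (std::uint32_t(P[1]) << 16) | (std::uint32_t(P[0]) << 24);
}

// Follows the target's data endianness: the right choice for relocations
// that apply to data rather than to instructions.
inline std::uint32_t read32Target(const std::uint8_t *P, bool TargetIsBigEndian) {
  return TargetIsBigEndian ? read32BE(P) : read32LE(P);
}
```

Reading an instruction word through the target-endian helper on a big-endian target yields a byte-swapped value, so any addend extracted from its bit fields is wrong. That is the class of bug the commit describes.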
Commit daf2085
Commit 6a56f15
[AArch64] Prevent the AArch64LoadStoreOptimizer from reordering CFI i…
…nstructions (llvm#101317) When the AArch64LoadStoreOptimizer pass merges an SP update with a load/store instruction and needs to adjust unwind information, either:

* create the merged instruction at the location of the SP update (so no CFI instructions are moved), or
* only move a CFI instruction if the move would not reorder it across other CFI instructions

If neither of the above is possible, don't perform the optimisation.
Commit b0ffaa7
Commit 306b08c
[flang] Use LLVM dialect ops for stack save/restore in target-rewrite (…
…llvm#107879) Mostly NFC. I was bothered by the declarations that were always made even if unused, and I think using LLVM Ops is nicer anyway with regard to side effects here.

```
func.func private @llvm.stacksave.p0() -> !fir.ref<i8>
func.func private @llvm.stackrestore.p0(!fir.ref<i8>)
```

There are other places in lowering that are using the calls instead of the LLVM intrinsics, but I will deal with them another time (the issue there is mostly to get the proper address space for the llvm.ptr type).
Commit cb30169
[libc++] Include the full set of libc++ transitive includes in the CS…
…V files (llvm#107911) When we introduced the machinery for transitive includes validation, at some point we stopped including the full set of transitive includes in the CSV files and instead only tracked the set of public headers included *directly* by a top-level header. The reason for doing that was so that the CSV files containing "transitive" includes could be used to draw the dependency graph of libc++ headers. However, the downside was that it made the contents of the CSV files much harder to interpret. In particular, many changes that modify the CSV files do not in fact modify the effective set of transitive includes, which is confusing. This patch goes back to storing the full set of transitive includes in the CSV files and removes the ability to graph the libc++ includes directly from those CSV files, which we never actually used.
Commit 930915a
Commit bda9474
Commit 0ccc609
[gn] attempt to port 53a81d4 (win/asan dynamic runtime)
Based on the output of llvm/utils/gn/build/sync_source_lists_from_cmake.py and reading the diff, but not actually tested on Windows.
Commit 4a63d62
Commit 4d55f0b
Reland [MLIR] Make resolveCallable customizable in CallOpInterface (l…
…lvm#107989) Relands llvm#100361 with fixed dependencies.
Commit d1cad22
Commit e610a0e
[NFC][AMDGPU][Driver] Move 'shouldSkipSanitizeOption' utility to AMDG…
…PU. (llvm#107997) HIPAMDToolChain and AMDGPUOpenMPToolChain both depend on the "shouldSkipSanitizeOption" API to decide whether to sanitize device code.
Commit 5dd1c82
Commit f58312e
[flang][AMDGPU] Convert math ops to AMD GPU library calls instead of …
…libm calls (llvm#99517) This patch invokes a pass when compiling for an AMDGPU target to lower math operations to AMD GPU library calls instead of libm calls.
Commit 4290e34
Commit 69828c4
[SPIR-V] Expose an API call to initialize SPIRV target and translate …
…input LLVM IR module to SPIR-V (llvm#107216) The goal of this PR is to facilitate integration of the SPIR-V Backend into third-party tools and libraries by exposing an API call that translates an LLVM module to SPIR-V and writes the result into a string as binary SPIR-V output, providing diagnostics on failure and a means of configuring the translation in the style of command line options. An example use case is the Khronos Translator, which provides bidirectional translation LLVM IR <=> SPIR-V, where the LLVM IR => SPIR-V step may be substituted by a call to the SPIR-V Backend API implemented by this PR.
Commit bca2b6d
[libc++][test] LWG2593: Moved-from state of Allocators (llvm#107344)
The resolution of LWG2593 didn't require the standard library implementation to change. It merely strengthened requirements on user-defined allocator types and allowed the implementation to make stronger assumptions. The status is tentatively set to Nothing To Do. However, `test_allocator` in libc++'s test suite needs to be fixed to conform to the strengthened requirements. Closes llvm#100220.
Commit 46a76c3
[CGData][MachineOutliner] Global Outlining (llvm#90074)
This commit introduces support for outlining functions across modules using codegen data generated from previous codegen. The codegen data currently manages the outlined hash tree, which records outlining instances that occurred locally in the past. The machine outliner now operates in one of three modes:

1. CGDataMode::None: This is the default outliner mode that uses the suffix tree to identify (local) outlining candidates within a module. This mode is also used by (full)LTO to maintain optimal behavior with the combined module.
2. CGDataMode::Write (`-codegen-data-generate`): This mode is identical to the default mode, but it also publishes the stable hash sequences of instructions in the outlined functions into a local outlined hash tree. It then encodes this into the `__llvm_outline` section, which will be dead-stripped at link time.
3. CGDataMode::Read (`-codegen-data-use-path={.cgdata}`): This mode reads a codegen data file (.cgdata) and initializes a global outlined hash tree. This tree is used to generate global outlining candidates. Note that the codegen data file has been post-processed with the raw `__llvm_outline` sections from all native objects using the `llvm-cgdata` tool (or a linker, `LLD`, or a new ThinLTO pipeline later).

This depends on llvm#105398. After this PR, LLD (llvm#90166) and Clang (llvm#90304) will follow for each client side support. This is a patch for https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
Commit 0f52545
Commit 7190368
Commit 2459679
[flang][OpenMP] Implement copyin for pointers and allocatables. (llvm…
…#107425) The copyin clause currently forbids pointer and allocatable variables, which are allowed by the OpenMP 1.1 and 3.0 specifications respectively.
Commit 53b5902
[llvm-exegesis] Refactor getting register number from name to LLVMSta…
…te (llvm#107895) This patch refactors the procedure of getting the register number from a register name to LLVMState rather than having individual users get the values themselves by getting a reference to the map from LLVMState. This is primarily intended to make some downstream usage in Gematria simpler, but also cleans up a little bit upstream by pulling the actual map searching out and just leaving error handling to the clients. The original getter is left to enable downstream migration in Gematria, particularly before it gets imported into google internal.
Commit 5823ac0
Commit dfd7284
Commit 33f1235
Commit 13c14c6
Commit 8530329
[Format] Avoid repeated hash lookups (NFC) (llvm#107962)
Co-authored-by: Owen Pan <[email protected]>
Commit 19a2f17
[Lex] Avoid repeated hash lookups (NFC) (llvm#107963)
MacroAnnotations has three std::optional fields. Functions makeDeprecation, makeRestrictExpansion, and makeFinal construct an instance of MacroAnnotations with one field initialized with a non-default value (that is, some value other than std::nullopt). Functions addMacroDeprecationMsg, addRestrictExpansionMsg, and addFinalLoc either create a new map entry with one field initialized with a non-default value or replace one field of an existing map entry.

We can do all this with a simple statement of the form:

    AnnotationInfos[II].FieldName = NonDefaultValue;

which takes care of default initialization of the fields with std::nullopt when a requested map entry does not exist.
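The idiom can be reproduced with standard containers. The sketch below assumes C++17 and uses `std::unordered_map` in place of the map type used in Clang; `ToyAnnotations`, its field types, and the helper functions are hypothetical stand-ins, not the Lex code.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a record with three optional fields.
struct ToyAnnotations {
  std::optional<std::string> DeprecationMsg;
  std::optional<std::string> RestrictExpansionMsg;
  std::optional<int> FinalLoc;
};

using ToyAnnotationMap = std::unordered_map<std::string, ToyAnnotations>;

// operator[] value-initializes a missing entry, so every optional field
// starts out empty; the assignment then sets exactly one field. If the entry
// already exists, only that one field is overwritten. Either way there is a
// single hash lookup.
inline void addDeprecationMsg(ToyAnnotationMap &Infos, const std::string &Macro,
                              std::string Msg) {
  Infos[Macro].DeprecationMsg = std::move(Msg);
}

inline void addFinalLoc(ToyAnnotationMap &Infos, const std::string &Macro,
                        int Loc) {
  Infos[Macro].FinalLoc = Loc;
}
```

A find-then-insert-or-update version would need separate code paths for the two cases and would typically hash the key more than once.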
Commit 9710085
[mlir] Reuse pack dest in tensor.pack decomposition (llvm#108025)
In the `lowerPack` transform, there is a special case for lowering into a simple `tensor.pad` + `tensor.insert_slice`, but the destination becomes a newly created `tensor.empty`. This PR fixes the transform to reuse the original destination of the `tensor.pack`.
Commit e982d7f
[lldb][test] TestDbgInfoContentVectorFromStdModule.py: skip test on D…
…arwin (llvm#108003) This started failing on the macOS CI after llvm#106885: ``` lldb-api :: commands/expression/import-std-module/vector-dbg-info-content/TestDbgInfoContentVectorFromStdModule.py "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang" -std=c++11 -g -O0 -isysroot "/Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk" -arch arm64 -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/../../../../..//include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/vector-dbg-info-content -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make -include /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/test_common.h -fno-limit-debug-info -nostdlib++ -nostdinc++ -cxx-isystem /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1 --driver-mode=g++ -MT main.o -MD -MP -MF main.d -c -o main.o /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/vector-dbg-info-content/main.cpp "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang" main.o -g -O0 -isysroot "/Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk" -arch arm64 -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/../../../../..//include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/tools/lldb/include -I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/expression/import-std-module/vector-dbg-info-content 
-I/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make -include /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/test_common.h -fno-limit-debug-info -L/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/lib -Wl,-rpath,/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/lib -lc++ --driver-mode=g++ -o "a.out" ld: warning: ignoring duplicate libraries: '-lc++' codesign --entitlements /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/make/entitlements-macos.plist -s - "a.out" "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/./bin/dsymutil" -o "a.out.dSYM" "a.out" runCmd: settings set target.import-std-module true output: runCmd: expr std::reverse(a.begin(), a.end()) Assertion failed: (isa<InjectedClassNameType>(Decl->TypeForDecl)), function getInjectedClassNameType, file ASTContext.cpp, line 5057. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace. Stack dump: 0. HandleCommand(command = "expr std::reverse(a.begin(), a.end())") 1. <eof> parser at end of file 2. /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1/__algorithm/reverse.h:54:1: instantiating function definition 'std::reverse<std::__wrap_iter<Foo *>>' 3. /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1/__algorithm/reverse.h:47:58: instantiating function definition 'std::__reverse<std::_ClassicAlgPolicy, std::__wrap_iter<Foo *>, std::__wrap_iter<Foo *>>' 4. /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/include/c++/v1/__algorithm/reverse.h:40:1: instantiating function definition 'std::__reverse_impl<std::_ClassicAlgPolicy, std::__wrap_iter<Foo *>>' ```
Commit 2bcab9b
[Attributor] Keep track of reached returns in AAPointerInfo (llvm#107479)
Instead of visiting call sites in Attributor::checkForAllUses, we now keep track of returns in AAPointerInfo and use the call site return information as required. This way, the user of AAPointerInfo(CallSite)Argument can determine if the call return should be visited. We do not collect them as "may accesses" in the AAPointerInfo(CallSite)Argument itself in case a return user is found.
Commit 56a0334
[RFC][C++20][Modules] Fix crash when function and lambda inside loaded from different modules (llvm#104512)
Summary: Because AST loading code is lazy and happens in an unpredictable order, a function and a lambda inside that function can be loaded from different modules. In this case, the captured DeclRefExpr won't match the corresponding VarDecl inside the function. In the AST it looks like this:
```
FunctionDecl 0x555564f4aff0 <Conv.h:33:1, line:41:1> line:33:35 imported in ./thrift_cpp2_base.h hidden tryTo 'Expected<Tgt, const char *> ()' inline
|-also in ./folly-conv.h
`-CompoundStmt 0x555564f7cfc8 <col:43, line:41:1>
  |-DeclStmt 0x555564f7ced8 <line:34:3, col:17>
  | `-VarDecl 0x555564f7cef8 <col:3, col:16> col:7 imported in ./thrift_cpp2_base.h hidden referenced result 'Tgt' cinit
  |   `-IntegerLiteral 0x555564f7d080 <col:16> 'int' 0
  |-CallExpr 0x555564f7cea8 <line:39:3, col:76> '<dependent type>'
  | |-UnresolvedLookupExpr 0x555564f7bea0 <col:3, col:19> '<overloaded function type>' lvalue (no ADL) = 'then_' 0x555564f7bef0
  | |-CXXTemporaryObjectExpr 0x555564f7bcb0 <col:25, col:45> 'Expected<bool, int>':'folly::Expected<bool, int>' 'void () noexcept' zeroing
  | `-LambdaExpr 0x555564f7bc88 <col:48, col:75> '(lambda at Conv.h:39:48)'
  |   |-CXXRecordDecl 0x555564f76b88 <col:48> col:48 imported in ./folly-conv.h hidden implicit <undeserialized declarations> class definition
  |   | |-also in ./thrift_cpp2_base.h
  |   | `-DefinitionData lambda empty standard_layout trivially_copyable literal can_const_default_init
  |   |   |-DefaultConstructor defaulted_is_constexpr
  |   |   |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
  |   |   |-MoveConstructor exists simple trivial needs_implicit
  |   |   |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
  |   |   |-MoveAssignment
  |   |   `-Destructor simple irrelevant trivial constexpr needs_implicit
  |   `-CompoundStmt 0x555564f7d1a8 <col:58, col:75>
  |     `-ReturnStmt 0x555564f7d198 <col:60, col:67>
  |       `-DeclRefExpr 0x555564f7d0a0 <col:67> 'Tgt' lvalue Var 0x555564f7d0c8 'result' 'Tgt' refers_to_enclosing_variable_or_capture
  `-ReturnStmt 0x555564f7bc78 <line:40:3, col:11>
    `-InitListExpr 0x555564f7bc38 <col:10, col:11> 'void'
```
This diff changes AST deserialization to load lambdas inside a canonical function declaration earlier, right after the function, to make sure that their canonical decl is loaded from the same module.
Test Plan: check-clang
Commit d778689
Commit bf68403
Fix for Attempt to fix [CGData][MachineOutliner] Global Outlining (llvm#90074) llvm#108037 (llvm#108047)
The previous `attempt to fix [CGData][MachineOutliner] Global Outlining (llvm#90074) llvm#108037` was incomplete because the `ImmutableModuleSummaryIndexWrapperPass` is now optional for the MachineOutliner pass. With this fix, the test file `CodeGen/AArch64/O3-pipeline.ll` shows no changes compared to its state before `[CGData][MachineOutliner] Global Outlining (llvm#90074)`. Co-authored-by: Kyungwoo Lee
Commit ba2aa1d
Fix for llvm/test/CodeGen/RISCV/O3-pipeline.ll (llvm#108050)
The previous `Fix for Attempt to fix [CGData][MachineOutliner] Global Outlining (llvm#90074) llvm#108037 (llvm#108047)` somehow dropped this file.
Commit 2cfdcfb
[RISCV] Separate more of scalar FP in CC_RISCV. NFC (llvm#107908)
The scalar FP calling convention has gotten more complicated with recent changes to Zfinx/Zdinx, the proposed addition of a GPRF16 register class, and the use of customReg for f16/bf16 and other FP types smaller than XLen. The previous code tried to share a single getReg and getMem call for many different cases. This patch separates all the FP register handling to the top of the function with their own getReg calls. The only exception is f64 with XLen==32, when we are out of FPRs or not able to use FPRs due to the ABI. The way I've structured this, we no longer need to correct the LocVT for FP back to ValVT before the call to getMem.
Commit 14b4356
Revert "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (llvm#108054)
Breaks bots, see llvm#105822. Reverts llvm#105822
Commit c7a7767
[LLDB][Data Formatters] Calculate average and total time for summary providers within lldb (llvm#102708)
This PR adds a statistics provider cache, which allows an individual target to keep a rolling tally of its total time and number of invocations for a given summary provider. This information is then available in statistics dump to help identify slow summary providers and to glean more insight into LLDB's time use.
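The rolling tally described above can be sketched roughly as follows. This is a minimal illustration only: `SummaryStats`, `SummaryStatsCache`, `Record`, and `GetStats` are hypothetical names chosen for this sketch and are not LLDB's actual API.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical per-provider record: total time spent and number of calls.
struct SummaryStats {
  std::chrono::duration<double> total_time{0.0};
  std::uint64_t invocations = 0;

  // Average seconds per invocation, or 0 if the provider never ran.
  double AverageSeconds() const {
    return invocations == 0 ? 0.0 : total_time.count() / invocations;
  }
};

// Hypothetical cache keyed by summary provider name.
class SummaryStatsCache {
public:
  // Add one invocation and its elapsed time to the provider's tally.
  void Record(const std::string &provider,
              std::chrono::duration<double> elapsed) {
    SummaryStats &stats = m_stats[provider];
    stats.total_time += elapsed;
    ++stats.invocations;
  }

  const std::map<std::string, SummaryStats> &GetStats() const {
    return m_stats;
  }

private:
  std::map<std::string, SummaryStats> m_stats;
};
```

Storing only the total and the count keeps each update constant-time, and the average can be derived when the statistics are dumped.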
Commit 22144e2
[libc] fix locale dependency for stdlib (llvm#108042)
Address the following issue: ``` ❯ ninja libc.test.src.__support.OSUtil.linux.vdso_test.__unit__ [91/127] Building CXX object libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o FAILED: libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o sccache /usr/bin/clang++ -DLIBC_NAMESPACE=__llvm_libc_20_0_0_git -D_DEBUG -I/home/schrodingerzy/Documents/llvm-project/libc -isystem /home/schrodingerzy/Documents/llvm-project/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g -std=gnu++17 -fpie -DLIBC_FULL_BUILD -ffreestanding -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -MD -MT libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -MF libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o.d -o libc/test/src/__support/OSUtil/linux/CMakeFiles/libc.test.src.__support.OSUtil.linux.vdso_test.__unit__.__build__.dir/vdso_test.cpp.o -c /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/src/__support/OSUtil/linux/vdso_test.cpp:21: In file included from /home/schrodingerzy/Documents/llvm-project/libc/test/UnitTest/ErrnoSetterMatcher.h:13: In file included from /home/schrodingerzy/Documents/llvm-project/libc/src/__support/FPUtil/fpbits_str.h:12: In file included from 
/home/schrodingerzy/Documents/llvm-project/libc/src/__support/CPP/string.h:20: /home/schrodingerzy/Documents/llvm-project/build/libc/include/stdlib.h:13:10: fatal error: 'llvm-libc-types/locale_t.h' file not found 13 | #include "llvm-libc-types/locale_t.h" | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. [123/127] Building CXX object libc/test/UnitTest/CMakeFiles/LibcTest.unit.dir/LibcTestMain.cpp.o ninja: build stopped: subcommand failed. ```
Commit ce9f987
[MemProf] Streamline and avoid unnecessary context id duplication (llvm#107918)
Sort the list of calls such that those with the same stack ids are also sorted by function. This allows processing of all matching calls (that can share a context node) in bulk as they are all adjacent. This has 2 benefits:
1. It reduces unnecessary work, specifically the handling to intersect the context ids with those along the graph edges for the stack ids, for calls that we know can share a node.
2. It simplifies detecting when we have matching stack ids but don't need to duplicate context ids. Specifically, we were previously still duplicating context ids whenever we saw another call with the same stack ids, but that isn't necessary if they will share a context node. With this change we now only duplicate context ids if we see some that not only have the same ids but also are in different functions.
This change reduced the amount of context id duplication and provided reductions in both peak memory (~8%) and time (~5%) for a large target.
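The sort-then-scan idea can be illustrated with a small sketch. It assumes a simplified `CallRecord` type and helper names (`SortCalls`, `CountDuplications`) invented for this example; the real MemProf data structures and ordering criteria are more involved.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, simplified stand-in for a call and its stack ids.
struct CallRecord {
  std::vector<std::uint64_t> stack_ids;
  std::string function;
};

// Sort so that calls with equal stack ids are adjacent and, within such a
// run, calls from the same function are adjacent too.
void SortCalls(std::vector<CallRecord> &calls) {
  std::sort(calls.begin(), calls.end(),
            [](const CallRecord &a, const CallRecord &b) {
              return std::tie(a.stack_ids, a.function) <
                     std::tie(b.stack_ids, b.function);
            });
}

// Count the points where duplication would be needed in this sketch: a
// neighbor has the same stack ids but belongs to a different function.
// Neighbors with the same ids and the same function share a node instead.
std::size_t CountDuplications(const std::vector<CallRecord> &sorted) {
  std::size_t duplications = 0;
  for (std::size_t i = 1; i < sorted.size(); ++i) {
    const bool same_ids = sorted[i].stack_ids == sorted[i - 1].stack_ids;
    const bool same_function = sorted[i].function == sorted[i - 1].function;
    if (same_ids && !same_function)
      ++duplications;
  }
  return duplications;
}
```

Because matching calls are adjacent after sorting, a single linear pass is enough to tell sharing apart from duplication.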
Commit 524a028
[ADT] Require base equality in indexed_accessor_iterator::operator==() (llvm#107856)
Similarly to operator<(), equality-comparing iterators from different ranges must really be forbidden. The preconditions for being able to do `it1 < it2` and `it1 != it2` (or `it1 == it2` for that matter) ought to be the same. Thus, there's little sense in keeping an explicit base object comparison in operator==() whilst having this as a precondition in operator<() and operator-() (e.g. used for std::distance() and such).
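As a rough illustration of the precondition, the following sketch uses a hypothetical `IndexedIter` class (not `llvm::indexed_accessor_iterator` itself) in which `operator==`, `operator<`, and `operator-` all assert that both iterators share the same base and then compare only the index.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical simplified iterator: a base object plus an index into it.
template <typename BaseT> class IndexedIter {
public:
  IndexedIter(BaseT base, std::ptrdiff_t index) : base_(base), index_(index) {}

  // Equality requires the same base as a precondition, not as part of the
  // comparison result.
  bool operator==(const IndexedIter &rhs) const {
    assert(base_ == rhs.base_ && "incompatible iterators");
    return index_ == rhs.index_;
  }
  bool operator!=(const IndexedIter &rhs) const { return !(*this == rhs); }

  // Ordering and distance carry the same precondition.
  bool operator<(const IndexedIter &rhs) const {
    assert(base_ == rhs.base_ && "incompatible iterators");
    return index_ < rhs.index_;
  }
  std::ptrdiff_t operator-(const IndexedIter &rhs) const {
    assert(base_ == rhs.base_ && "incompatible iterators");
    return index_ - rhs.index_;
  }

private:
  BaseT base_;
  std::ptrdiff_t index_;
};
```

With this shape, comparing iterators from different ranges trips the assertion in debug builds instead of silently returning `false`.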
Commit 7fb19cb
[DirectX] Lower `@llvm.dx.typedBufferStore` to DXIL ops
The `@llvm.dx.typedBufferStore` intrinsic is lowered to `@dx.op.bufferStore`. Pull Request: llvm#104253
Commit 90e8411
Commit c8ed2b8
[PowerPC] Fix assert exposed by PR 95931 in LowerBITCAST (llvm#108062)
Hit `Assertion failed: Num < NumOperands && "Invalid child # of SDNode!"`. Fix by checking the opcode and value type before calling getOperand.
Commit 22067a8
Revert "[NVPTX] Support copysign PTX instruction (llvm#107800)" (llvm…
Commit 02c943a
Add DIExpression::foldConstantMath to CoroSplit (llvm#107933)
The CoroSplit pass has its own salvageDebugInfo implementation and its DIExpressions do not get folded. Add a call to DIExpression::foldConstantMath in the CoroSplit pass to reduce the size of those DIExpressions. [The compile time tracker shows no significant increase in compile time either.](https://llvm-compile-time-tracker.com/compare.php?from=bdf02249e7f8f95177ff58c881caf219699acb98&to=e1c1c1759c06bc4c42f79eebdb0e3cd45219cef4&stat=instructions:u) rdar://134675402
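To give a feel for what folding constant math in a debug expression means, here is a toy sketch. It assumes a made-up expression representation (`Op`, `ExprOp`) and a helper `FoldAdjacentAdds` that merges consecutive "add constant" operations into one; it does not use LLVM's DIExpression API and, unlike a production implementation, it ignores overflow.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical operation kinds for a toy expression.
enum class Op { PlusUConst, Deref };

// One operation; `arg` is only meaningful for PlusUConst.
struct ExprOp {
  Op op;
  std::uint64_t arg;
};

// Merge runs of adjacent PlusUConst operations into a single one, so
// "+8, +16" becomes "+24". Other operations are copied unchanged and act
// as barriers between runs. Overflow is not checked in this sketch.
std::vector<ExprOp> FoldAdjacentAdds(const std::vector<ExprOp> &ops) {
  std::vector<ExprOp> folded;
  for (const ExprOp &e : ops) {
    if (e.op == Op::PlusUConst && !folded.empty() &&
        folded.back().op == Op::PlusUConst) {
      folded.back().arg += e.arg;
      continue;
    }
    folded.push_back(e);
  }
  return folded;
}
```

Shorter expressions of this kind are the size reduction the commit message refers to.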
Commit 7a91af4
Commit feeb6aa
[RISCV] Fix fneg.d/fabs.d aliasing handling for Zdinx. Add missing fmv.s/d aliases.
We were missing test coverage for fneg.d/fabs.d for Zdinx. When I added it, it revealed that it only worked on RV64. The assembler was not creating a GPRPair register class on RV32, so the alias couldn't match. The disassembler was also not using GPRPair registers, preventing the aliases from printing in disassembly too. I've fixed the assembler by adding new parsing methods in an attempt to get decent diagnostics. This is hard since the mnemonics are ambiguous between D and Zdinx. Tests have been adjusted for some differences in what errors are reported first.
Commit 5537ae8
[lldb-dap] Improve `stackTrace` and `exceptionInfo` DAP request handlers (llvm#105905)
Refactor `stackTrace` to perform frame lookups on demand to improve overall performance. Additionally, add information to the `exceptionInfo` request to report exception stacks there instead of merging the exception stack into the stack trace. The `exceptionInfo` request is only called if a stop event occurs with `reason='exception'`, which should mitigate the performance cost of `SBThread::GetCurrentException` calls. Add unit tests for exception handling and stack trace support.
Commit 5b4100c
[DirectX] Add DirectXTargetCodeGenInfo (llvm#104856)
Adds a target codegen info class for DirectX. For now it always translates the `__hlsl_resource_t` handle to `target("dx.TypedBuffer", i32, 1, 0, 1)` (`RWBuffer<int>`). More work is needed to determine the actual target ext type and parameters based on the resource handle attributes. Part 1/2 of llvm#95952
Commit becb03f
[Coroutines] Move spill related methods to a Spill utils (llvm#107884)
* Move code related to spilling into SpillUtils to help clean up CoroFrame.
See the RFC for more info: https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057
Commit f4e2d7b
[Coroutines] Split buildCoroutineFrame
* Split buildCoroutineFrame into code related to normalization and code related to actually building the coroutine frame.
* This will enable future specialization of buildCoroutineFrame for different ABIs.
tnowicki committed Sep 10, 2024
Commit 34559ad