[Coroutines] Support Custom ABIs and plugin libraries #7

Closed
129 commits
56d2c62
[SandboxVec][Interval] Add print() and dump()
vporpo Oct 8, 2024
0c0ec04
[gn build] Port 56d2c626f75e
llvmgnsyncbot Oct 8, 2024
e5fae76
[SandboxVectorizer] Add MemSeed bundle types (#111584)
Sterling-Augustine Oct 8, 2024
a8eb12c
[compiler-rt] Reapply freadlink interception for macOs. (#110917)
devnexen Oct 8, 2024
87b491a
[NFC] [MTE] get rid of unnecessary cast (#110336)
fmayer Oct 8, 2024
5f36042
[NFC] [HWASan] [MTE] factor out threadlong increment (#110340)
fmayer Oct 8, 2024
1a19313
[RISC-V][HWASAN] Fix incorrect comments (#103728)
SiFiveHolland Oct 8, 2024
4cab01f
[BOLT] Profile quality stats -- CFG discontinuity (#109683)
ShatianWang Oct 8, 2024
a85eb34
[gn build] Port 4cab01f07262
llvmgnsyncbot Oct 8, 2024
0e86e52
[BOLT][AArch64] Reduce the number of ADR relaxations (#111577)
maksfb Oct 8, 2024
04a8bff
[SandboxVec][DAG] Build actual dependencies (#111094)
vporpo Oct 8, 2024
aabddc9
[MLIR][memref] Fix normalization issue in memref.load (#107771)
DarshanRamakant Oct 9, 2024
c80f484
[lldb][NFC] Fix a build failure with MSVC (#111231)
igorkudrin Oct 9, 2024
ff6faca
[clang] remove extra space in warn_atomic_op_oversized (NFC) (#110955)
Enna1 Oct 9, 2024
1818404
[LiveDebugValues][NVPTX]VarLocBasedImpl handle vregs, enable for NVPT…
willghatch Oct 9, 2024
64a22b3
[NVPTX] fix debug register encoding of special %Depot register (#111596)
willghatch Oct 9, 2024
bb8df02
[RISCV] Use the MCStreamer reference passed to RISCVAsmPrinter::EmitT…
topperc Oct 9, 2024
267e852
[SandboxVec][DAG][NFC] Rename enumerators
vporpo Oct 9, 2024
1e81056
[Coroutines] Avoid repeated hash lookups (NFC) (#111617)
kazutakahirata Oct 9, 2024
2d8cd32
[InstCombine] Avoid repeated hash lookups (NFC) (#111618)
kazutakahirata Oct 9, 2024
0ee5c86
[mlir][spirv] Avoid repeated hash lookups (NFC) (#111619)
kazutakahirata Oct 9, 2024
a579782
[llvm] Add serialization to uint32_t for FixedPointSemantics (#110288)
tbaederr Oct 9, 2024
1809d0f
[clang-format] Insert a space between l_paren and ref-qualifier (#111…
owenca Oct 9, 2024
d0b9c2c
[compiler-rt] Remove SHA2 interceptions for NetBSD/FreeBSD. (#110246)
devnexen Oct 9, 2024
d50302f
clang/AMDGPU: Stop emitting amdgpu-unsafe-fp-atomics attribute (#111579)
arsenm Oct 9, 2024
4336f00
[clang] Track function template instantiation from definition (#110387)
mizvekov Oct 9, 2024
fbd2a91
InferAddressSpaces: Handle llvm.fake.use (#109567)
arsenm Oct 9, 2024
c198f77
AMDGPU: Remove flat/global fmin/fmax intrinsics (#105642)
arsenm Oct 9, 2024
3dba4ca
[ORC][MachO] Remove the ExecutionSession& argument to MachOPlatform c…
lhames Oct 9, 2024
55dd29c
[llvm-profdata] Avoid repeated hash lookups (NFC) (#111629)
kazutakahirata Oct 9, 2024
b26aac5
[sanitizer] Report -> VReport for ThreadLister failure
vitalybuka Oct 9, 2024
a06591b
[libc++][type_traits] P2674R1: A trait for implicit lifetime types (#…
H-G-Hristov Oct 9, 2024
3c1d9b8
[gn build] Port a06591b4d4fb
llvmgnsyncbot Oct 9, 2024
fb2960a
[compiler-rt] [profile] Add missing (void) to prototypes, for C sourc…
mstorsjo Oct 9, 2024
3be6916
Add symbol visibility macros to abi-breaking.h.cmake (#110898)
fsfod Oct 9, 2024
ada6372
Revert "[clang] Finish implementation of P0522 (#96023)"
zmodem Oct 7, 2024
3bf33ec
[GlobalISel] Fold bitcast(undef) to undef. (#111491)
davemgreen Oct 9, 2024
e2dc50c
[docs] Update the libc++ documentation link
philnik777 Oct 9, 2024
275a2b0
[MLIR][Tensor] Perform shape inference via in-place modification (NFC…
joker-eph Oct 9, 2024
fed8695
[clang][bytecode] Emit better diagnostic for invalid shufflevector in…
tbaederr Oct 9, 2024
b9314a8
[mlir][spirv] Update math.powf lowering (#111388)
d-smirnov Oct 9, 2024
ef739e7
[clang] Change "bad" to "unsupported" in register type error (#111550)
DavidSpickett Oct 9, 2024
a4de127
[libclc] Give a helpful error when an unknown target is requested (#1…
DavidSpickett Oct 9, 2024
de4f2c9
[lldb][test] Enable TestDAP_runInTerminal on non-x86 architectures (#…
DavidSpickett Oct 9, 2024
5be1024
[ci] Use check-compiler-rt target for testing compiler-rt (#111515)
DavidSpickett Oct 9, 2024
10008f7
[ci] Don't add a testing target for libclc (#111547)
DavidSpickett Oct 9, 2024
587f31f
[InstCombine] Add a test for converting log to an intrinsic. NFC
davemgreen Oct 9, 2024
e080be5
[NFC][LoopVectorize] Clean up some code around getting a context (#11…
david-arm Oct 9, 2024
67200f5
[ARM] Tidy up stack frame strategy code (NFC) (#110283)
ostannard Sep 25, 2024
2ecf2e2
[ARM] Factor out code to determine spill areas (NFC) (#110283)
ostannard Sep 25, 2024
e817cfd
[ARM] Refactor generation of push/pop instructions (NFC) (#110283)
ostannard Sep 27, 2024
754c1f2
[ARM] Add debug dump for StackAdjustingInsts (NFC) (#110283)
ostannard Sep 27, 2024
baa1fc9
[ARM] Remove always-true checks from Thumb1 frame lowering (NFC) (#11…
ostannard Sep 27, 2024
6004f55
[ADT][APFloat] Make sure EBO is performed on APFloat (#111641)
dtcxzyw Oct 9, 2024
146d3f0
[lldb][test] Disable TestSharedLibStrippedSymbols on Arm
DavidSpickett Oct 9, 2024
b43e003
Revert "[lldb][test] Enable TestDAP_runInTerminal on non-x86 architec…
DavidSpickett Oct 9, 2024
a1bc3e6
[ARMAsmBackend] Add checks for relocation addends in assembler (#109969)
jcohen-apple Oct 9, 2024
f016e10
[bazel] update config.h.cmake
metaflow Oct 9, 2024
1a1de24
[bazel] update abi-breaking.h.cmake for 3be691651a2143f23bcf8f2704e55…
metaflow Oct 9, 2024
cc99bdd
AMDGPU: Avoid using hardcoded address space number
arsenm Oct 9, 2024
b124c04
[Flang][OpenMP] Remove omp.simd reduction block args (#111523)
skatrak Oct 9, 2024
e71ac93
[Flang][OpenMP] Properly reserve space for entry block argument lists…
skatrak Oct 9, 2024
6472cb1
[FuncSpec] Improve estimation of select instruction. (#111176)
labrinea Oct 9, 2024
af933f0
[clang][x86] Missing `AVX512VP2INTERSECT` flag (#111435)
ashvardanian Oct 9, 2024
b2edeb5
[openmp] Add option to disable tsan tests (#111548)
nikic Oct 9, 2024
068d76b
[analyzer] Fix crash when casting the result of a malformed fptr call…
steakhal Oct 9, 2024
1be64e5
[clang][Sema] Add instant event when template instantiation is deferr…
ivanaivanovska Oct 9, 2024
671cbcf
AMDGPU: Add baseline tests for gep flag handling (#110814)
arsenm Oct 9, 2024
ced15cd
DAG: Preserve more flags when expanding gep (#110815)
arsenm Oct 9, 2024
886d98e
[LLVM][AArch64] Enable SVEIntrinsicOpts at all optimisation levels.
paulwalker-arm Oct 8, 2024
1b3fc75
Revert "[LLVM][AArch64] Enable SVEIntrinsicOpts at all optimisation l…
paulwalker-arm Oct 9, 2024
00c1c58
DependencyGraph.cpp - fix MSVC "not all control paths return a value"…
RKSimon Oct 9, 2024
a9f5a44
[X86] Regenerate test checks with vpternlog comments
RKSimon Oct 9, 2024
374fffe
Fix out-of-bounds access to std::unique_ptr<T[]> (#111581)
alexfh Oct 9, 2024
01cbbc5
[VPlan] Request lane 0 for pointer arg in PtrAdd.
fhahn Oct 9, 2024
25c3ecf
[X86] Add isConstantPowerOf2 helper to replace repeated code. NFC.
RKSimon Oct 9, 2024
e17f701
[X86] vselect-pcmp.ll - regenerate test checks with vpternlog comments
RKSimon Oct 9, 2024
4b4078a
[X86] Add test coverage for #110875
RKSimon Oct 9, 2024
8e2ccdc
[MLIR][LLVM] Use ViewLikeOpInterface (#111663)
gysit Oct 9, 2024
3b2bfb4
[mlir] add missing CMake dependency on ShardingInterface generated he…
Zhang-Zecheng Oct 9, 2024
3b7091b
[APFloat] add predicates to fltSemantics for hasZero and hasSignedRep…
Ariel-Burton Oct 9, 2024
890e481
AMDGPU: Regenerate test checks
arsenm Oct 9, 2024
21da4e7
[libc++] Fix broken configuration system-libcxxabi on Apple (#110920)
ldionne Oct 9, 2024
32db6fb
[mlir][vector] Implement speculation for vector.transfer ops (#111533)
Groverkss Oct 9, 2024
5b03efb
[Clang][OpenMP] Add permutation clause (#92030)
Meinersbur Oct 9, 2024
fa3258e
[VPlan] Sink retrieving legacy costs to more specific computeCost imp…
fhahn Oct 9, 2024
c47f3e8
[X86] combineSelect - Fold select(pcmpeq(and(X,Pow2),0),A,B) -> selec…
RKSimon Oct 9, 2024
15dc2d5
[IR] Prevent implicit SymbolTableListTraits template instantiation (#…
vmustya Oct 9, 2024
d25f1a1
Add 64bit atomic check in the is_always_lock_free_pass test. (#111540)
simpal01 Oct 9, 2024
1e357cd
AMDGPU: Use pointer types more consistently (#111651)
arsenm Oct 9, 2024
a9ebdbb
[MLIR] Vector: turn the ExtractStridedSlice rewrite pattern from #111…
bjacob Oct 9, 2024
390943f
[flang] Implement conversion of compatible derived types (#111165)
luporl Oct 9, 2024
6f8e855
[clang][bytecode] Implement __builtin_ia32_addcarryx* (#111671)
tbaederr Oct 9, 2024
7d9f993
[Transform] Avoid repeated hash lookups (NFC) (#111620)
kazutakahirata Oct 9, 2024
48e4d67
[DSE] Simplify code with MapVector::operator[] (NFC) (#111621)
kazutakahirata Oct 9, 2024
bda4fc0
[NVPTX] Avoid repeated map lookups (NFC) (#111627)
kazutakahirata Oct 9, 2024
1ad5f31
[Clang] Avoid a crash when parsing an invalid pseudo-destructor (#111…
cor3ntin Oct 9, 2024
c911b0a
[clang-tidy] Avoid repeated hash lookups (NFC) (#111628)
kazutakahirata Oct 9, 2024
01a0e85
[Conversion] Avoid repeated hash lookups (NFC) (#111637)
kazutakahirata Oct 9, 2024
f59b151
[bazel] port 8e2ccdc4deedd463a20237b4d842b4c51f9fe603
metaflow Oct 9, 2024
e85fcb7
AMDGPU: Add instruction flags when lowering ctor/dtor (#111652)
arsenm Oct 9, 2024
6654578
[LLVM][AArch64] Enable SVEIntrinsicOpts at all optimisation levels.
paulwalker-arm Oct 8, 2024
c4d288d
[flang][OpenMP] Don't check unlabelled `cycle` branching for target l…
ergawy Oct 9, 2024
1731bb7
llvm-reduce: Fix not checking shouldKeep in special-globals reduction…
arsenm Oct 9, 2024
e637a5c
[clang][bytecode] Only allow lossless ptr-to-int casts (#111669)
tbaederr Oct 9, 2024
72a957b
[Cuda] Handle -fcuda-short-ptr even with -nocudalib (#111682)
frasercrmck Oct 9, 2024
c136d32
[VectorCombine] Do not try to operate on OperandBundles. (#111635)
davemgreen Oct 9, 2024
d905a3c
[NFC] Format MachineVerifier.cpp to remove extra indentation (#111602)
ellishg Oct 9, 2024
774893d
[mlir][ROCDL] Plumb through AMDGPU memory access metadata (#110916)
krzysz00 Oct 9, 2024
18952bd
[gn build] Fix up win/x86 flags and add stage2_unix_x86 (#111595)
aeubanks Oct 9, 2024
2e47b93
[ARM] Honour -mno-movt in stack protector handling (#109022)
ardbiesheuvel Oct 9, 2024
cf5bbeb
[gn build] Remove unix x86 stage2 toolchain
aeubanks Oct 9, 2024
1553cb5
[Sema] Support negation/parens with __builtin_available (#111439)
gburgessiv Oct 9, 2024
17bc959
[AMDGPU] Optionally Use GCNRPTrackers during scheduling (#93090)
jrbyrnes Oct 9, 2024
ec450b1
[mlir][xegpu] Allow out-of-bounds writes (#110811)
adam-smnk Oct 9, 2024
18d655f
[SimplifyCFG][NFC] Improve compile time for TryToSimplifyUncondBranch…
aemerson Oct 9, 2024
13cd43a
[Clang][OpenMP] Do not use feature option during packaging (#111702)
saiislam Oct 9, 2024
3a08551
[AMDGPU] Fix expensive check
jrbyrnes Oct 9, 2024
4e33afe
[libc][math] Implement `issignaling` and `iscanonical` macro. (#111403)
Sh0g0-1758 Oct 9, 2024
ee0e17a
[SandboxVec][DAG] Drop RAR and fix dependency scanning loop (#111715)
vporpo Oct 9, 2024
10ada4a
[SandboxVectorizer] Use sbvec-passes flag to create a pipeline of Reg…
slackito Oct 9, 2024
a075e78
AMDGPU: Fix incorrectly selecting fp8/bf8 conversion intrinsics (#107…
arsenm Oct 9, 2024
dc09f96
[test] remove profile file at the start of profile/instrprof-write-fi…
Oct 9, 2024
102c384
Revert "[SandboxVectorizer] Use sbvec-passes flag to create a pipelin…
slackito Oct 9, 2024
208584d
[clang][bytecode] Fix source range of uncalled base dtor (#111683)
tbaederr Oct 9, 2024
1bb52e9
[CIR] Build out AST consumer patterns to reach the entry point into C…
lanza Oct 9, 2024
1cfe5b8
[lldb] Use SEND_ERROR instead of FATAL_ERROR in test/CMakeLists.txt (…
JDevlieghere Oct 9, 2024
e82fcda
[Coroutines] Move util headers to include/llvm (#111599)
TylerNowicki Oct 9, 2024
ab72637
[Coroutines] Support Custom ABIs and plugin libraries
Sep 3, 2024
4 changes: 2 additions & 2 deletions .ci/generate-buildkite-pipeline-premerge
@@ -198,7 +198,7 @@ function check-targets() {
echo "check-clang-tools"
;;
compiler-rt)
- echo "check-all"
+ echo "check-compiler-rt"
;;
cross-project-tests)
echo "check-cross-project"
@@ -219,7 +219,7 @@ function check-targets() {
echo "check-all"
;;
libclc)
- echo "check-all"
+ # Currently there is no testing for libclc.
;;
*)
echo "check-${project}"
61 changes: 61 additions & 0 deletions bolt/include/bolt/Passes/ContinuityStats.h
@@ -0,0 +1,61 @@
//===- bolt/Passes/ContinuityStats.h ----------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This pass checks how well the BOLT input profile satisfies the following
// "CFG continuity" property of a perfect profile:
//
// Each positive-execution-count block in the function’s CFG
// should be *reachable* from a positive-execution-count function
// entry block through a positive-execution-count path.
//
// More specifically, for each of the hottest 1000 functions, the pass
// calculates the function’s fraction of basic block execution counts
// that is *unreachable*. It then reports the 95th percentile of the
// distribution of the 1000 unreachable fractions in a single BOLT-INFO line.
// The smaller the reported value is, the better the BOLT profile
// satisfies the CFG continuity property.

// The default value of 1000 above can be changed via the hidden BOLT option
// `-num-functions-for-continuity-check=[N]`.
// If more detailed stats are needed, `-v=1` can be used: the hottest N
// functions will be grouped into 5 equally-sized buckets, from the hottest
// to the coldest; for each bucket, various summary statistics of the
// distribution of the unreachable fractions and the raw unreachable execution
// counts will be reported.
//
//===----------------------------------------------------------------------===//

#ifndef BOLT_PASSES_CONTINUITYSTATS_H
#define BOLT_PASSES_CONTINUITYSTATS_H

#include "bolt/Passes/BinaryPasses.h"
#include <vector>

namespace llvm {

class raw_ostream;

namespace bolt {
class BinaryContext;

/// Compute and report to the user the function CFG continuity quality
class PrintContinuityStats : public BinaryFunctionPass {
public:
explicit PrintContinuityStats(const cl::opt<bool> &PrintPass)
: BinaryFunctionPass(PrintPass) {}

bool shouldOptimize(const BinaryFunction &BF) const override;
const char *getName() const override { return "continuity-stats"; }
bool shouldPrint(const BinaryFunction &) const override { return false; }
Error runOnFunctions(BinaryContext &BC) override;
};

} // namespace bolt
} // namespace llvm

#endif // BOLT_PASSES_CONTINUITYSTATS_H
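The "CFG continuity" property described in the header comment can be illustrated outside of BOLT. The following is a minimal sketch, not the pass's implementation: `Edge`, `Block`, and `unreachableFraction` are hypothetical names introduced here, and returning 0 for a function with no counts at all is an assumption of this sketch rather than something the patch specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not BOLT types.
struct Edge {
  std::size_t Target; // index of the successor block
  uint64_t Count;     // profiled count of the edge
};

struct Block {
  uint64_t Count;          // profiled execution count of the block
  bool IsEntry;            // true if the block is a function entry point
  std::vector<Edge> Succs; // outgoing CFG edges
};

// Returns the fraction of the total block execution count held by blocks
// that cannot be reached from a positive-count entry block through
// positive-count edges.
double unreachableFraction(const std::vector<Block> &Blocks) {
  uint64_t Total = 0;
  for (const Block &B : Blocks)
    Total += B.Count;
  if (Total == 0)
    return 0.0; // assumption of this sketch: no counts means nothing to flag

  // Seed the breadth-first search with every positive-count entry block.
  std::vector<bool> Visited(Blocks.size(), false);
  std::queue<std::size_t> Worklist;
  for (std::size_t I = 0; I < Blocks.size(); ++I) {
    if (Blocks[I].IsEntry && Blocks[I].Count > 0) {
      Visited[I] = true;
      Worklist.push(I);
    }
  }

  uint64_t Reachable = 0;
  while (!Worklist.empty()) {
    const Block &B = Blocks[Worklist.front()];
    Worklist.pop();
    Reachable += B.Count;
    for (const Edge &E : B.Succs) {
      assert(E.Target < Blocks.size() && "edge points outside the function");
      if (E.Count == 0 || Visited[E.Target])
        continue; // skip cold edges and blocks that are already queued
      Visited[E.Target] = true;
      Worklist.push(E.Target);
    }
  }
  return static_cast<double>(Total - Reachable) / static_cast<double>(Total);
}
```

For a three-block function whose entry (count 100) has a hot edge to a block with count 100 and a zero-count edge to a block with count 50, the sketch returns 0.2: the 50 executions of the third block cannot be explained by any hot path from the entry.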
7 changes: 4 additions & 3 deletions bolt/lib/Passes/ADRRelaxationPass.cpp
@@ -56,13 +56,14 @@ void ADRRelaxationPass::runOnFunction(BinaryFunction &BF) {
continue;
}

- // Don't relax adr if it points to the same function and it is not split
- // and BF initial size is < 1MB.
+ // Don't relax ADR if it points to the same function and is in the main
+ // fragment and BF initial size is < 1MB.
const unsigned OneMB = 0x100000;
if (BF.getSize() < OneMB) {
BinaryFunction *TargetBF = BC.getFunctionForSymbol(Symbol);
- if (TargetBF == &BF && !BF.isSplit())
+ if (TargetBF == &BF && !BB.isSplit())
continue;

// No relaxation needed if ADR references a basic block in the same
// fragment.
if (BinaryBasicBlock *TargetBB = BF.getBasicBlockForLabel(Symbol))
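The 1MB threshold in this hunk matches the reach of the AArch64 `ADR` instruction, which encodes a signed 21-bit byte offset relative to the instruction address. A minimal sketch of that range test follows; `fitsInADR` is a hypothetical helper written for illustration and is not a BOLT API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; not a BOLT API.
// AArch64 ADR encodes a signed 21-bit byte offset relative to the
// instruction address, giving a reach of [-2^20, 2^20 - 1] bytes.
bool fitsInADR(uint64_t InstrAddress, uint64_t TargetAddress) {
  // Unsigned subtraction wraps, and the cast reinterprets it as a signed delta.
  const int64_t Offset = static_cast<int64_t>(TargetAddress - InstrAddress);
  const int64_t MinOffset = -(int64_t(1) << 20);    // -1MB
  const int64_t MaxOffset = (int64_t(1) << 20) - 1; // 1MB - 1 byte
  return Offset >= MinOffset && Offset <= MaxOffset;
}
```

When an `ADR` and its target sit in the same fragment of a function smaller than 1MB, every possible offset passes this test, which is presumably why the pass can skip relaxation in that case.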
1 change: 1 addition & 0 deletions bolt/lib/Passes/CMakeLists.txt
@@ -26,6 +26,7 @@ add_llvm_library(LLVMBOLTPasses
PatchEntries.cpp
PettisAndHansen.cpp
PLTCall.cpp
ContinuityStats.cpp
RegAnalysis.cpp
RegReAssign.cpp
ReorderAlgorithm.cpp
250 changes: 250 additions & 0 deletions bolt/lib/Passes/ContinuityStats.cpp
@@ -0,0 +1,250 @@
//===- bolt/Passes/ContinuityStats.cpp --------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file implements the continuity stats calculation pass.
//
//===----------------------------------------------------------------------===//

#include "bolt/Passes/ContinuityStats.h"
#include "bolt/Core/BinaryBasicBlock.h"
#include "bolt/Core/BinaryFunction.h"
#include "bolt/Utils/CommandLineOpts.h"
#include "llvm/Support/CommandLine.h"
#include <queue>
#include <unordered_map>
#include <unordered_set>

#define DEBUG_TYPE "bolt-opts"

using namespace llvm;
using namespace bolt;

namespace opts {
extern cl::opt<unsigned> Verbosity;
cl::opt<unsigned> NumFunctionsForContinuityCheck(
"num-functions-for-continuity-check",
cl::desc("number of hottest functions to print aggregated "
"CFG discontinuity stats of."),
cl::init(1000), cl::ZeroOrMore, cl::Hidden, cl::cat(BoltOptCategory));
} // namespace opts

namespace {
using FunctionListType = std::vector<const BinaryFunction *>;
using function_iterator = FunctionListType::iterator;

template <typename T>
void printDistribution(raw_ostream &OS, std::vector<T> &values,
bool Fraction = false) {
if (values.empty())
return;
// Sort values in ascending order, then print (from largest to smallest) the
// MAX, TOP 1%, 5%, 10%, 20%, 50%, 80%, and MIN. If Fraction is true, values
// are printed as percentages instead of integers.
std::sort(values.begin(), values.end());

auto printLine = [&](std::string Text, double Percent) {
int Rank = int(values.size() * (1.0 - Percent / 100));
if (Percent == 0)
Rank = values.size() - 1;
if (Fraction)
OS << " " << Text << std::string(9 - Text.length(), ' ') << ": "
<< format("%.2lf%%", values[Rank] * 100) << "\n";
else
OS << " " << Text << std::string(9 - Text.length(), ' ') << ": "
<< values[Rank] << "\n";
};

printLine("MAX", 0);
const int percentages[] = {1, 5, 10, 20, 50, 80};
for (size_t i = 0; i < sizeof(percentages) / sizeof(percentages[0]); ++i) {
printLine("TOP " + std::to_string(percentages[i]) + "%", percentages[i]);
}
printLine("MIN", 100);
}

void printCFGContinuityStats(raw_ostream &OS,
iterator_range<function_iterator> &Functions) {
// Given a perfect profile, every positive-execution-count BB should be
// connected to an entry of the function through a positive-execution-count
// directed path in the control flow graph.
std::vector<size_t> NumUnreachables;
std::vector<size_t> SumECUnreachables;
std::vector<double> FractionECUnreachables;

for (auto it = Functions.begin(); it != Functions.end(); ++it) {
const BinaryFunction *Function = *it;
if (Function->size() <= 1)
continue;

// Compute the sum of all BB execution counts (ECs).
size_t NumPosECBBs = 0;
size_t SumAllBBEC = 0;
for (const BinaryBasicBlock &BB : *Function) {
const size_t BBEC = BB.getKnownExecutionCount();
NumPosECBBs += BBEC > 0 ? 1 : 0;
SumAllBBEC += BBEC;
}

// Perform BFS on subgraph of CFG induced by positive weight edges.
// Compute the number of BBs reachable from the entry(s) of the function and
// the sum of their execution counts (ECs).
std::unordered_map<unsigned, const BinaryBasicBlock *> IndexToBB;
std::unordered_set<unsigned> Visited;
std::queue<unsigned> Queue;
for (const BinaryBasicBlock &BB : *Function) {
// Make sure BB.getIndex() is not already in IndexToBB.
assert(IndexToBB.find(BB.getIndex()) == IndexToBB.end());
IndexToBB[BB.getIndex()] = &BB;
if (BB.isEntryPoint() && BB.getKnownExecutionCount() > 0) {
Queue.push(BB.getIndex());
Visited.insert(BB.getIndex());
}
}
while (!Queue.empty()) {
const unsigned BBIndex = Queue.front();
const BinaryBasicBlock *BB = IndexToBB[BBIndex];
Queue.pop();
auto SuccBIIter = BB->branch_info_begin();
for (const BinaryBasicBlock *Succ : BB->successors()) {
const uint64_t Count = SuccBIIter->Count;
if (Count == BinaryBasicBlock::COUNT_NO_PROFILE || Count == 0) {
++SuccBIIter;
continue;
}
if (!Visited.insert(Succ->getIndex()).second) {
++SuccBIIter;
continue;
}
Queue.push(Succ->getIndex());
++SuccBIIter;
}
}

const size_t NumReachableBBs = Visited.size();

// Loop through Visited, and sum the corresponding BBs' execution counts
// (ECs).
size_t SumReachableBBEC = 0;
for (const unsigned BBIndex : Visited) {
const BinaryBasicBlock *BB = IndexToBB[BBIndex];
SumReachableBBEC += BB->getKnownExecutionCount();
}

const size_t NumPosECBBsUnreachableFromEntry =
NumPosECBBs - NumReachableBBs;
const size_t SumUnreachableBBEC = SumAllBBEC - SumReachableBBEC;
const double FractionECUnreachable =
(double)SumUnreachableBBEC / SumAllBBEC;

if (opts::Verbosity >= 2 && FractionECUnreachable >= 0.05) {
OS << "Non-trivial CFG discontinuity observed in function "
<< Function->getPrintName() << "\n";
LLVM_DEBUG(Function->dump());
}

NumUnreachables.push_back(NumPosECBBsUnreachableFromEntry);
SumECUnreachables.push_back(SumUnreachableBBEC);
FractionECUnreachables.push_back(FractionECUnreachable);
}

if (FractionECUnreachables.empty())
return;

std::sort(FractionECUnreachables.begin(), FractionECUnreachables.end());
const int Rank = int(FractionECUnreachables.size() * 0.95);
OS << format("top 5%% function CFG discontinuity is %.2lf%%\n",
FractionECUnreachables[Rank] * 100);

if (opts::Verbosity >= 1) {
OS << "abbreviations: EC = execution count, POS BBs = positive EC BBs\n"
<< "distribution of NUM(unreachable POS BBs) among all focal "
"functions\n";
printDistribution(OS, NumUnreachables);

OS << "distribution of SUM_EC(unreachable POS BBs) among all focal "
"functions\n";
printDistribution(OS, SumECUnreachables);

OS << "distribution of [(SUM_EC(unreachable POS BBs) / SUM_EC(all "
"POS BBs))] among all focal functions\n";
printDistribution(OS, FractionECUnreachables, /*Fraction=*/true);
}
}

void printAll(BinaryContext &BC, FunctionListType &ValidFunctions,
size_t NumTopFunctions) {
// Sort the list of functions by execution counts (reverse).
llvm::sort(ValidFunctions,
[&](const BinaryFunction *A, const BinaryFunction *B) {
return A->getKnownExecutionCount() > B->getKnownExecutionCount();
});

const size_t RealNumTopFunctions =
std::min(NumTopFunctions, ValidFunctions.size());

iterator_range<function_iterator> Functions(
ValidFunctions.begin(), ValidFunctions.begin() + RealNumTopFunctions);

BC.outs() << format("BOLT-INFO: among the hottest %zu functions ",
RealNumTopFunctions);
printCFGContinuityStats(BC.outs(), Functions);

// Print more detailed bucketed stats if requested.
if (opts::Verbosity >= 1 && RealNumTopFunctions >= 5) {
const size_t PerBucketSize = RealNumTopFunctions / 5;
BC.outs() << format(
"Detailed stats for 5 buckets, each with %zu functions:\n",
PerBucketSize);

// For each bucket, print the CFG continuity stats of the functions in the
// bucket.
for (size_t BucketIndex = 0; BucketIndex < 5; ++BucketIndex) {
const size_t StartIndex = BucketIndex * PerBucketSize;
const size_t EndIndex = StartIndex + PerBucketSize;
iterator_range<function_iterator> Functions(
ValidFunctions.begin() + StartIndex,
ValidFunctions.begin() + EndIndex);
const size_t MaxFunctionExecutionCount =
ValidFunctions[StartIndex]->getKnownExecutionCount();
const size_t MinFunctionExecutionCount =
ValidFunctions[EndIndex - 1]->getKnownExecutionCount();
BC.outs() << format("----------------\n| Bucket %zu: "
"|\n----------------\n",
BucketIndex + 1)
<< format(
"execution counts of the %zu functions in the bucket: "
"%zu-%zu\n",
EndIndex - StartIndex, MinFunctionExecutionCount,
MaxFunctionExecutionCount);
printCFGContinuityStats(BC.outs(), Functions);
}
}
}
} // namespace

bool PrintContinuityStats::shouldOptimize(const BinaryFunction &BF) const {
if (BF.empty() || !BF.hasValidProfile())
return false;

return BinaryFunctionPass::shouldOptimize(BF);
}

Error PrintContinuityStats::runOnFunctions(BinaryContext &BC) {
// Create a list of functions with valid profiles.
FunctionListType ValidFunctions;
for (const auto &BFI : BC.getBinaryFunctions()) {
const BinaryFunction *Function = &BFI.second;
if (PrintContinuityStats::shouldOptimize(*Function))
ValidFunctions.push_back(Function);
}
if (ValidFunctions.empty() || opts::NumFunctionsForContinuityCheck == 0)
return Error::success();

printAll(BC, ValidFunctions, opts::NumFunctionsForContinuityCheck);
return Error::success();
}
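The rank arithmetic shared by `printDistribution` and the 95th-percentile summary line can be isolated as follows. This is a minimal sketch under stated assumptions: `topPercentValue` is a hypothetical helper name that does not appear in the patch, it takes `double` values only, and it clamps the rank explicitly instead of special-casing `Percent == 0`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of the patch.
// Returns the value at the boundary of the top `Percent` percent of `Values`:
// Percent == 0 yields the maximum and Percent == 100 yields the minimum.
double topPercentValue(std::vector<double> Values, double Percent) {
  assert(!Values.empty() && "need at least one value");
  assert(Percent >= 0.0 && Percent <= 100.0 && "percent out of range");
  std::sort(Values.begin(), Values.end()); // ascending: the maximum is last

  // Index of the boundary element in the ascending order.
  std::size_t Rank =
      static_cast<std::size_t>(Values.size() * (1.0 - Percent / 100.0));
  // Percent == 0 would index one past the end, so clamp to the last element.
  Rank = std::min(Rank, Values.size() - 1);
  return Values[Rank];
}
```

With the eight values 1 through 8, `topPercentValue(V, 25)` returns 7 and `topPercentValue(V, 0)` returns 8. Because the sort is ascending, a larger `Percent` moves the rank toward the front of the vector, that is, toward smaller values.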
3 changes: 3 additions & 0 deletions bolt/lib/Rewrite/BinaryPassManager.cpp
@@ -12,6 +12,7 @@
#include "bolt/Passes/AllocCombiner.h"
#include "bolt/Passes/AsmDump.h"
#include "bolt/Passes/CMOVConversion.h"
#include "bolt/Passes/ContinuityStats.h"
#include "bolt/Passes/FixRISCVCallsPass.h"
#include "bolt/Passes/FixRelaxationPass.h"
#include "bolt/Passes/FrameOptimizer.h"
@@ -373,6 +374,8 @@ Error BinaryFunctionPassManager::runAllPasses(BinaryContext &BC) {
if (opts::PrintProfileStats)
Manager.registerPass(std::make_unique<PrintProfileStats>(NeverPrint));

Manager.registerPass(std::make_unique<PrintContinuityStats>(NeverPrint));

Manager.registerPass(std::make_unique<ValidateInternalCalls>(NeverPrint));

Manager.registerPass(std::make_unique<ValidateMemRefs>(NeverPrint));