Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix debug info size statistics for split dwarf #77671

Closed
wants to merge 2,520 commits into from

Conversation

jeffreytan81
Copy link
Contributor

statistics dump command relies on SymbolFile::GetDebugInfoSize() to get total debug info size.
The current implementation is missing debug info for split dwarf scenarios which requires getting debug info from separate dwo/dwp files.
This patch fixes this issue for split dwarf by parsing debug info from dwp/dwo.

@llvmbot
Copy link

llvmbot commented Jan 10, 2024

@llvm/pr-subscribers-lldb

Author: None (jeffreytan81)

Changes

statistics dump command relies on SymbolFile::GetDebugInfoSize() to get total debug info size.
The current implementation is missing debug info for split dwarf scenarios which requires getting debug info from separate dwo/dwp files.
This patch fixes this issue for split dwarf by parsing debug info from dwp/dwo.


Full diff: https://github.com/llvm/llvm-project/pull/77671.diff

4 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+23)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h (+2)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp (+11)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h (+2)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index 1a16b70f42fe1f..61e8dd5e101c88 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -2669,6 +2669,29 @@ static bool UpdateCompilerContextForSimpleTemplateNames(TypeQuery &match) {
   }
   return any_context_updated;
 }
+
+uint64_t SymbolFileDWARF::GetDebugInfoSize() {
+  DWARFDebugInfo &info = DebugInfo();
+  uint32_t num_comp_units = info.GetNumUnits();
+
+  uint64_t debug_info_size = SymbolFileCommon::GetDebugInfoSize();
+  // In dwp scenario, debug info == skeleton debug info + dwp debug info.
+  if (std::shared_ptr<SymbolFileDWARFDwo> dwp_sp = GetDwpSymbolFile())
+    return debug_info_size + dwp_sp->GetDebugInfoSize();
+
+  // In dwo scenario, debug info == skeleton debug info + all dwo debug info.
+  for (uint32_t i = 0; i < num_comp_units; i++) {
+    DWARFUnit *cu = info.GetUnitAtIndex(i);
+    if (cu == nullptr)
+      continue;
+
+    SymbolFileDWARFDwo *dwo = cu->GetDwoSymbolFile();
+    if (dwo)
+      debug_info_size += dwo->GetDebugInfoSize();
+  }
+  return debug_info_size;
+}
+
 void SymbolFileDWARF::FindTypes(const TypeQuery &query, TypeResults &results) {
 
   // Make sure we haven't already searched this SymbolFile before.
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
index 26a9502f90aa00..6d87530acf833e 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
@@ -186,6 +186,8 @@ class SymbolFileDWARF : public SymbolFileCommon {
   GetMangledNamesForFunction(const std::string &scope_qualified_name,
                              std::vector<ConstString> &mangled_names) override;
 
+  uint64_t GetDebugInfoSize() override;
+
   void FindTypes(const lldb_private::TypeQuery &match,
                  lldb_private::TypeResults &results) override;
 
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp
index ca698a84a9146d..b52cb514fb1907 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp
@@ -85,6 +85,17 @@ lldb::offset_t SymbolFileDWARFDwo::GetVendorDWARFOpcodeSize(
   return GetBaseSymbolFile().GetVendorDWARFOpcodeSize(data, data_offset, op);
 }
 
+uint64_t SymbolFileDWARFDwo::GetDebugInfoSize() {
+  // Directly get debug info from current dwo object file's section list
+  // instead of asking SymbolFileCommon::GetDebugInfo() which parses from
+  // owning module which is wrong.
+  SectionList *section_list =
+      m_objfile_sp->GetSectionList(/*update_module_section_list=*/false);
+  if (section_list)
+    return section_list->GetDebugInfoSize();
+  return 0;
+}
+
 bool SymbolFileDWARFDwo::ParseVendorDWARFOpcode(
     uint8_t op, const lldb_private::DataExtractor &opcodes,
     lldb::offset_t &offset, std::vector<lldb_private::Value> &stack) const {
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h
index 9f5950e51b0c18..5c4b36328cbac1 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h
@@ -47,6 +47,8 @@ class SymbolFileDWARFDwo : public SymbolFileDWARF {
                                           const lldb::offset_t data_offset,
                                           const uint8_t op) const override;
 
+  uint64_t GetDebugInfoSize() override;
+
   bool ParseVendorDWARFOpcode(uint8_t op, const DataExtractor &opcodes,
                               lldb::offset_t &offset,
                               std::vector<Value> &stack) const override;

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, need to add tests for:

  • split dwarf with .dwo files
  • split dwarf with .dwp file

@jeffreytan81
Copy link
Contributor Author

Looks good, need to add tests for:

  • split dwarf with .dwo files
  • split dwarf with .dwp file

@clayborg, do you have suggestion regarding how to test GetDebugInfoSize() for dwo/dwp?

@clayborg
Copy link
Collaborator

Looks good, need to add tests for:

  • split dwarf with .dwo files
  • split dwarf with .dwp file

@clayborg, do you have suggestion regarding how to test GetDebugInfoSize() for dwo/dwp?

yes, for either we just add up the size of any debug info sections for the total and then make sure the right totals show up for the .dwo case or .dwp case. Both .dwo and .dwp look very similar, where the .dwp file just has a few more sections (the CU/TU index). And it would be good to verify we count all of the right sections.

H-G-Hristov and others added 20 commits January 30, 2024 09:25
…vm#79963)

I got tripped twice after: 7b46225
Let's at least mention these in the `Contributing.rst` doc.
The 'gang' clause takes a 'gang-arg-list', which is one of three 'tag'
values, followed by either an 'int-expr' or a 'size-expr', both of which
we already have parsing functions for.

The optional tag values are only slightly complicated, as one is a
keyword (static), so mild modifications needed to be made for that.
'arm64ecpe' was chosen arbitrarily as gcc MinGW doesn't provide EC
support.
There are assumptions of matching/consistent paths of execution under SPMD that allow to have pure collective communication operations.
Summary:
This test was added to test the generated header. Unfortunately this
doesn't work if the header is never generated. Add a check to make sure
the user has included it in the list of headers.
…rm, bang operators and numeric literals. (llvm#78996)

Adds the support for tokens that have forms like unary operators.
- bang operators:  `!name`
- cond operator: `!cond`
- numeric literals: `+1`, `-1`
cond operator are one of bang operators but is distinguished because it has very specific syntax.
)

There is currently no lowering out of `ml_program` in the LLVM
repository. This change adds a lowering to `memref` so that it can be
lowered all the way to LLVM. This lowering was taken from the [reference
backend in
torch-mlir](llvm/torch-mlir@f416953
).

I had tried implementing the `BufferizableOpInterface` for `ml_program`
instead of adding a new pass but that did not work because
`OneShotBufferize` does not visit module-level ops like
`ml_program.global`.
This folded casts into `memref.transpose` without updating the result
type of the transpose op, which resulted in IR that failed to verify for
statically sized memrefs.

i.e.

```mlir
%cast = memref.cast %0 : memref<?x4xf32> to memref<?x?xf32>
%transpose = memref.transpose %cast : memref<?x?xf32> to memref<?x?xf32>
```

would fold to:

```mlir
// Fails verification:
%transpose = memref.transpose %cast : memref<?x4xf32> to memref<?x?xf32>
```
This function is used in `jitlink-check` lines in LIT tests. In llvm#78371 I
missed to swap initial instruction bytes for systems that store the
constants as big-endian.
Currently cast from FP to int is implemented by clamping on the min and
max
integer values in the floating-point domain and then converting to
integer. However, the max int values are often non representable in the
floating-point input type due to lack of mantissa bits.

This patch instead use a select acting on a compare against max int + 1
which is representable in floating-point. It also has a special lowering
for cases where the integer range is wider than the floating-point range
to clamp the infinite values.
After llvm#75103, `MLPrgramTransforms` depends on `BufferizationDialect`.
Also fix an unrelated compile error in `GreedyPatternRewriteDriver.cpp`.
(This was not failing on CI. I may be running an old compiler locally.)
To build libc docs:
- Configure with `-DLLVM_ENABLE_SPHINX=ON -DLIBC_INCLUDE_DOCS=ON`
- Build with `ninja docs-libc-html`
…ter (llvm#78194)

This PR adds lowering the reference to a function that returns a
procedure pointer. It also fixed intrinsic ASSOCIATED to take such
argument.

---------

Co-authored-by: jeanPerier <[email protected]>
boomanaiden154 and others added 20 commits January 31, 2024 12:55
This patch adjusts the Docker container intended for CI use to contain a
PGO+ThinLTO+BOLT optimized clang. The toolchain is built within a Github
action and takes ~3.5 hours. No caching is utilized. The current PGO
optimization is fairly minimal, only running clang over hello world.
This can be adjusted as needed.
…encies

Removes the MaterializationResponsibility::addDependencies and
addDependenciesForAll methods, and transfers dependency registration to
the notifyEmitted operation. The new dependency registration allows
dependencies to be specified for arbitrary subsets of the
MaterializationResponsibility's symbols (rather than just single symbols
or all symbols) via an array of SymbolDependenceGroups (pairs of symbol
sets and corresponding dependencies for that set).

This patch aims to both improve emission performance and simplify
dependence tracking. By eliminating some states (e.g. symbols having
registered dependencies but not yet being resolved or emitted) we make
some errors impossible by construction, and reduce the number of error
cases that we need to check. NonOwningSymbolStringPtrs are used for
dependence tracking under the session lock, which should reduce
ref-counting operations, and intra-emit dependencies are resolved
outside the session lock, which should provide better performance when
JITing concurrently (since some dependence tracking can happen in
parallel).

The Orc C API is updated to account for this change, with the
LLVMOrcMaterializationResponsibilityNotifyEmitted API being modified and
the LLVMOrcMaterializationResponsibilityAddDependencies and
LLVMOrcMaterializationResponsibilityAddDependenciesForAll operations
being removed.
The inf and nan string index bounds checks were after the index was
being used. This patch moves the index usage to the end of the
condition.

Fixes llvm#79988
…TOC (llvm#79530)

This patch adds support for common and local symbols in the TOC for AIX.
Note that we need to update isVirtualSection so as a common symbol in
TOC will have the symbol type XTY_CM and will be initialized when placed
in the TOC so sections with this type are no longer virtual.

---------

Co-authored-by: Zaara Syeda <[email protected]>
…issue. (llvm#79398)

There are currently a few checkers that don't fill in the bug report's
"decl-with-issue" field (typically a function in which the bug is
found).

The new attribute `[[clang::suppress]]` uses decl-with-issue to reduce
the size of the suppression source range map so that it didn't need to
do that for the entire translation unit.

I'm already seeing a few problems with this approach so I'll probably
redesign it in some point as it looks like a premature optimization. Not
only checkers shouldn't be required to pass decl-with-issue (consider
clang-tidy checkers that never had such notion), but also it's not
necessarily uniquely determined (consider leak suppressions at
allocation site).

For now I'm adding a simple stop-gap solution that falls back to
building the suppression map for the entire TU whenever decl-with-issue
isn't specified. Which won't happen in the default setup because luckily
all default checkers do provide decl-with-issue.

---------

Co-authored-by: Balazs Benics <[email protected]>
Add a new node `AArch64ISD::URSHR_I_PRED`.

`srl(add(X, 1 << (ShiftValue - 1)), ShiftValue)` is transformed to
`urshr`, or to `rshrnb` (as before) if the result it truncated.

`uzp1(rshrnb(uunpklo(X),C), rshrnb(uunpkhi(X), C))` is converted to
`urshr(X, C)` (tested by the wide_trunc tests).

Pattern matching code in `canLowerSRLToRoundingShiftForVT` is taken
from prior code in rshrnb. It returns true if the add has NUW or if the
number of bits used in the return value allow us to not care about the
overflow (tested by rshrnb test cases).
Add TableGen patterns to convert more instructions to boolean
expressions:

- **mul -> and/or**: i1 multiply instructions currently cannot be
selected causing the compiler to crash. See
llvm#57404
- **select -> and/or**: Converting selects to and/or can enable more
optimizations. `InstCombine` cannot do this as aggressively due to
poison semantics.
If we can't produce a large enough index vector in i8, we may need to legalize
the shuffle (via scalarization - which in turn gets lowered into stack usage).
This change makes two related changes:
* Deferring legalization until we actually need to generate the vrgather
  instruction.  With the new recursive structure, this only happens when
  doing the fallback for one of the arms.
* Check the actual mask values for something outside of the representable
  range.

Both are covered by recently added tests.
The purpose of m_being_created in these classes was to prevent
broadcasting an event related to these Breakpoints during the creation
of the breakpoint (i.e. in the constructor). In Breakpoint and
Watchpoint, m_being_created had no effect. That is to say, removing it
does not change behavior.
However, BreakpointLocation does still use m_being_created. In the
constructor, SetThreadID is called which does broadcast an event only if
`m_being_created` is false. Instead of having this logic be roundabout,
the constructor instead calls `SetThreadIDInternal`, which actually
changes the thread ID. `SetThreadID` also will call
`SetThreadIDInternal` in addition to broadcasting a changed event.
This is a follow up to an item I noted in my submission comment for
e947f95. I don't have a real world example where this is triggering
unprofitably, but avoiding the transform when we estimate the loop to be
short running from profiling seems quite reasonable. It's also now come
up as a possibility in a regression twice in two days, so I'd like to
get this in to close out the possibility if nothing else.

The original review dropped the threshold for short trip count loops. I
will return to that in a separate review if this lands.
…80202)

This partially reverts commit aa964f1
because it caused perf regressions in rccl due to drop of -mllvm
-amgpu-kernarg-preload-count=16 from the linker step. Potentially it
could cause similar regressions for other HIP apps using -mllvm options
with -fgpu-rdc.

Fixes: SWDEV-443345
…m#79533)"

This reverts commit 209fe1f.

The original commit failed to due an assertion failure in the unit test
`ProgressReportTest` that the commit added. The Debugger::Initialize()
function was called more than once which triggered the assertion, so
this commit calls that function under a `std::call_once`.
…rts (llvm#79533)""

This reverts commit a5a8cbb.

The test being added by that commit still fails on the assertion that
Debugger::Initialize has been called more than once.
We don't have an AMO instruction for Nand, so with the A extension we
use an LR/SC loop. If we have Zacas we can use a CAS loop instead.

According to the Zacas spec, a CAS loop scales to highly parallel
systems better than LR/SC.
@jeffreytan81 jeffreytan81 requested review from Endilll and a team as code owners January 31, 2024 23:43
Copy link

⚠️ Python code formatter, darker found issues in your code. ⚠️

You can test this locally with the following command:
darker --check --diff -r e80b9436476bba714e843461e03227b222185f7b...2c3da33f7bcef61a4a12573043ae0c46a4d984ae lldb/test/API/commands/target/debuginfo/TestDebugInfoSize.py
View the diff from darker here.
--- TestDebugInfoSize.py	2024-01-31 23:43:10.000000 +0000
+++ TestDebugInfoSize.py	2024-01-31 23:46:07.853204 +0000
@@ -14,11 +14,10 @@
 MAIN_DWO_DEBUGINFO_SIZE = 385
 FOO_DWO_DEBUGINFO_SIZE = 380
 
 
 class TestDebugInfoSize(lldbtest.TestBase):
-
     def get_output_from_yaml(self):
         exe = self.getBuildArtifact("a.out")
         main_dwo = self.getBuildArtifact("a.out-main.dwo")
         foo_dwo = self.getBuildArtifact("a.out-foo.dwo")
 

@jeffreytan81
Copy link
Contributor Author

This PR history has been messed up. Let me create a new one.

@ldionne ldionne removed request for a team March 23, 2024 00:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.