Implement an alternative layout engine #468 #477

brenoguim · 2023-03-06T23:52:26Z

I'm marking this as a draft PR because the main goal is to provide basis to think and discuss. The change is too agressive and so it's wiser to discuss before continuing to invest time in this approach.
In this description I'll talk about what is being done here, the pros and cons, known issues (and known bugs it fixes) and then I'll go into the details of the patch.

This patch create a separate elf-independent class that is responsible to decide where to place sections after resizing. The goal is to create a class that is easily testable, meaning that we can manually write inputs to it and verify the output.
I did not get into the part of adding these tests that would be the highlight of the patch because I wanted to share what I have so far.

Because this class is elf-independent, you can expect to find the following steps in the connection to Patchelf:

Generate the elf-independent layout: elf2layout function
Tell the LayoutEngine which sections must be resized: LayoutEngine::resize
Trigger the LayoutEngine to calculate the new layout: LayoutEngine::updateFileLayout
Read the layout and apply it to Patchelf current datastructures (like phdrs and shdrs): layout2elf

The LayoutEngine methods will always return false should any step fail. That allow us to fallback to the current code if the new code can't properly layout something.

Because of this, this patch introduces two switches: tryLayoutEngine which will trigger the new engine and fallback if it fails, and onlyLayoutEngine which will error out if the new engine fails. The regression tests are passing with onlyLayoutEngine.

The scary part is that all tests are passing even though some things are not implemented:

normalizeNoteSegments is not called, so some note segments could get out of sync
Synchronization of .note.gnu.property and .MIPS.abiflags from writeReplacedSections are not called
There is no difference between executable or shared library. The new segments are always added in the end.
binutilsQuirkPadding is not added

Now on the flip side:

The new engine introduces new segments much less frequently because it first attempts to find whatever hole we have in the address space
If it can't find a hole, it introduces one LOAD segment with the same access rights (RWE) as the segment from which the section was removed.

I changed the CI to invoke tests with both current layout engines and the new one

If you made it until here, I'll describe a bit of the idea behind the LayoutEngine.
The interface exposes the following classes and methods:

Section: This is anything occupies space in the file. So, differently from ELF, the header, the program header table, and the section header table are also considered sections. Aside from the obvious fields name, type, access and align, it also has a pinned field to indicate that this section should not be moved in the virtual address space.

Segment: This is a load segment. The engine deals only with segments that make up the address space.

Layout: A group of Sections and Segments.

LayoutEngine: This uses the PImpl idiom to hide all implementation details from the user and expose truly only the needed methods: constructor, resize, updateFileLayout, layout and getVirtualAddress.

The best way to see how these objects are used is to look at the methods that create the layout from elf and the ones that read the layout to update the elf structures.

The LayoutEngine implementation is based on the following idea:

First it reads the input Sections and Segments to build a datastructure that represents the virtual address space.
Then the transformations are made to this virtual address space structure
And finally, when updateFileLayout is called, the code iterates on the virtual address space structure and update Section and Segment fields.

The virtual address space structure is just a vector of VSegment which can be thought of a segment that was mapped into virtual memory, and each VSegment contains a vector of VSections which are the sections that were loaded by the VSegment.

The whole idea is to focus on the virtual address space because that is what imposes constraints, while file layout is just a byproduct.

Mic92 · 2023-03-08T08:02:35Z

src/patchelf.cc

+  [--try-layout-engine]\n\
+  [--only-layout-engine]\n\


TODO: help text.

Also this will eventually need some manpage text.

I wonder what is the best approach.
If this is to be merged, perhaps we can avoid documenting it in a first moment until more testing is done.
Then at some point we enable and document it. And further down the line we remove that fallback.

But before any of this happens, I'm currently working on testing.

I'm running some stress test (running Patchelf on all binaries in the machine) and I found a couple of issues both in the current implementation and the layout engine.

Moreover I will go through the existing code making sure that at least one test fails of we disable any of the special ifs in the code.

It's easier to test and receive if there is documentation. If the interface is not stable than a better approach is to declare this as an unstable interface that may change.

Implement an alternative layout engine

c7fd649

Mic92 reviewed Mar 8, 2023

View reviewed changes

brenoguim mentioned this pull request Mar 19, 2023

patchelf-0.17.2 seems to corrupt emacs on staging-next #482

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement an alternative layout engine #468 #477

Implement an alternative layout engine #468 #477

brenoguim commented Mar 6, 2023

Mic92 Mar 8, 2023

Mic92 Mar 8, 2023

brenoguim Mar 9, 2023

Mic92 Apr 23, 2023

Implement an alternative layout engine #468 #477

Are you sure you want to change the base?

Implement an alternative layout engine #468 #477

Conversation

brenoguim commented Mar 6, 2023

Mic92 Mar 8, 2023

Choose a reason for hiding this comment

Mic92 Mar 8, 2023

Choose a reason for hiding this comment

brenoguim Mar 9, 2023

Choose a reason for hiding this comment

Mic92 Apr 23, 2023

Choose a reason for hiding this comment