Implement an alternative layout engine #468 #477
Draft
I'm marking this as a draft PR because the main goal is to provide a basis for thinking and discussion. The change is too aggressive, so it's wiser to discuss it before investing more time in this approach.
In this description I'll talk about what is being done here, the pros and cons, known issues (and known bugs it fixes) and then I'll go into the details of the patch.
This patch creates a separate, elf-independent class that is responsible for deciding where to place sections after resizing. The goal is to have a class that is easily testable, meaning that we can manually write inputs to it and verify the output.
I have not yet added these tests, which would be the highlight of the patch, because I wanted to share what I have so far.
Because this class is elf-independent, you can expect to find the following steps in the connection to Patchelf:
1. `elf2layout` function
2. `LayoutEngine::resize`
3. `LayoutEngine::updateFileLayout`
4. `layout2elf`
The `LayoutEngine` methods will always return false should any step fail. That allows us to fall back to the current code if the new code can't properly lay out something.

Because of this, this patch introduces two switches: `tryLayoutEngine`, which will trigger the new engine and fall back if it fails, and `onlyLayoutEngine`, which will error out if the new engine fails. The regression tests are passing with `onlyLayoutEngine`.

The scary part is that all tests are passing even though some things are not implemented:
- `normalizeNoteSegments` is not called, so some note segments could get out of sync.
- The handling of `.note.gnu.property` and `.MIPS.abiflags` from `writeReplacedSections` is not performed.
- `binutilsQuirkPadding` is not added.

Now on the flip side:
I changed the CI to invoke tests with both current layout engines and the new one
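To make the fallback behaviour of the two switches concrete, here is a rough sketch of how the dispatch could look. Only `tryLayoutEngine` and `onlyLayoutEngine` come from the patch; `EngineMode`, `rewriteLayout` and the two callbacks are hypothetical names used for illustration, and the actual wiring in the patch may differ.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical enum: Legacy means neither switch was given.
enum class EngineMode { Legacy, TryLayoutEngine, OnlyLayoutEngine };

// newEngine stands for the whole chain
//   elf2layout -> LayoutEngine::resize -> LayoutEngine::updateFileLayout -> layout2elf
// and returns false as soon as any step fails.
// legacyEngine stands for the current layout code.
// Returns true if the new engine produced the layout, false if the
// current code was used instead.
bool rewriteLayout(EngineMode mode,
                   const std::function<bool()>& newEngine,
                   const std::function<void()>& legacyEngine)
{
    assert(newEngine && legacyEngine); // both callbacks must be set

    if (mode != EngineMode::Legacy) {
        if (newEngine())
            return true;
        // onlyLayoutEngine: failing is an error, never fall back.
        if (mode == EngineMode::OnlyLayoutEngine)
            throw std::runtime_error("layout engine could not lay out the file");
    }
    // tryLayoutEngine after a failure, or no switch at all.
    legacyEngine();
    return false;
}
```

With this shape, running the regression tests in the `OnlyLayoutEngine` mode guarantees that a passing test never silently went through the current code path.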
If you made it this far, I'll describe a bit of the idea behind the LayoutEngine.
The interface exposes the following classes and methods:
- `Section`: This is anything that occupies space in the file. So, unlike in ELF, the header, the program header table, and the section header table are also considered sections. Aside from the obvious fields `name`, `type`, `access` and `align`, it also has a `pinned` field to indicate that this section should not be moved in the virtual address space.
- `Segment`: This is a load segment. The engine deals only with segments that make up the address space.
- `Layout`: A group of `Section`s and `Segment`s.
- `LayoutEngine`: This uses the `PImpl` idiom to hide all implementation details from the user and expose only the methods that are truly needed: `constructor`, `resize`, `updateFileLayout`, `layout` and `getVirtualAddress`.

The best way to see how these objects are used is to look at the methods that create the layout from elf and the ones that read the layout to update the elf structures.
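For readers who want a picture of the interface before opening the diff, the following is a minimal sketch of what it could look like. The class and method names are the ones listed above; the field types, the extra `size`, `offset` and `vaddr` fields, and all method signatures are assumptions. The body of `updateFileLayout` is only a placeholder that packs sections back to back, not the algorithm from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

struct Section {
    std::string name;
    uint32_t type = 0;
    uint32_t access = 0;
    uint64_t align = 1;
    bool pinned = false;   // must not move in the virtual address space
    uint64_t size = 0;     // assumed field
    uint64_t offset = 0;   // assumed field: position in the file
    uint64_t vaddr = 0;    // assumed field
};

struct Segment {           // a load segment; all fields are assumed
    uint64_t vaddr = 0;
    uint64_t size = 0;
    uint64_t offset = 0;
};

struct Layout {
    std::vector<Section> sections;
    std::vector<Segment> segments;
};

class LayoutEngine {
public:
    explicit LayoutEngine(Layout layout);
    ~LayoutEngine();
    bool resize(const std::string& name, uint64_t newSize);
    bool updateFileLayout();
    const Layout& layout() const;
    std::optional<uint64_t> getVirtualAddress(const std::string& name) const;
private:
    struct Impl;                 // PImpl: defined out of the header
    std::unique_ptr<Impl> impl;
};

// Everything below would normally live in the .cpp file.
struct LayoutEngine::Impl {
    Layout layout;
    Section* find(const std::string& name) {
        for (auto& s : layout.sections)
            if (s.name == name) return &s;
        return nullptr;
    }
};

LayoutEngine::LayoutEngine(Layout layout) : impl(std::make_unique<Impl>()) {
    impl->layout = std::move(layout);
}
LayoutEngine::~LayoutEngine() = default;

bool LayoutEngine::resize(const std::string& name, uint64_t newSize) {
    Section* s = impl->find(name);
    if (!s) return false;        // unknown section: report failure
    s->size = newSize;
    return true;
}

bool LayoutEngine::updateFileLayout() {
    // Placeholder policy: pack sections in order, honouring alignment.
    uint64_t cursor = 0;
    for (auto& s : impl->layout.sections) {
        if (s.align == 0) return false;
        cursor = (cursor + s.align - 1) / s.align * s.align;
        assert(cursor % s.align == 0);
        s.offset = cursor;
        cursor += s.size;
    }
    return true;
}

const Layout& LayoutEngine::layout() const { return impl->layout; }

std::optional<uint64_t> LayoutEngine::getVirtualAddress(const std::string& name) const {
    if (const Section* s = impl->find(name)) return s->vaddr;
    return std::nullopt;
}
```

Because every method reports failure through its return value, a caller can stop at the first failing step and fall back to the current code.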
The `LayoutEngine` implementation is based on the following idea:
- The engine uses the `Section`s and `Segment`s to build a data structure that represents the virtual address space.
- When `updateFileLayout` is called, the code iterates over the virtual address space structure and updates the `Section` and `Segment` fields.

The virtual address space structure is just a vector of `VSegment`, which can be thought of as a segment that was mapped into virtual memory, and each `VSegment` contains a vector of `VSections`, which are the sections that were loaded by the `VSegment`.

The whole idea is to focus on the virtual address space because that is what imposes constraints, while file layout is just a byproduct.
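As an illustration of "file layout is a byproduct", here is a small sketch that derives file offsets from a vector of `VSegment`s. The struct fields and the `assignFileOffsets` function are assumptions made for this example and are not taken from the patch. The only constraint it models is the usual ELF rule that a load segment's file offset and virtual address must be congruent modulo the page size.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct VSection {
    uint64_t vaddr = 0;
    uint64_t size = 0;
    uint64_t fileOffset = 0;          // output
};

struct VSegment {
    uint64_t vaddr = 0;
    uint64_t fileOffset = 0;          // output
    std::vector<VSection> sections;   // assumed sorted by vaddr
};

// Walks the virtual address space in order and assigns file offsets.
// Returns false if the input cannot be laid out (same convention as
// the LayoutEngine methods).
bool assignFileOffsets(std::vector<VSegment>& space, uint64_t pageSize)
{
    if (pageSize == 0) return false;

    uint64_t cursor = 0;              // first free byte in the file
    for (auto& seg : space) {
        // Smallest offset >= cursor that is congruent to the segment's
        // virtual address modulo the page size.
        uint64_t want = seg.vaddr % pageSize;
        uint64_t have = cursor % pageSize;
        uint64_t pad = (want + pageSize - have) % pageSize;
        seg.fileOffset = cursor + pad;
        assert(seg.fileOffset % pageSize == seg.vaddr % pageSize);

        uint64_t end = seg.vaddr;     // end of the previous section
        for (auto& sec : seg.sections) {
            if (sec.vaddr < end) return false;   // overlap or out of range
            // Inside a segment, file distance equals address distance.
            sec.fileOffset = seg.fileOffset + (sec.vaddr - seg.vaddr);
            end = sec.vaddr + sec.size;
        }
        cursor = seg.fileOffset + (end - seg.vaddr);
    }
    return true;
}
```

For example, with a page size of `0x1000`, a first segment at `0x400000` holding `0x150` bytes and a second one at `0x601e10`, the second segment is placed at file offset `0xe10`: the first free offset after `0x150` whose low 12 bits match the virtual address.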