Add complex labeling scheme for landing pad
Function-signature-based labeling scheme that follows the "Function types"
mangling rule defined in the Itanium C++ ABI.

With a few specific rules:

- The `main` function uses the signature
   `(int, pointer to pointer to char) returning int` (`FiiPPcE`).
- `_dl_runtime_resolve` uses zero as its landing pad label.
- {Cpp} member functions should use the "Pointer-to-member types" mangling rule
  defined in the _Itanium {Cpp} ABI_ <<itanium-cxx-abi>>.
- Virtual functions in {Cpp} should use the member function type of the base
  class that first defined the virtual function.
- If a virtual function is inherited from more than one base class, it should
  use the type of the first base class. Thunk functions will use the type of
  the corresponding base class.

Co-authored-by: Ming-Yi Lai <[email protected]>
kito-cheng and mylai-mtk committed May 10, 2024
1 parent c992d06 commit 2c51928
Showing 1 changed file with 114 additions and 1 deletion.
115 changes: 114 additions & 1 deletion riscv-elf.adoc
@@ -662,6 +662,7 @@ using all other PLT sytle.
|===
| Default PLT | -
| Simple landing pad PLT | Must use this PLT style when `GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_SIMPLE` is set.
| Complex landing pad PLT | Must use this PLT style when `GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_COMPLEX` is set.
|===

The first entry of a shared object PLT is a special entry that calls
@@ -673,6 +674,9 @@ dynamic linker before the executable is started. Lazy resolution of GOT
entries is intended to speed up program loading by deferring symbol
resolution to the first time the function is called.

The PLT entry is 16 bytes for the default PLT style and the simple landing pad
PLT style, and 32 bytes for the complex landing pad PLT style.

The first entry in the PLT occupies two 16 byte entries for the default PLT style:

[,asm]
@@ -704,7 +708,41 @@ And occupies three 16 byte entries for the simple landing pad PLT style:
nop
----

Subsequent function entry stubs in the PLT take up 16 bytes.
For the complex landing pad PLT style, the first entry occupies two 32 byte entries:

[,asm]
----
1: lpad 0
sub t1, t1, t3 # shifted .got.plt offset + hdr size + 24
auipc t2, %pcrel_hi(.got.plt)
addi t0, t2, %pcrel_lo(1b) # &.got.plt
l[w|d] t3, %pcrel_lo(1b)(t2) # _dl_runtime_resolve
addi t1, t1, -(hdr size + 24) # shifted .got.plt offset
srli t1, t1, log2(32/PTRSIZE) # .got.plt offset
l[w|d] t0, PTRSIZE(t0) # link map
jr t3
nop
nop
----


[,asm]
----
1: lpad 0
auipc t2, %pcrel_hi(.got.plt)
sub t1, t1, t3 # shifted .got.plt offset + hdr size + 24
l[w|d] t3, %pcrel_lo(1b)(t2) # _dl_runtime_resolve
addi t1, t1, -(hdr size + 24) # shifted .got.plt offset
addi t0, t2, %pcrel_lo(1b) # &.got.plt
srli t1, t1, log2(32/PTRSIZE) # .got.plt offset
l[w|d] t0, PTRSIZE(t0) # link map
jr t3
nop
nop
----
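
The size bookkeeping described above can be illustrated with a small Python sketch. It is not part of the ABI text: the names `PltStyle`, `PLT_LAYOUT`, `plt_stub_offset`, and `got_plt_shift` are hypothetical, and the sketch only restates the entry sizes and first-entry sizes given above, together with the `log2(32/PTRSIZE)` shift used by the `srli` instruction in the complex landing pad listing.

```python
from enum import Enum


class PltStyle(Enum):
    DEFAULT = "default"
    SIMPLE_LPAD = "simple landing pad"
    COMPLEX_LPAD = "complex landing pad"


# style -> (entry size in bytes, number of entry-sized slots taken by the first PLT entry)
PLT_LAYOUT = {
    PltStyle.DEFAULT: (16, 2),
    PltStyle.SIMPLE_LPAD: (16, 3),
    PltStyle.COMPLEX_LPAD: (32, 2),
}


def plt_stub_offset(style: PltStyle, index: int) -> int:
    """Byte offset, from the start of the PLT, of the stub for function `index` (0-based)."""
    entry_size, header_slots = PLT_LAYOUT[style]
    return header_slots * entry_size + index * entry_size


def got_plt_shift(ptr_size: int) -> int:
    """Shift amount log2(32 / PTRSIZE) for the complex landing pad PLT style."""
    ratio = 32 // ptr_size
    # ratio is a power of two for PTRSIZE 4 or 8, so bit_length() - 1 is its log2.
    return ratio.bit_length() - 1


# A stub index scaled by the 32 byte entry size becomes a pointer-sized offset.
index = 5
assert (index * 32) >> got_plt_shift(8) == index * 8
```

With 8 byte pointers the shift is 2, and with 4 byte pointers it is 3, so one 32 byte PLT entry always corresponds to one pointer-sized `.got.plt` slot.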

Subsequent function entry stubs in the PLT take up 16 bytes or 32 bytes,
depending on the PLT style.
On the first call to a function, the entry redirects to the first PLT entry
which calls `_dl_runtime_resolve` and fills in the GOT entry for subsequent
calls to the function.
@@ -727,6 +765,19 @@ The code sequences of the PLT entry for the the simple landing pad PLT style:
jalr t1, t3
----

The code sequence of the PLT entry for the complex landing pad PLT style:
[,asm]
----
1: lpad <hash-value-for-function>
auipc t3, %pcrel_hi(function@.got.plt)
l[w|d] t3, %pcrel_lo(1b)(t3)
lui t2, <hash-value-for-function>
jalr t1, t3
nop
nop
nop
----
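
Both `lpad` and `lui` in the listing above carry the same 20-bit label as an immediate. The following Python sketch shows how the two instruction words could be formed. It rests on assumptions taken from outside this document: that `Zicfilp` encodes `lpad <label>` as `auipc x0, <label>`, that `t2` is register `x7`, and the standard RISC-V U-type layout. The helper names `encode_u_type`, `encode_lpad`, and `encode_lui_t2` are hypothetical.

```python
LABEL_MASK = (1 << 20) - 1  # a label is a 20-bit value

OPCODE_AUIPC = 0x17  # assumption: lpad reuses the AUIPC opcode with rd = x0
OPCODE_LUI = 0x37
REG_ZERO = 0
REG_T2 = 7  # t2 is x7 in the standard RISC-V register naming


def encode_u_type(opcode: int, rd: int, imm20: int) -> int:
    """Encode a U-type instruction: imm[31:12] | rd[11:7] | opcode[6:0]."""
    if not 0 <= imm20 <= LABEL_MASK:
        raise ValueError("label must fit in 20 bits")
    return (imm20 << 12) | (rd << 7) | opcode


def encode_lpad(label: int) -> int:
    """Instruction word for `lpad <label>` under the assumption stated above."""
    return encode_u_type(OPCODE_AUIPC, REG_ZERO, label)


def encode_lui_t2(label: int) -> int:
    """Instruction word for `lui t2, <label>`."""
    return encode_u_type(OPCODE_LUI, REG_T2, label)
```

Under these assumptions the upper 20 bits of both instruction words are identical, which is why a single hash value can be used for both `<hash-value-for-function>` operands.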

==== Procedure Calls

`R_RISCV_CALL` and `R_RISCV_CALL_PLT` relocations are associated with
@@ -1469,6 +1520,7 @@ a different features.
| Bit | Bit Name
| 0 | GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_SIMPLE
| 1 | GNU_PROPERTY_RISCV_FEATURE_1_CFI_SS
| 2 | GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_COMPLEX
|===
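
As an illustration of the bit assignments in the table above, the following Python sketch decodes a feature bitmask into bit names. The helper name `decode_feature_bits` and the `FEATURE_BITS` mapping are hypothetical; only the bit positions and bit names come from the table.

```python
FEATURE_BITS = {
    0: "GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_SIMPLE",
    1: "GNU_PROPERTY_RISCV_FEATURE_1_CFI_SS",
    2: "GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_COMPLEX",
}


def decode_feature_bits(mask: int) -> list[str]:
    """Return the names of the known feature bits set in `mask`, lowest bit first."""
    return [name for bit, name in sorted(FEATURE_BITS.items()) if mask & (1 << bit)]


# Example: a mask with bits 1 and 2 set.
for name in decode_feature_bits(0b110):
    print(name)
```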

`GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_SIMPLE` This bit indicate that all executable
@@ -1486,6 +1538,12 @@ compressed instructions then loading an executable with this bit set requires
the execution environment to provide the `Zicfiss` extension or to provide both
the `Zcmop` and `Zimop` extensions.

`GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_COMPLEX` This bit indicates that all executable
sections are built to be compatible with the landing pad mechanism provided by
the `Zicfilp` extension. An executable or shared library with this bit set is
required to generate PLTs with the landing pad (`lpad`) instruction, and every
label is set to a value hashed from the signature of the corresponding function.

=== Mapping Symbol

The section can have a mixture of code and data or code with different ISAs.
@@ -1529,6 +1587,61 @@ attribute is recording for minimal execution environment requirements, so the
ISA information from arch attribute is not enough for the disassembler to
disassemble the `rv64gcv` version correctly.

== Label Value Computation for Complex Labeling Scheme Landing Pad

The label value for the complex labeling scheme landing pad is computed from the
hash of the function signature string. The signature string uses the same
encoding as the "Function types" mangling rule defined in the _Itanium {Cpp} ABI_
<<itanium-cxx-abi>>, and the label value is the lower 20 bits of the MD5 hash
of that string.
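
A minimal Python sketch of this computation follows. The text above does not state how the 16-byte MD5 digest maps to an integer, so the sketch assumes the digest is read as a single big-endian number (the same order in which its hexadecimal form is usually printed). This byte order is an assumption, not something this section specifies. The function name `landing_pad_label` is hypothetical.

```python
import hashlib

LABEL_BITS = 20


def landing_pad_label(signature: str) -> int:
    """Return the lower 20 bits of the MD5 hash of a function signature string."""
    digest = hashlib.md5(signature.encode("ascii")).digest()
    # Assumption: interpret the 16-byte digest as one big-endian integer.
    value = int.from_bytes(digest, "big")
    return value & ((1 << LABEL_BITS) - 1)


# `FiiPPcE` is the signature string given for `main` in the commit message.
print(hex(landing_pad_label("FiiPPcE")))
```

A toolchain that reads the digest in a different byte order would produce different labels, so every producer of labels for a given program has to agree on this detail.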

Additionally, here are a few specific rules for {Cpp} member functions:

- {Cpp} member functions should use the "Pointer-to-member types" mangling rule
defined in the _Itanium {Cpp} ABI_ <<itanium-cxx-abi>>.
- Virtual functions in {Cpp} should use the member function type of the base
class that first defined the virtual function.


Example:

[,cxx]
----
double foo(int, float *);

class Base
{
public:
    virtual void memfunc1();
    virtual void memfunc2(int);
};

class Derived : public Base
{
public:
    virtual void memfunc1();
    virtual void memfunc3(double);
    void memfunc4();
};

class DerivedDerived : public Derived
{
public:
    virtual void memfunc2(int);
    virtual void memfunc3(double);
};
----

The function signatures for the above functions are described below:

- `foo` is encoded as `FdiPfE`.
- `Base::memfunc1` and `Derived::memfunc1` are both encoded as `M4BaseFvvE`.
- `Base::memfunc2` and `DerivedDerived::memfunc2` are both encoded as `M4BaseFviE`.
- `Derived::memfunc3` and `DerivedDerived::memfunc3` are both encoded as `M7DerivedFvdE`.
- `Derived::memfunc4` is encoded as `M7DerivedFvvE`.
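
The encodings listed above can be reproduced with a small Python sketch. It is a simplified illustration, not a full Itanium mangler: it handles only the builtin types and pointers used in the example, models single inheritance only, and matches virtual functions by name rather than by full signature. The `HIERARCHY` table and all function names are hypothetical.

```python
BUILTIN_CODES = {"void": "v", "char": "c", "int": "i", "float": "f", "double": "d"}

# class name -> (single base class or None, names of virtual functions it declares)
HIERARCHY = {
    "Base": (None, {"memfunc1", "memfunc2"}),
    "Derived": ("Base", {"memfunc1", "memfunc3"}),
    "DerivedDerived": ("Derived", {"memfunc2", "memfunc3"}),
}


def encode_type(type_name: str) -> str:
    """Encode a builtin type or a pointer to one, e.g. 'float *' -> 'Pf'."""
    type_name = type_name.strip()
    if type_name.endswith("*"):
        return "P" + encode_type(type_name[:-1])
    return BUILTIN_CODES[type_name]


def function_type(ret: str, params: list[str]) -> str:
    """Encode a function type as F <return> <params> E; an empty list becomes 'v'."""
    args = "".join(encode_type(p) for p in params) or "v"
    return "F" + encode_type(ret) + args + "E"


def first_declaring_class(cls: str, method: str) -> str:
    """Return the topmost ancestor declaring `method` as virtual, else `cls` itself."""
    owner = cls
    parent = HIERARCHY[cls][0]
    while parent is not None:
        if method in HIERARCHY[parent][1]:
            owner = parent
        parent = HIERARCHY[parent][0]
    return owner


def member_function_type(cls: str, method: str, ret: str, params: list[str]) -> str:
    """Encode a member function as M <length-prefixed class name> <function type>."""
    owner = first_declaring_class(cls, method)
    return "M" + str(len(owner)) + owner + function_type(ret, params)


print(function_type("double", ["int", "float *"]))
print(member_function_type("DerivedDerived", "memfunc2", "void", ["int"]))
```

The walk in `first_declaring_class` is what makes an overriding function share the encoding, and therefore the label, of the function it overrides.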

== Linker Relaxation

At link time, when all the memory objects have been resolved, the code sequence