Introduce new relocation for landing pad #452
base: complex-label-lp
Changes from 1 commit
db7c38a
0726ba1
1e21e42
5d43b57
32688be
02546de
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -548,7 +548,9 @@ Description:: Additional information about the relocation | |
<| S - P | ||
.2+| 65 .2+| TLSDESC_CALL .2+| Static | .2+| Annotate call to TLS descriptor resolver function, `%tlsdesc_call(address of %tlsdesc_hi)`, for relaxation purposes only | ||
<| | ||
.2+| 66-190 .2+| *Reserved* .2+| - | .2+| Reserved for future standard use | ||
.2+| 66 .2+| LPAD .2+| Static | .2+| Annotates the landing pad instruction inserted at the beginning of the function. The addend indicates the label value of the landing pad, and the symbol value is the address of the mapping symbol for the function signature, which will have the same address as the function. | ||
Is this relocation only for the func-sig scheme? Based on its description, it looks like so, but the following

That should work for the unlabeled scheme as well; let me think about how to make that clear. |
||
<| | ||
.2+| 67-190 .2+| *Reserved* .2+| - | .2+| Reserved for future standard use | ||
<| | ||
.2+| 191 .2+| VENDOR .2+| Static | .2+| Paired with a vendor-specific relocation and must be placed immediately before it, indicates which vendor owns the relocation. | ||
<| | ||
|
@@ -1582,6 +1584,7 @@ A number of symbols, named mapping symbols, describe the boundaries. | |
| $x.<any> | ||
| $x<ISA> .2+| Start of a sequence of instructions with <ISA> extension. | ||
| $x<ISA>.<any> | ||
| $s<function-signature-string> | Marker for the landing pad instruction. This should only be used with the function signature-based scheme and should be placed only at the beginning of the function. | ||
I don't quite get the purpose of this mapping symbol: It looks like the only references to these symbols come from the

It's mostly for debugging purposes, so it is safe to strip, like all other mapping symbols.

If the purpose is to display function signatures when disassembling, this mechanism seems a bit incomplete (?) I suppose since the relocation is a static one, it would not stay in the binary after static linking, thus if a user disassembles a linked ELF, it's still the label numbers instead of signatures that get displayed?

Update: Assuming it's relying on the mapping symbol having the same address as the lpad insn to associate an lpad insn to a function signature (so that the signature can be displayed when disassembling a linked binary), why do relocations refer to these symbols? |
||
|=== | ||
|
||
The mapping symbol should set the type to `STT_NOTYPE`, binding to `STB_LOCAL`, | ||
|
@@ -2317,6 +2320,85 @@ instructions. It is recommended to initialize `jvt` CSR immediately after | |
csrw jvt, a0 | ||
---- | ||
|
||
==== Landing Pad Relaxation | ||
|
||
Target Relocation::: R_RISCV_LPAD | ||
|
||
Description:: This relaxation type can relax an `lpad` instruction into nothing, | ||
that is, it removes the `lpad` instruction.
|
||
This relaxation type can be performed even without `R_RISCV_RELAX`, | ||
but in that case the linker should pad with `nop` instructions to the same length as the original | ||
instruction sequence. | ||
kito-cheng marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
Condition:: The associated function of this lpad must have local visibility, and | ||
it must not be referenced by any relocation other than `R_RISCV_CALL` and | ||
`R_RISCV_CALL_PLT`. | ||
This relaxation can also be performed when the function has global visibility, | ||
if the symbol does not have a corresponding PLT entry and is not referenced by | ||
the GOT or by any relocation other than `R_RISCV_CALL` and `R_RISCV_CALL_PLT`. | ||
I think we can explain the behavior more clearly if we avoid mentioning symbol visibility. The only important thing here is whether or not a symbol is visible to other ELF modules, i.e. whether or not the symbol is in the dynamic symbol table. Symbol visibility is just one way to control it.

Good suggestion, applied 02546de :)

How should we find a function symbol for a given R_RISCV_LPAD relocation? Should we just look for a function symbol at the same location as the relocation refers to?

Right now, this design requires scanning through the symbol table and relocation table once to figure out which symbols have landing pads. I had previously thought about creating a new section to handle this, but I realized that when dealing with linker relaxation, the best way is still to use relocations to mark them. This approach also avoids introducing a new section format.

Does this mean that

Note: The complexity comes from the following algorithm:

HashMap<Address, LpadLabel> LpadRelocMap;
HashMap<FunctionSymbol, LpadLabel> LpadLabelMap;
for (auto R : LpadRelocations)
  LpadRelocMap.insert(R.Address, R.addend);
for (auto S : FunctionSymbols)
  if (S.Address in LpadRelocMap)
    LpadLabelMap.insert(S, LpadRelocMap[S.Address]); |
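The two-pass lookup described in the comment above can be sketched in runnable form. This is only an illustration under stated assumptions: the `LpadReloc` and `FuncSymbol` records and the `map_functions_to_labels` helper are hypothetical names, not part of any linker, and the addresses are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LpadReloc:
    offset: int  # address of the lpad instruction the relocation applies to
    addend: int  # landing pad label value carried in the addend


@dataclass(frozen=True)
class FuncSymbol:
    name: str
    address: int


def map_functions_to_labels(relocs, symbols):
    """Return {function name: lpad label} for functions that start with an lpad."""
    # Pass 1: index every R_RISCV_LPAD relocation by the address it patches.
    label_at = {r.offset: r.addend for r in relocs}
    # Pass 2: a function has a landing pad if a relocation sits at its address.
    return {s.name: label_at[s.address] for s in symbols if s.address in label_at}


# Hypothetical input: foo and baz begin with an lpad, bar does not.
relocs = [LpadReloc(0x1000, 0x123), LpadReloc(0x1040, 0x456)]
symbols = [FuncSymbol("foo", 0x1000), FuncSymbol("bar", 0x1020), FuncSymbol("baz", 0x1040)]
labels = map_functions_to_labels(relocs, symbols)
```

Each table is scanned once, so the cost is linear in the number of relocations plus the number of function symbols.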
||
|
||
Relaxation:: | ||
- The `lpad` instruction associated with `R_RISCV_LPAD` can be removed. | ||
- The `lpad` instruction associated with `R_RISCV_LPAD` can be replaced with a `nop` | ||
instruction if the relocation isn't paired with `R_RISCV_RELAX`. | ||
|
||
Example:: | ||
+ | ||
-- | ||
Relaxation candidate: | ||
[,asm] | ||
---- | ||
lpad 0x123 # R_RISCV_LPAD, R_RISCV_RELAX | ||
---- | ||
|
||
Relaxation result: | ||
[,asm] | ||
---- | ||
# No instruction | ||
---- | ||
Can be relaxed into `nop` if no `R_RISCV_RELAX` is paired with `R_RISCV_LPAD`. | ||
[,asm] | ||
---- | ||
nop | ||
---- | ||
-- | ||
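One way to read the condition and relaxation rules above is as a small decision function. The sketch below is an interpretation of the text as written in this revision, not a definitive linker implementation; `relax_lpad`, `LpadAction`, and the boolean parameters are hypothetical names chosen for illustration.

```python
from enum import Enum

# Relocations that are allowed to reference the function for the lpad to be dropped.
CALL_RELOCS = {"R_RISCV_CALL", "R_RISCV_CALL_PLT"}


class LpadAction(Enum):
    KEEP = "keep"      # conditions not met, leave the lpad alone
    REMOVE = "remove"  # delete the instruction (paired with R_RISCV_RELAX)
    NOP = "nop"        # overwrite with nop to preserve code size


def relax_lpad(is_local, has_plt_entry, in_got, referencing_relocs, has_relax):
    """Decide what to do with an lpad annotated by R_RISCV_LPAD."""
    only_direct_calls = set(referencing_relocs) <= CALL_RELOCS
    if is_local:
        eligible = only_direct_calls
    else:
        # Global symbols additionally need no PLT entry and no GOT reference.
        eligible = only_direct_calls and not has_plt_entry and not in_got
    if not eligible:
        return LpadAction.KEEP
    return LpadAction.REMOVE if has_relax else LpadAction.NOP
```

For example, a local function reached only through `R_RISCV_CALL` yields `REMOVE` when `R_RISCV_RELAX` is present and `NOP` when it is not.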
|
||
==== Landing Pad Scheme Relaxation | ||
|
||
Target Relocation::: R_RISCV_LPAD | ||
|
||
Description:: This relaxation type allows an `lpad` instruction to be relaxed | ||
into `lpad 0`, which is a universal landing pad that ignores the label value | ||
comparison. This relaxation is used when the label value is not computed | ||
correctly. | ||
What would be the cases where a label may be computed incorrectly?

Some legacy programs don’t properly declare function prototypes before calling them. In these cases, the compiler will infer a function prototype based on the language standards, but it often ends up being incorrect. One common example is dhrystone[1]. In most versions you find online, Func_2 isn’t declared before it’s called, so the compiler will assume the prototype is

[1] https://github.com/sifive/benchmark-dhrystone/blob/master/dhry_1.c#L164

Another common potential issue in C is with qsort. Function pointers can be compatible but not perfectly match the expected type. For example, here’s how qsort is declared:

void qsort(void* ptr, size_t count, size_t size, int (*comp)(const void*, const void*));

But in practice, you can pass in a compatible, but not exactly matching, type for the comparison function, and it works in most cases:

#include <stdlib.h>
int compare(int *a, int *b) // The signature isn’t int (*)(const void*, const void*)
{
    return *(int *)a - *(int *)b;
}
void foo(int *x, size_t count, size_t size)
{
    qsort(x, count, size, compare); // But in practice, this works fine
}

But how is the linker expected to know the incorrectness so it can perform this relaxation? The Zicfilp mechanism is employed when issuing an indirect call through function pointers, and when calling functions through PLT:

In the first case (indirect calls through pointers), to know that an lpad insn needs to be relaxed to

In the second case (calls through PLT), the indirect call happens in the PLT, which is generated by linkers. The label which linkers use to generate PLT would come from the addend of the

The above is my guess and understanding of the intended usage of this relaxation. If we're not on the same page, please do let me know.

The linker never knows (or does not always know), and that's also not the right layer to analyze (or guess :P), so I expect that this relaxation should only be enabled when the user passes something like |
||
|
||
Condition:: This relaxation can be performed without `R_RISCV_RELAX`, and | ||
should not be enabled by default. The user must explicitly enable this | ||
relaxation, and it should only be applied during static linking. | ||
Q: What happens in the case of dynamic linking?

I gave it some more thought, and this can actually be applied beyond just static linking. However, dependent shared libraries won’t automatically convert along with it. I’ve removed that limitation and added a NOTE to explain the situation. |
||
|
||
Relaxation:: | ||
- Lpad instruction associated with `R_RISCV_LPAD` will be replaced with | ||
`lpad 0`. | ||
|
||
Example:: | ||
+ | ||
-- | ||
Relaxation candidate: | ||
[,asm] | ||
---- | ||
lpad 0x123 # R_RISCV_LPAD | ||
---- | ||
|
||
Relaxation result: | ||
[,asm] | ||
---- | ||
lpad 0 | ||
---- | ||
-- | ||
|
||
NOTE: This relaxation is designed to be compatible with legacy programs that | ||
may not declare the function signature correctly. | ||
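At the encoding level, the rewrite to `lpad 0` amounts to clearing the label field. The sketch below assumes the Zicfilp encoding in which `lpad` reuses the AUIPC major opcode (`0x17`) with `rd = x0` and carries the 20-bit label in bits 31:12; the helper names are hypothetical and the code is an illustration, not a linker implementation.

```python
AUIPC_OPCODE = 0x17  # assumed: lpad shares the AUIPC major opcode, with rd = x0


def encode_lpad(label):
    """Encode `lpad label` as a 32-bit instruction word."""
    if not 0 <= label < (1 << 20):
        raise ValueError("lpad label must fit in 20 bits")
    return (label << 12) | AUIPC_OPCODE


def is_lpad(insn):
    # Low 12 bits hold the opcode (bits 6:0) and rd (bits 11:7); rd must be x0.
    return (insn & 0xFFF) == AUIPC_OPCODE


def relax_to_universal(insn):
    """Rewrite an lpad annotated by R_RISCV_LPAD into the universal `lpad 0`."""
    if not is_lpad(insn):
        raise ValueError("R_RISCV_LPAD must point at an lpad instruction")
    return encode_lpad(0)
```

Because the instruction length does not change, this rewrite does not shift any addresses, which is consistent with it not requiring `R_RISCV_RELAX`.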
|
||
[bibliography] | ||
== References | ||
|
||
|
What is the "label value of the landing pad"?