Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Target Modifiers #3716

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

Darksonn
Copy link
Contributor

  • We introduce the concept of "target modifier". A target modifier is a flag where it may be unsound if you link together two compilation units that disagree on the flag.
  • We fail the build if rustc can see two Rust compilation units that do not agree on the exact set of target modifier flags.
  • There are already several existing flags that could fall into this category. There are also hypothetical new flags that do.
  • The error can be silenced using the -Cunsafe-allow-abi-mismatch escape hatch.
  • Not having a stable way to build stdlib crates does not block stabilization of target modifiers.
  • As a future extension we may be able to relax the rules to allow some specific kinds of mismatches.
  • This RFC does not stabilize any target modifiers. That should happen in follow-up MCPs/FCPs/RFCs/etc.

Rendered

Signed-off-by: Alice Ryhl <[email protected]>
@jieyouxu jieyouxu added T-compiler Relevant to the compiler team, which will review and decide on the RFC. A-target_feature target_feature related proposals & ideas labels Oct 24, 2024
float), and Q (128-bit hardware float) extensions all can [potentially change
the ABI][riscv-float], which would increase the number of required targets. The
[E extension][riscv-e] may also change the ABI by limiting the number of
registers used by the I (integer operations) extension.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually RISC-V is one of the few targets that handles this properly. The target features control what instructions codegen is allowed to use while the ABI is specified separately in the target spec. This means that you can enable FP instructions while keeping a soft-float ABI. Notice how the E target uses a separate ABI.

It's the other targets that often have target features which change the ABI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change that does that was merged in the time between writing and posting this RFC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, because I'm the one that signed off on that change being merged.

And rv32e target HAS to have its ilp32e ABI specified, because that's the only avenue for how you inform LLVM that you want to do codegen for rv32e.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the other targets that often have target features which change the ABI.

So, that's what I thought, too. That RISC-V extensions were additive.

How do these "subtractive" extension play with target triples that define a base set of extensions, e.g. is this sound

rustc --target riscv32imac-unknown-none-elf --target-features e

Would it make sense to add a bare set of riscv[32|64]-unknown-<platform>-<libc> target triples (in addition to current triples), and allow user to explicitly add all of the extensions they want to use?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And rv32e target HAS to have its ilp32e ABI specified, because that's the only avenue for how you inform LLVM that you want to do codegen for rv32e.

It looks like the RV32E extension is incompatible with the D extension because of the ILP32E calling convention (from riscv-elf-psabi-doc):

The ILP32E calling convention is not compatible with ISAs that have registers that require load and store alignments of more than 32 bits.

In particular, this calling convention must not be used with the D ISA extension.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the riscv situation so well. Happy to reword if the paragraph is not accurate.

So, it looks accurate to me. Maybe the discussion about target triples is related, but somewhat tangential to this RFC. It may be something that requires its own RFC: when to introduce a new set of target triples for RISC-V targets.

It's a situation that should be fairly rare, especially as the ISA matures even more. However, there are no target triples that specify a riscv32e-unknown-none-elf, for example.

I'm open to working on such an RFC, and linking back to this RFC for the discussion of incompatible RISC-V extensions.

The more I think about it, the more adding a base set of bare RISC-V targets makes sense (following from the riscv64-linux-android target):

  • riscv32-unknown-none-elf
  • riscv64-unknown-none-elf
  • ...

I would think that an attempt to use --target-features e on an I target should be rejected.

I agree, that would make sense to me as well. Unless LLVM is doing some magic to resolve when both E and I extensions are requested, only E is used?

Even then, it would be nice for rustc to reject the pairing using the techniques discussed in this RFC.

More rejections for --target-features e:

  • riscv32gc-*
  • riscv64gc-*
  • any other targets that implicitly include the D and/or I extension
  • --target-features d
  • --target-features i

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM rejects it.

  if (HasI && HasE)
    return getIncompatibleError("i", "e");

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rmsyn

It's a situation that should be fairly rare, especially as the ISA matures even more. However, there are no target triples that specify a riscv32e-unknown-none-elf, for example.

Are you sure about that? https://github.com/rust-lang/rust/blob/a06b7cbe21967a86050fa92dab843c8afda1c28e/compiler/rustc_target/src/spec/targets/riscv32e_unknown_none_elf.rs

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rmsyn

It's a situation that should be fairly rare, especially as the ISA matures even more. However, there are no target triples that specify a riscv32e-unknown-none-elf, for example.

Are you sure about that? https://github.com/rust-lang/rust/blob/a06b7cbe21967a86050fa92dab843c8afda1c28e/compiler/rustc_target/src/spec/targets/riscv32e_unknown_none_elf.rs

Ah, I guess not... I was looking at the output of rustc --print target-list on v1.82.0 stable. Thanks for the link.

Copy link
Member

@RalfJung RalfJung Nov 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that you can enable FP instructions while keeping a soft-float ABI.

That's actually fine on most targets, e.g. on x86 if you set the soft-float target feature then you can enable x87 and SSE and whatnot and keep using the soft-float ABI. Only aarch64 screws this up (from the targets I checked so far).

The interesting question is what happens on RISC-V when I disable the FP instructions on a target that usually uses the hardfloat ABI. Sadly, what LLVM usually does in that case is silently fall back to the softfloat ABI. But I haven't checked for RISC-V.

text/3716-target-modifiers.md Outdated Show resolved Hide resolved
example, if you build four CUs A,B,C,D where D depends on A,B,C and C is the
only one with a different value for `flagname`, then the mismatch is detected
by rustc when compiling D, but `-Cunsafe-allow-abi-mismatch` should be used
when compiling C.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we also require the flag when building D?
In particular, it seems good to require the flag when building the final artifact, to ensure there is some proximity to the place where "everything comes together". When building C, I don't know yet what else will be linked into this binary, so the flag being sound may depend on what happens later when D is built.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? If we consider the scenario I mention in the RFC:

However, real-world scenarios where mismatching a target modifier is necessary are quite uncommon. The only case I'm aware of is the runtime for a sanitizer. For example, when ASAN (address sanitizer) detects a bug, it calls into a special ASAN-failure-handler function. The function for handling ASAN-failures should not be sanitized.

then I think the behavior I proposed is the right one. The CU containing the sanitizer runtime is clearly the odd-one-out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay so there are situations where the crate C can locally say "yes I am sure that this is fine and the rest of the code doesn't have to worry about this".

But the RFC makes this a blanket recommendation for all cases, which I don't think is warranted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the only use case we can think of is an ASAN runtime, it makes me question if we want this flag at all. Sanitizer runtimes are compiler-internal and can use all kinds of funny ways to subvert these checks. If we don't have a use case for ignoring modifier mismatches in user code, then I think we shouldn't add it.
I'd say that we should ship this feature without a stable way to disable it. If users then come to us with good use cases for allowing mismatches we can add it later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other use-cases. Mostly, they are insanely subtle, like the combinations that are relevant to aarch64 with shadow-stack and/or fixed-x18 and a few other features, so they're somewhat inappropriate to discuss in the RFC (aside from "no, truly, they are insanely subtle", perhaps).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, in that case it makes sense to have it. It would be useful to have a few of them listed out to make sure that the "-Cunsafe-allow-abi-mismatch ignores the current crate from mismatch checks" behavior is indeed the right one. But I'm inclined to believe that it is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in my opinion I almost want to say that we should broadly avoid canonizing certain combinations of ABI flags, in the very likely event that they are incorrect in some case we have not yet foreseen.

text/3716-target-modifiers.md Outdated Show resolved Hide resolved
Comment on lines +298 to +302
* When rustc is not doing the final link, different incompatible leaf modules
might not get detected. For instance, using the CUs A,B,C from [the previous
section][not-all-unsound], then if stdlib is CU B and there are two leaf CUs
A and C, then the incompatibility between A and C would not get detected
unless rustc performs the final link.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by this example... so this is linking two Rust crates but not with rustc? When does that happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imagine the Linux kernel where you have two built-in drivers. Neither driver depends on the other driver. Each driver is compiled into an object file by rustc. When performing the final link of all of the object files that make up the kernel, that is done using the usual C linker and not by rustc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will target modifier information be available at that point? Wondering if the system linkers could theoretically learn to detect this in some form.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They could. In fact, .note.gnu.property is an example of exactly that.

text/3716-target-modifiers.md Outdated Show resolved Hide resolved
@elichai
Copy link

elichai commented Nov 4, 2024

I just want to say thank you for incorporating great errors and not unparsable linker errors ❤️

text/3716-target-modifiers.md Show resolved Hide resolved
Comment on lines +298 to +302
* When rustc is not doing the final link, different incompatible leaf modules
might not get detected. For instance, using the CUs A,B,C from [the previous
section][not-all-unsound], then if stdlib is CU B and there are two leaf CUs
A and C, then the incompatibility between A and C would not get detected
unless rustc performs the final link.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will target modifier information be available at that point? Wondering if the system linkers could theoretically learn to detect this in some form.

text/3716-target-modifiers.md Show resolved Hide resolved
Comment on lines +96 to +99
This problem is a new discovery and it's still not clear how to solve it.
Target modifiers will not solve all of the flags; [some flags are just
unfixable and need to be removed][issue130968]. But I expect that target
modifiers will be the solution for some of these flags.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an issue listing flags that we will want to make target modifiers? Might be good to collect future work in one place if that doesn't exist yet.

@davidtwco
Copy link
Member

I think this makes a lot of sense, it's had some time to attract feedback and all of the feedback seems to have been addressed, so I'll start an FCP on it.

@rustbot fcp merge

@davidtwco
Copy link
Member

It turns out you need to use the correct bot to do that.

@rfcbot fcp merge

@rfcbot
Copy link
Collaborator

rfcbot commented Nov 8, 2024

Team member @davidtwco has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels Nov 8, 2024
Copy link
Member

@SparrowLii SparrowLii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It dose makes much sense! Just some small questions

I don't see description for interop, so I think this rfc only deals with compilation between rust crates? There are a lot of projects involving interop (mostly Rust-C/C++) in real scenarios, hopefully there will be similar solutions in the future.

text/3716-target-modifiers.md Show resolved Hide resolved
text/3716-target-modifiers.md Show resolved Hide resolved
Comment on lines +107 to +109
machine code. Some flags can be turned on and off for each CU (compilation
unit) in your project separately, and other flags must be applied to the entire
application as a whole. The typical reason for flags to be in the latter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
machine code. Some flags can be turned on and off for each CU (compilation
unit) in your project separately, and other flags must be applied to the entire
application as a whole. The typical reason for flags to be in the latter
machine code. Some flags can be turned on and off for each crate (which is a compilation unit)
in your project separately, and other flags must be applied to the entire
application as a whole. The typical reason for flags to be in the latter

nit: we have a fancy word for compilation units already :) (then change all later mentions of "CU" to "crate")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this change awkward in "Problems with mixing non-target-modifiers" since it speaks in general terms and would also apply to C or C++ compilation units.

text/3716-target-modifiers.md Show resolved Hide resolved
example, if you build four CUs A,B,C,D where D depends on A,B,C and C is the
only one with a different value for `flagname`, then the mismatch is detected
by rustc when compiling D, but `-Cunsafe-allow-abi-mismatch` should be used
when compiling C.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the only use case we can think of is an ASAN runtime, it makes me question if we want this flag at all. Sanitizer runtimes are compiler-internal and can use all kinds of funny ways to subvert these checks. If we don't have a use case for ignoring modifier mismatches in user code, then I think we shouldn't add it.
I'd say that we should ship this feature without a stable way to disable it. If users then come to us with good use cases for allowing mismatches we can add it later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-target_feature target_feature related proposals & ideas disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. T-compiler Relevant to the compiler team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.