RFC: Unsafe Set Enum Discriminants #3727

Open: wants to merge 7 commits into base: master

Member

@jamesmunns jamesmunns commented Nov 8, 2024

Summary

This RFC proposes a way to write the discriminant of an enum when building it "from scratch". It introduces two new library components: an unsafe set_discriminant function and a discriminant_of! macro, both of which can be used with any enum, regardless of repr or any other details.

This RFC is a follow-on to the offset_of_enum feature, tracked by rust-lang/rust#120141.
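For context, here is a minimal sketch of what building an enum "from scratch" looks like today when the enum has an explicit primitive repr, the one case where the tag location is already defined. The `Shape` enum, the `SquareRepr` mirror struct, and `build_square` are illustrative names, not part of the RFC; the proposed `set_discriminant` would aim to cover enums without such a repr.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

// Illustrative enum (not from the RFC). With `#[repr(u8)]` the layout is
// defined: a `u8` tag at offset 0, followed by the variant's fields laid out
// as in a `#[repr(C)]` struct.
#[repr(u8)]
#[derive(Debug, PartialEq)]
enum Shape {
    Circle { radius: u32 } = 0,
    Square { side: u32 } = 1,
}

// Hand-written mirror of the `Square` variant's layout (an assumption that
// only holds because of the explicit `#[repr(u8)]` above).
#[repr(C)]
struct SquareRepr {
    tag: u8,
    side: u32,
}

fn build_square(side: u32) -> Shape {
    let mut slot = MaybeUninit::<Shape>::uninit();
    let p = slot.as_mut_ptr().cast::<SquareRepr>();
    unsafe {
        // Initialize the payload first, then write the tag by hand.
        addr_of_mut!((*p).side).write(side);
        addr_of_mut!((*p).tag).write(1);
        // Both the fields and the tag are now initialized.
        slot.assume_init()
    }
}
```

Calling `build_square(3)` yields `Shape::Square { side: 3 }`. The same trick is not available for default-repr or niche-optimized enums, because their tag encoding is unspecified.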

Rendered

@ehuss ehuss added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Nov 8, 2024
@jamesmunns

This comment was marked as resolved.

@ehuss ehuss added T-lang Relevant to the language team, which will review and decide on the RFC. T-opsem Relevant to the operational semantics team, which will review and decide on the RFC. labels Nov 8, 2024

When this function is called, it MUST be called AFTER the variant of the enum has been fully initialized, as in some cases it may be necessary to read back these values. This is discussed more in the next section.

Semantically, `set_discriminant` is specified to optionally write the discriminant (when necessary), and then read back the discriminant. If the read-back discriminant does not match the expected value, then the behavior is undefined.
Member Author

@RalfJung I would appreciate it if you could let me know if I got this right! I don't totally understand the "mandatory readback", so if this could be stated better, please feel free to propose alternate wording.

Contributor

@traviscross traviscross Nov 14, 2024

It'd also be good for this (or the following) section to be extended with more details about why these are the correct semantics. I think I see the case for and agree that setting a MaybeUninit<Option<&u8>> to Some with set_discriminant should represent an assertion that the relevant bits aren't zero at the point of that call, but it's natural to wonder about this, and so it'd be good to describe it in some detail.
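A rough model of the read-back for the MaybeUninit<Option<&u8>> case may help make this concrete. The helper `model_set_some` below is a hypothetical stand-in, not the proposed API: it writes nothing (the niche encodes Some as any non-null pointer) and only checks the payload bits. Where the RFC would specify undefined behavior on a mismatch, this model panics instead.

```rust
use std::mem::MaybeUninit;

/// Hypothetical model of "set the discriminant to `Some`" for `Option<&u8>`.
/// There is no separate tag to write; the read-back checks that the payload
/// bits are non-null, which is what makes the value decode as `Some`.
///
/// Safety: `slot` must point to storage whose payload has been initialized.
unsafe fn model_set_some(slot: *mut Option<&u8>) {
    // `Option<&u8>` is guaranteed to be layout-compatible with a nullable
    // pointer, so the payload can be inspected as `*const u8`.
    let bits = unsafe { slot.cast::<*const u8>().read() };
    // The real proposal would make a mismatch UB; the model panics.
    assert!(!bits.is_null(), "payload bits would decode as None");
}

fn init_some<'a>(value: &'a u8) -> Option<&'a u8> {
    let mut slot = MaybeUninit::<Option<&'a u8>>::uninit();
    unsafe {
        // Step 1: fully initialize the variant's payload.
        slot.as_mut_ptr().cast::<&'a u8>().write(value);
        // Step 2: "set" the discriminant, which here is only the read-back.
        model_set_some(slot.as_mut_ptr());
        slot.assume_init()
    }
}
```

This ordering is why the payload has to be initialized before the call: in the niche case the discriminant is derived from the payload bits, so there is nothing meaningful to read back beforehand.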

@jamesmunns
Member Author

Addressed all review comments so far. Link to the diff of changes since the initial version: https://github.com/rust-lang/rfcs/pull/3727/files/5c74b691972d36d30e27f253e8e2673d9930316f..eb486ef529eea98b7d2d9b0ffcd2254ac16a4ea0

@coolreader18

coolreader18 commented Nov 14, 2024

What's the current legality of initializing niche-optimized enums?

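use std::ptr::NonNull;

let mut x: Option<NonNull<u8>> = None;
// Raw-pointer `write` is an unsafe fn, so the call needs an unsafe block.
unsafe { (&raw mut x).cast::<NonNull<u8>>().write(NonNull::dangling()) }; // UB?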

I suppose there are guarantees made about the layout of Option for FFI purposes, but this feels... sketchy (though miri says it's fine, including with more complex niches like char). That is to say, unless these semantics are already defined, I feel like the section about calling set_discriminant on niche-optimized enums should be strengthened from "encouraged for explicitness" to "should"/"must". Maybe with an exception for Option/Result-like enums with a single niche at 0, since FFI-compatibility guarantees are already made for those, but I'd err on the side of not allowing it if possible.

@matthieu-m

I was thinking about the (possible) future alternative receivers for set_discriminant:

  • Having to call .as_mut_ptr() is not so bad, what is more annoying is the loss of guarantees. Yes my &mut MaybeUninit has given me a non-null & sufficiently-aligned pointer, thank you very much. This is the true boilerplate.
  • Having set_discriminant_mu and set_discriminant_nn (or whatever) feels... heavy. Remembering the specific names is going to be annoying.
  • A SetDiscriminant::set_discriminant trait or a set_discriminant<T: SealedDiscriminant>(t: T, ..) method muddle safety guarantees: it makes it harder, at the call site, to ensure that the expected receiver is used, and it makes it brittle with regard to refactorings. Change the "receiver" from NonNull<T> to *mut T? Make sure the pointer is non-null now!
  • Inherent methods on NonNull and MaybeUninit directly are another possibility. If just called set_discriminant they would have some of the disadvantages of the trait-based approach, but for MaybeUninit a write_discriminant inherent method could be more in keeping with the current write... API and the different name would prevent a maintenance mistake.

So, all in all, for now I see two alternatives I like:

  • Don't introduce any variant. Bit boilerplatey in safety guarantees, but very explicit, and explicit is not so bad for unsafe.
  • Only introduce a single variant: MaybeUninit::<T>::write_discriminant(&mut self, disc: Discriminant<T>). It's natural to have, and offers a lot of guarantees that *mut T doesn't -- non null, sufficiently aligned, exclusive access -- leaving only "correctly initialized" to fulfill.
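To illustrate the second alternative, here is a minimal sketch of a write_discriminant method on MaybeUninit. It is an approximation under several assumptions: an extension trait stands in for an inherent method (inherent methods cannot be added to MaybeUninit outside std), a raw `u8` tag stands in for Discriminant<T>, and the illustrative `Packet` enum uses `#[repr(u8)]` so that the tag location is defined today.

```rust
use std::mem::MaybeUninit;

// Illustrative enum (not from the RFC): `u8` tag at offset 0, payload at
// offset 1 for the `Byte` variant, as defined by `#[repr(u8)]`.
#[repr(u8)]
#[derive(Debug, PartialEq)]
enum Packet {
    Empty = 0,
    Byte(u8) = 1,
}

// Hypothetical extension trait standing in for an inherent method.
trait WriteDiscriminant {
    /// Safety: the fields of the variant selected by `tag` must already be
    /// initialized, and `tag` must be a valid discriminant.
    unsafe fn write_discriminant(&mut self, tag: u8);
}

impl WriteDiscriminant for MaybeUninit<Packet> {
    unsafe fn write_discriminant(&mut self, tag: u8) {
        // `&mut self` already guarantees a non-null, aligned, exclusive
        // pointer, so none of those obligations fall on the caller.
        unsafe { self.as_mut_ptr().cast::<u8>().write(tag) }
    }
}

fn build_byte(value: u8) -> Packet {
    let mut slot = MaybeUninit::<Packet>::uninit();
    unsafe {
        // Initialize the payload first (offset 1 under `#[repr(u8)]`).
        slot.as_mut_ptr().cast::<u8>().add(1).write(value);
        // Then write the tag through the reference-based receiver.
        slot.write_discriminant(1);
        slot.assume_init()
    }
}
```

The only safety comment left at the call site concerns initialization, which matches the "leaving only correctly initialized to fulfill" point above.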

Comment on lines +290 to +298
## Should we provide alternate forms of `set_discriminant`?

There was some discussion when writing this RFC what the type of the first argument to `set_discriminant` should be:

* `&mut MaybeUninit<T>`
* `NonNull<T>`
* `*mut T`

`*mut T` was chosen as the most general option; however, it is likely desirable to accept other forms as well for convenience.
Contributor

Unresolved questions are questions that must be answered before stabilization. In this case, the other forms could always be added later, so it may be worth moving this to a future possibility instead.

That is, unless you mean for there to be an open question about whether we'd want the *mut T form at all, but I think @RalfJung sufficiently addressed that in the Zulip thread about how it would be inconsistent with our many other such functions for there to not be a *mut T form of this.
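As a small illustration of why the *mut T form is the most general of the three receivers, the sketch below shows that the other two lower to it infallibly with existing std methods, while going the other way needs a runtime check. The helper names are made up for this example.

```rust
use std::mem::MaybeUninit;
use std::ptr::NonNull;

// `&mut MaybeUninit<T>` lowers to `*mut T` without any check.
fn from_maybe_uninit<T>(slot: &mut MaybeUninit<T>) -> *mut T {
    slot.as_mut_ptr()
}

// `NonNull<T>` lowers to `*mut T` without any check.
fn from_non_null<T>(ptr: NonNull<T>) -> *mut T {
    ptr.as_ptr()
}

// Going back from `*mut T` to `NonNull<T>` is fallible: the pointer has to
// be checked for null first.
fn try_non_null<T>(ptr: *mut T) -> Option<NonNull<T>> {
    NonNull::new(ptr)
}
```

A convenience form taking either of the stronger receivers could therefore be a thin wrapper over the raw-pointer function, which is consistent with adding such forms later.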

7 participants