Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relaxed DST field ordering #3713

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

clarfonthey
Copy link
Contributor

@clarfonthey clarfonthey commented Oct 19, 2024

Summary

Relax the requirements on struct field ordering for dynamically sized fields for repr(Rust) and repr(transparent), such that ?Sized fields can be anywhere in the field list, as long as there is only one.

Note that this RFC previously added changes to repr(C), but these have since been removed.

Rendered

@clarfonthey
Copy link
Contributor Author

Also going to pre-empt this because, while it's not mentioned in the RFC because I genuinely could not find any info in the reference about it, it will probably come up:

What do we do about references to ZST fields after the DST in repr(C) structs? I'll be honest, I have no idea if references to ZST fields are even meaningful/useful or used, but right now there appears to be no documentation on it.

If people actually use references to ZST fields in repr(C) structs, I propose defining fields after the DST field to always have the same offset as the DST field itself, i.e., if said field were zero. We can treat this internally be reordering the ZSTs right before the DST.

But I had been effectively treating this RFC as if every reference to a ZST doesn't matter, and if people use ptr::eq for them they're doing something wrong and unspecified. Although I'm sure this is interesting enough to give a few people a headache!

@zachs18
Copy link
Contributor

zachs18 commented Oct 19, 2024

[in repr(C) structs] I propose defining fields after the DST field to always have the same offset as the DST field itself, i.e., if said field were zero. We can treat this internally be reordering the ZSTs right before the DST.

I don't think we should allow putting fields after the DST tail in a repr(C) struct if they are just going to be reordered anyway. IMO the field order should be consistent1; either we don't allow fields after the DST tail in repr(C) structs, or we allow them and they are at an appropriate (dynamic) offset after the DST tail (though it's IMO fine if this is restricted to ZST fields, for now).

But I had been effectively treating this RFC as if every reference to a ZST doesn't matter, and if people use ptr::eq for them they're doing something wrong and unspecified. Although I'm sure this is interesting enough to give a few people a headache!

I'm not sure what you mean here, but I don't think it is reasonable to assume that std::ptr::eq is unspecified or wrong for ZSTs; the documentation for that function does not currently have any exceptions for ZSTs.

Footnotes

  1. i.e. in a repr(C) struct, each field should be at offset prev_field_offset + prev_field_size, rounded up to this field's alignment, with the first field at offset 0.

@clarfonthey
Copy link
Contributor Author

clarfonthey commented Oct 19, 2024

I'm not sure what you mean here, but I don't think it is reasonable to assume that std::ptr::eq is unspecified or wrong for ZSTs; the documentation for that function does not currently have any exceptions for ZSTs.

You're right that you can make ZST pointers just fine (and indeed, *const () is a common pointer), but what I'm specifically talking about is whether taking a reference to a ZST ever creates a pointer inside its struct at all. Yeah, the lifetime rules seem to imply that, but if let value = (); takes up zero memory and does literally nothing, then how can I be certain that &value actually refers to a location on the stack at all?

Like, since ZST boxes all point to the same dangling value and ZST statics all point to the same dangling value, it doesn't seem to unreasonable that all ZST references would just be the same pointer (modulo alignment) by default unless dereferenced.


However, `repr(Rust)` allows reordering fields, and this is not not reflected in this rule for dynamically sized fields: no matter what you do, the dynamically sized field must be at the end. This is inconsistent with the rest of the language and limits the ability of authors to reorder the fields in a more natural way, like they can for statically sized structs.

Additionally, Rust has fully committed to zero-sized fields being truly invisible to struct layout, encoding this in the definition of `repr(transparent)`. So, why are dynamically sized fields unable to be followed by zero-sized fields, or reordered among statically sized fields in a `repr(Rust)` struct?
Copy link
Member

@kennytm kennytm Oct 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty sure the alignment of a ZST is part of the layout and is visible (and using it in #[repr(transparent)] will indeed raise E0690 "transparent struct needs at most one field with non-trivial size or alignment")

given that the Ref-level explanation assumed #[repr(transparent)] and #[repr(C)] will be using the same logic i think you just need to mention "ZST" in this RFC is restricted to (size 0, align 1) only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully the reworded explanation is better: the new rules apply to both repr(Rust) and repr(transparent), but the restrictions on what ZSTs are allowed for repr(transparent) remain.

@zachs18
Copy link
Contributor

zachs18 commented Oct 19, 2024

whether taking a reference to a ZST ever creates a pointer inside its struct at all.

Yes, &struct.field will give a pointer that is inbounds of struct1.

but if let value = (); takes up zero memory and does literally nothing, then how can I be certain that &value actually refers to a location on the stack at all?

Formally, you can't assume that let value = (); exists on the stack because the stack is not a concept that exists in the context of the Rust Abstract Machine, but this is not specific to ZSTs. It might even be valid for rustc to put let value = [1, 2, 3]; into static memory, for example2. AFAICT, rustc currently never makes locals into statics, so currently let value = (); will have an address on the stack.

ZST statics all point to the same dangling value

While this would be valid, this is not how rustc currently works AFAICT. Currently static X: () = (); will have an address in the same area as other (immutable) statics, and may or may not have the same address as other statics.


For more info: see rust-lang/unsafe-code-guidelines#503

Footnotes

  1. assuming no autoderefing

  2. assuming that it is decided that non-mut locals are not mutable (which is not yet decided IIUC), and/or it is never mutated, and/or possibly other requirements

@hanna-kruppe
Copy link

I don't really understand the motivation for this change. In general, it's nice that I don't have to worry about what's the most space-efficient field order in my types (especially if it depends on type parameters) unless I already care about their layout and therefore add repr attributes. However, in the rare case when I author (or even just look at the implementation of) a type wrapping a DST, I already have to be consciously thinking about memory layout to some degree. It's especially prominent if there are other (size > 0 or align > 1) fields in the mix because the only useful way to think of such a type is "some fixed-size header data followed by the DST". Reflecting this in the source code doesn't seem unnatural or like a chore that I'd want to delegate to the compiler. For example, consider the custom Arc-likes in rowan and triomphe.

I don't want to rule out that there's code where this RFC might be nice to have. But is a "nice to have" feature worth the can of worms w.r.t. ZST addresses and the repr(C) rules? Whatever the solution to that will be, I expect it will cast some doubt on the "simplifies the language" motivation.

@clarfonthey
Copy link
Contributor Author

clarfonthey commented Oct 19, 2024

I guess that my biggest motivation for this change is specifically for the ZST change: it's extremely common to put PhantomData markers at the end of a struct, but a DST disrupts this for what feels like no reason. I don't ever need to reference these fields, so, the reference changes don't matter to me.

For repr(transparent), I feel like it'd be appropriate to just put all ZST offsets at the beginning, since the layout is supposed to be equivalent to the one non-ZST inside it. But I guess that I was overzealous with repr(C) and we really should just keep the requirement for that.

@clarfonthey
Copy link
Contributor Author

Since it seems too contentious, I've removed repr(C) and updated the reference-level explanation to allow reordering of ZSTs for repr(transparent). Since repr(transparent) is intended to treat the entire struct layout like it's the singular sized field, this makes sense, since it means that the offsets of all fields are zero.

I do apologise for kind of rushing this RFC out— it's been something that has come up a few times as being inconsistent (imho), and I kind of just wanted to start a discussion on it rather than opening a pre-RFC on internals, since it feels like said discussion wouldn't substantially impact the RFC contents.

I probably should have thought about repr(C) more and just been extra conservative to exclude it from these new rules at first, so, I've done that now.

@nikomatsakis
Copy link
Contributor

I've not read the RFC in detail, just skimmed the OP, but I do think the rules regarding field ordering are somewhat silly given that (apart from #[repr(C)], etc) we don't guarantee field order so we can opt to reorder if we want to.

That said, I'd have to review, I seem to recall that rustc at least took advantage of this rule for a much more optimized sized check -- and maybe there are other details I've forgotten about.

@clarfonthey
Copy link
Contributor Author

clarfonthey commented Oct 20, 2024

Would love to know how those checks work, since my impression of how repr(Rust) worked is that it would sort fields by alignment and then just order them from most to least alignment, then put the DST at the end if there was one. By this model, it doesn't really matter where the DST is in the list since it has to be separated out anyway, but I guess that it being the last index already is mildly useful.

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Oct 21, 2024
@RalfJung
Copy link
Member

RalfJung commented Oct 24, 2024

What about tuples? We often treat them like repr(Rust) structs, but the fact that the last field of a tuple can be resized actually has profound consequences: it means that in (T, U, V) we actually only reorder the first two fields, and the last field stays last. This is required to support tuple unsizing coercions. I don't think this is compatible with having unsized fields elsewhere in the tuple.

The RFC should probably state explicitly that tuples still only support the last field to be unsized, and explain why -- or explain whatever else is needed to keep tuple unsizing coercions working.


This RFC also makes error reporting harder, since currently we can simply fire off a trait query ensuring all but the last field are sized. But with this RFC we'll have to somehow fire off a query for all fields, process the results, and if more than 1 returns "might not be sized" we have to figure out a good way to report the error. That should maybe be listed as a drawback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants