-
Notifications
You must be signed in to change notification settings - Fork 488
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify UB around immutability & mutation #1385
Clarify UB around immutability & mutation #1385
Conversation
cc @RalfJung Do you have a chance to review this? |
Hm... not sure. This includes things which I would say are commentary, and already implied by the previous sentence which is the actual spec (such as the remark about padding, or the last sentence about writes not modifying the byte's contents). We should at least clearly separate comments from spec -- we don't want to leave the impression that the sentence "Writes which do not modify the byte contents are still mutations" changes the definition of "mutation". Also, "bytes of a value pointed to by a shared ref" is IMO even less clear than the previous formulation. What about just saying "bytes pointed to by a shared ref"? Then IMO the sentence about padding becomes unnecessary. If anything we might want to define "pointed to" -- which we actually kind of do define later
When talking about bytes, there isn't even such a thing as "padding bytes". They only exist when talking about values and how those relate to their byte representation. By rephrasing the entire definition in terms of bytes, I don't think we still need the padding comment -- it is redundant. |
Yeah that's a good question. Maybe add something like "all bytes pointed to by a shared reference, including transitively through other references (shared and mutable) and Boxes, including those stored in fields of compound types"? This isn't even the bullet point about the aliasing model, but I think the old "reached" was intended to capture this transitivity. (The term goes back all the way to the initial commit, so it's hard to tell.) |
I originally had these as notes, but I was worried about over-bulking the section. I can move the unchanged-value addendum back to a note, but I'm certain it's surprising enough to include explicitly. The padding note can be removed when moving to the byte-based definition; but it's worth dropping any mention of "data" then, since I think it's confusing (if bytes are what really matter.)
I will add that. |
This is part of the feedback on rust-lang#1385. Ralf made the point that the immutability definition could be restated solely in terms of bytes, which has the added benefit of no longer requiring the note on padding (since it's a natural consequence of the byte version.) The new wording for shared references also clarifies the case of mutable references behind shared ones, and reintroduces some of the transitivity property that I removed in my previous commit. The wording is separate from that for immutable bindings, since those don't have transitive immutability. This also bumps the definition of bytes pointed to by references and pointers into its own subsection, so that it can be linked to by the UB definition, to avoid duplication. Co-authored-by: Ralf Jung <[email protected]>
I personally found this description of UB confusing, since the use of "reached" suggests that UB only happens for read bytes, and the definition of immutability is not given, allowing for multiple interpretations: does the "data" have to be immutable from the first read? From the creation of the reference? Between reads from the immutable accessor, but not otherwise? etc. This clarifies the actual UB conditions, based on this Zulip interaction: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/What.20exactly.20are.20.22immutable.22.20and.20.22reached.22.20in.20shared.20ref.20UB.3F and this reference discussion: rust-lang#1227 in two ways: * The definition of "data" is clarified to be stated in terms of bytes, in a way that should avoid ambiguity about which bytes are considered. Based on the GH issue, this clarification should also allow for use of a `*mut` pointer through a shared reference, which is not in itself UB. Based on the Zulip issue, the definition includes padding bytes, which may be surprising. * The definition of immutability & mutation for a set of bytes is clarified to mean forbidding *all* non-0-byte writes.
This is part of the feedback on rust-lang#1385. Ralf made the point that the immutability definition could be restated solely in terms of bytes, which has the added benefit of no longer requiring the note on padding (since it's a natural consequence of the byte version.) The new wording for shared references also clarifies the case of mutable references behind shared ones, and reintroduces some of the transitivity property that I removed in my previous commit. The wording is separate from that for immutable bindings, since those don't have transitive immutability. This also bumps the definition of bytes pointed to by references and pointers into its own subsection, so that it can be linked to by the UB definition, to avoid duplication. Co-authored-by: Ralf Jung <[email protected]>
8c18a31
to
70886e3
Compare
These changes should preserve the meaning of the contents. Co-authored-by: Ralf Jung <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @ivanbakel and @RalfJung!
Update books ## rust-lang/reference 7 commits in 1ea0178266b3f3f613b0fabdaf16a83961c99cdb..9cd5c5a6ccbd4c07c65ab5c69a53286280308c95 2023-07-29 22:29:51 UTC to 2023-07-16 20:12:46 UTC - Fix merge queue building twice. (rust-lang/reference#1383) - Clarify UB around immutability & mutation (rust-lang/reference#1385) - mention the extra const UB (rust-lang/reference#1273) - Operator expressions: make the note about division by zero clearer. (rust-lang/reference#1384) - Make unsafe keyword docs less confusing (rust-lang/reference#1379) - Say that division by zero for primitive types panics (rust-lang/reference#1382) - Add CI trigger for merge queues. (rust-lang/reference#1381) ## rust-lang/rust-by-example 3 commits in 8a87926a985ce32ca1fad1be4008ee161a0b91eb..07e0df2f006e59d171c6bf3cafa9d61dbeb520d8 2023-07-24 11:37:55 UTC to 2023-07-24 11:35:36 UTC - Added attribute unused_labels - fixed warning. (rust-lang/rust-by-example#1729) - more explanation about panic (rust-lang/rust-by-example#1728) - chore: add the portuguese version of this project to `readme.md` (rust-lang/rust-by-example#1727) ## rust-lang/rustc-dev-guide 26 commits in b5a12d95e32ae53791cc6ab44417774667ed2ac6..24eebb6df96d037aad285e4f7793bd0393a66878 2023-07-30 11:23:23 UTC to 2023-07-11 06:02:34 UTC - fix(name-resolution): remove unnecessary closing paranthesis (rust-lang/rustc-dev-guide#1760) - fix(macro-expansion.md): fix the article `an` to `a` (rust-lang/rustc-dev-guide#1759) - fix(serialization.md): fix the name of a derive macro (rust-lang/rustc-dev-guide#1756) - fix(serialization.md): add a necessary plural suffix (rust-lang/rustc-dev-guide#1757) - fix(salsa.md): add punctuation to prevent confusion (rust-lang/rustc-dev-guide#1754) - fix(salsa.md): remove duplicate "To Be" verb (rust-lang/rustc-dev-guide#1755) - feat(fuzzing.md): make `halfempty` word a link (rust-lang/rustc-dev-guide#1750) - fix(about.md): use `a` instead of `an` (rust-lang/rustc-dev-guide#1751) - refactor(git.md): make git-scm links clickable (rust-lang/rustc-dev-guide#1747) - fix(walkthrough.md) add a comma operator to eliminate ambiguity (rust-lang/rustc-dev-guide#1749) - fix(git.md): remove a confusing end of sentence character (rust-lang/rustc-dev-guide#1748) - refactor(profiling/with_perf): remove a wrong to be verb (rust-lang/rustc-dev-guide#1746) - refactor(tests/headers): remove duplicate list item (rust-lang/rustc-dev-guide#1745) - refactor(test/headers.md): make the meaning more obvious (rust-lang/rustc-dev-guide#1744) - refactor(tests/ui): remove unnecessary duplicate word (rust-lang/rustc-dev-guide#1743) - refactor(compiletest): remove unnecessary duplicate word (rust-lang/rustc-dev-guide#1742) - generic_arguments.md: substs -> GenericArgs (rust-lang/rustc-dev-guide#1741) - fix(suggested): remove an unnecessary and confusing statement (rust-lang/rustc-dev-guide#1739) - fix(how-to-build-and-run): fix a typo ("fromer" -> "former") (rust-lang/rustc-dev-guide#1736) - fix(how-to-build-and-run): remove a wrong paragraph (rust-lang/rustc-dev-guide#1735) - coverage code has moved (rust-lang/rustc-dev-guide#1728) - linked issue is closed (rust-lang/rustc-dev-guide#1729) - remove duplicate reference in about-this-guide.md (rust-lang/rustc-dev-guide#1734) - Explain more in depth what early and late bound generic parameters are (rust-lang/rustc-dev-guide#1732) - add section for normalization with the new solver (rust-lang/rustc-dev-guide#1731) - Improve cleanup-crew.md with an example post (rust-lang/rustc-dev-guide#1730)
Update books ## rust-lang/book 3 commits in 668c64760b5c7ea654facb4ba5fe9faddfda27cc..72187f5cd0beaaa9c6f584156bcd88f921871e83 2023-08-04 14:42:07 UTC to 2023-08-03 13:36:44 UTC - redirects: change link for the `#![no_std]` tutorial (rust-lang/book#3705) - [chpt10.2] - Small wording changes (rust-lang/book#3724) - Improve sentence (rust-lang/book#3725) ## rust-embedded/book 1 commits in 1e5556dd1b864109985d5871616ae6b9164bcead..99ad2847b865e96d8ae7b333d3ee96963557e621 2023-08-11 06:31:04 UTC to 2023-08-11 06:31:04 UTC - Fix a small typo in qemu.md (rust-embedded/book#359) ## rust-lang/nomicon 1 commits in 302b995bcb24b70fd883980fd174738c3a10b705..388750b081c0893c275044d37203f97709e058ba 2023-08-10 21:15:21 UTC to 2023-08-10 21:15:21 UTC - Document thiscall abi (rust-lang/nomicon#311) ## rust-lang/reference 10 commits in 1ea0178266b3f3f613b0fabdaf16a83961c99cdb..d43038932adeb16ada80e206d4c073d851298101 2023-08-12 19:07:28 UTC to 2023-07-16 20:12:46 UTC - Document thiscall abi (rust-lang/reference#1092) - add section about implied bounds (rust-lang/reference#1261) - Clearly specify the `instruction_set` effects (rust-lang/reference#1307) - Fix merge queue building twice. (rust-lang/reference#1383) - Clarify UB around immutability & mutation (rust-lang/reference#1385) - mention the extra const UB (rust-lang/reference#1273) - Operator expressions: make the note about division by zero clearer. (rust-lang/reference#1384) - Make unsafe keyword docs less confusing (rust-lang/reference#1379) - Say that division by zero for primitive types panics (rust-lang/reference#1382) - Add CI trigger for merge queues. (rust-lang/reference#1381) ## rust-lang/rust-by-example 3 commits in 8a87926a985ce32ca1fad1be4008ee161a0b91eb..07e0df2f006e59d171c6bf3cafa9d61dbeb520d8 2023-07-24 11:37:55 UTC to 2023-07-24 11:35:36 UTC - Added attribute unused_labels - fixed warning. (rust-lang/rust-by-example#1729) - more explanation about panic (rust-lang/rust-by-example#1728) - chore: add the portuguese version of this project to `readme.md` (rust-lang/rust-by-example#1727) ## rust-lang/rustc-dev-guide 31 commits in b5a12d95e32ae53791cc6ab44417774667ed2ac6..b123ab4754127d822ffb38349ce0fbf561f1b2fd 2023-08-14 08:34:59 UTC to 2023-07-11 06:02:34 UTC - fix: stabilize debugger_visualizer (rust-lang/rustc-dev-guide#1766) - feat(part-5-intro): make "Part 5" obvious (rust-lang/rustc-dev-guide#1753) - Update from `#[warn_]` to `#[warning]` diagnostic attributes (rust-lang/rustc-dev-guide#1765) - Add RPITIT documentation (rust-lang/rustc-dev-guide#1764) - fix(visitor.md): fix a type name in a code sample (rust-lang/rustc-dev-guide#1762) - fix(name-resolution): remove unnecessary closing paranthesis (rust-lang/rustc-dev-guide#1760) - fix(macro-expansion.md): fix the article `an` to `a` (rust-lang/rustc-dev-guide#1759) - fix(serialization.md): fix the name of a derive macro (rust-lang/rustc-dev-guide#1756) - fix(serialization.md): add a necessary plural suffix (rust-lang/rustc-dev-guide#1757) - fix(salsa.md): add punctuation to prevent confusion (rust-lang/rustc-dev-guide#1754) - fix(salsa.md): remove duplicate "To Be" verb (rust-lang/rustc-dev-guide#1755) - feat(fuzzing.md): make `halfempty` word a link (rust-lang/rustc-dev-guide#1750) - fix(about.md): use `a` instead of `an` (rust-lang/rustc-dev-guide#1751) - refactor(git.md): make git-scm links clickable (rust-lang/rustc-dev-guide#1747) - fix(walkthrough.md) add a comma operator to eliminate ambiguity (rust-lang/rustc-dev-guide#1749) - fix(git.md): remove a confusing end of sentence character (rust-lang/rustc-dev-guide#1748) - refactor(profiling/with_perf): remove a wrong to be verb (rust-lang/rustc-dev-guide#1746) - refactor(tests/headers): remove duplicate list item (rust-lang/rustc-dev-guide#1745) - refactor(test/headers.md): make the meaning more obvious (rust-lang/rustc-dev-guide#1744) - refactor(tests/ui): remove unnecessary duplicate word (rust-lang/rustc-dev-guide#1743) - refactor(compiletest): remove unnecessary duplicate word (rust-lang/rustc-dev-guide#1742) - generic_arguments.md: substs -> GenericArgs (rust-lang/rustc-dev-guide#1741) - fix(suggested): remove an unnecessary and confusing statement (rust-lang/rustc-dev-guide#1739) - fix(how-to-build-and-run): fix a typo ("fromer" -> "former") (rust-lang/rustc-dev-guide#1736) - fix(how-to-build-and-run): remove a wrong paragraph (rust-lang/rustc-dev-guide#1735) - coverage code has moved (rust-lang/rustc-dev-guide#1728) - linked issue is closed (rust-lang/rustc-dev-guide#1729) - remove duplicate reference in about-this-guide.md (rust-lang/rustc-dev-guide#1734) - Explain more in depth what early and late bound generic parameters are (rust-lang/rustc-dev-guide#1732) - add section for normalization with the new solver (rust-lang/rustc-dev-guide#1731) - Improve cleanup-crew.md with an example post (rust-lang/rustc-dev-guide#1730)
I personally found this description of UB confusing, since the use of "reached" suggests that UB only happens for read bytes, and the definition of immutability is not given, allowing for multiple interpretations: does the "data" have to be immutable from the first read? From the creation of the reference? Between reads from the immutable accessor, but not otherwise? etc.
This clarifies the actual UB conditions, based on this Zulip interaction and this reference discussion in two ways:
*mut
pointer through a shared reference, which is not in itself UB. Based on the Zulip issue, the definition includes padding bytes, which may be surprising.I also tried to avoid use of the word "reachable", since I thought it be implying a transitive property about immutability for references which (to my understanding) doesn't actually apply. This may now fail to cover mutable-references-behind-shared-references in some way, but I'll leave it to review to decide that.