-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Never patterns #3719
base: master
Are you sure you want to change the base?
RFC: Never patterns #3719
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
@@ -0,0 +1,250 @@ | ||||||||||
- Feature Name: `never_patterns` | ||||||||||
- Start Date: 2024-10-27 | ||||||||||
- RFC PR: [rust-lang/rfcs#3719](https://github.com/rust-lang/rfcs/pull/3719) | ||||||||||
- Rust Issue: [rust-lang/rust#118155](https://github.com/rust-lang/rust/issues/118155) | ||||||||||
|
||||||||||
# Summary | ||||||||||
[summary]: #summary | ||||||||||
|
||||||||||
A `!` pattern indicates a type with no valid values. It is used to indicate an impossible case when matching an empty type in unsafe code. | ||||||||||
|
||||||||||
```rust | ||||||||||
enum Void {} | ||||||||||
unsafe fn empty_tup<T>(tup: *const (T, Void)) -> ! { | ||||||||||
unsafe { | ||||||||||
match *tup { ! } | ||||||||||
} | ||||||||||
} | ||||||||||
``` | ||||||||||
|
||||||||||
Note: this RFC is purely about a new pattern syntax. It does not propose changes to exhaustiveness checking or to the operational semantics of uninhabited types. | ||||||||||
|
||||||||||
|
||||||||||
# Motivation | ||||||||||
[motivation]: #motivation | ||||||||||
|
||||||||||
Rust's unsafe semantics are access-based: the validity of data only matters when reading from/writing to a place. When pattern-matching, patterns only access the data (discriminants/values) they need to choose the right arm. | ||||||||||
|
||||||||||
Empty types are funny because the natural way of matching them is to write no arms at all: | ||||||||||
|
||||||||||
```rust | ||||||||||
fn empty_tup<T>(tup: (T, !)) -> ! { | ||||||||||
match tup {} | ||||||||||
} | ||||||||||
``` | ||||||||||
|
||||||||||
Here the absence of an arm plays the role of a "read discriminant" kind of operation. This is fine in this case, but around unsafe code that interacts with possibly-uninitialized data, accesses should be explicitly visible in the pattern. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
|
||||||||||
Today, when matching empty types inside places that may contain uninitialized data, rust requires a dummy match arm: | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
```rust | ||||||||||
enum Void {} | ||||||||||
unsafe fn empty_tup<T>(tup: *const (T, Void)) -> ! { | ||||||||||
unsafe { | ||||||||||
match *tup { | ||||||||||
_ => unreachable_unchecked(), | ||||||||||
// or | ||||||||||
(_, x) => match x {} | ||||||||||
Comment on lines
+44
to
+46
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
The first example seems odd and orthogonal to the RFC. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It may be what people commonly write, but the intent of this PR is better described by |
||||||||||
} | ||||||||||
} | ||||||||||
} | ||||||||||
|
||||||||||
union MyUnion<T: Copy> { | ||||||||||
uninit: (), | ||||||||||
value: (T, !), | ||||||||||
} | ||||||||||
impl<T: Copy> MyUnion<T> { | ||||||||||
unsafe fn assume_init(self) -> ! { | ||||||||||
match self { | ||||||||||
MyUnion { value: (_, x) } => x, | ||||||||||
} | ||||||||||
} | ||||||||||
} | ||||||||||
``` | ||||||||||
|
||||||||||
This RFC proposes a new `!` pattern that works as an explicit "access data" or "assert validity" operation for uninhabited types. The examples above become: | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this line is the key point. It defines what the programmer means when they write However, in safe code (eg, when the scrutinee is a safe reference), this fact - "this is actually sane" - is something that should always be true and shouldn't need to be explained by the programmer. ISTM that this is also a pedagogical problem. If this is needed in safe Rust, auithors of that safe Rust will need an explanation what this thing is for and what it means, but we definitley don't want to burden them thoughts about validity/safety invariants etc. Nor do we want to suiggest to them in discussion of Therefore,
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think we want to make syntax depend on whether something is syntactically inside or outside an unsafe block. That kind of context-dependence can become quite surprising. So if we do anything like this, IMO it should be fully based on the type of the scrutinee place (including the types of its subexpressions, e.g. whether a union field projection is involved). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But just to check, would you consider a possible path forward looking something like:
I think such a lint being location-sensitive would be justifiable in that it doesn't change syntax, but just adds extra checks? Or at least Niko was happy with that in 2018. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. FWIW the path forward I intend to propose is simply to allow omitting arms for Unsafe code authors that do that anyway can write never patterns for such cases (they will not be marked as unreachable), and we may add a lint that requires them if users find this useful. I have moved away from explaining things in terms of "auto-never" out of personal preference, but you could explain this by saying that we always auto-never except inside pointers and unions.
This is indeed precisely the intent of this here RFC. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given IOW, this should depend on whether the matched-on place is a safe place or not. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Indeed, I include this in the exception that "we don't auto-never inside pointers". By "inside pointers" I mean in terms of places. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Okay, understood - so in your view auto-never behaviour (if/when we have it or an equivalent) would need to vary be place, and shouldn't just be a static property of the matched type. But never patterns, as syntax, should be place invariant. This makes sense. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If I may now stop discussion of auto-never and similar considerations: I'd like to keep the discussions on-topic of the never pattern syntax proposal, and discuss changes to which arms can be omitted etc to some other place. For example rust-lang/rust#131452. |
||||||||||
|
||||||||||
```rust | ||||||||||
enum Void {} | ||||||||||
unsafe fn empty_tup<T>(tup: *const (T, Void)) -> ! { | ||||||||||
unsafe { | ||||||||||
match *tup { ! } | ||||||||||
} | ||||||||||
} | ||||||||||
|
||||||||||
union MyUnion<T: Copy> { | ||||||||||
uninit: (), | ||||||||||
value: (T, !), | ||||||||||
} | ||||||||||
impl<T: Copy> MyUnion<T> { | ||||||||||
unsafe fn assume_init(self) -> ! { | ||||||||||
match self { MyUnion { value: (_, !) } } | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
} | ||||||||||
} | ||||||||||
``` | ||||||||||
|
||||||||||
<!-- | ||||||||||
```rust | ||||||||||
enum E { A(i32), B(i32, !) } | ||||||||||
let result: *const E = some_function(); | ||||||||||
unsafe { | ||||||||||
let x = match *result { | ||||||||||
A(x) => x, | ||||||||||
// An arm is required (https://github.com/rust-lang/unsafe-code-guidelines/issues/443) but `!` asserts validity of the `!` data so the arm doesn't need a body. | ||||||||||
B(_, !), | ||||||||||
}; | ||||||||||
// Alternatively: | ||||||||||
let (A(x) | B(_, !)) = *result; | ||||||||||
} | ||||||||||
``` | ||||||||||
```rust | ||||||||||
enum E { A(i32), B(i32, !) } | ||||||||||
let val: *mut E = Box::leak(Box::new(E::A(42))); | ||||||||||
// It may be possible to partially initialize this to the `B` variant | ||||||||||
// in the future. | ||||||||||
unsafe { set_discriminant_to_b(val) }; | ||||||||||
unsafe { | ||||||||||
match *val { | ||||||||||
A(x) => { ... } | ||||||||||
// This branch would then be reached without UB. | ||||||||||
B(..) => println!("Reachable!"), | ||||||||||
} | ||||||||||
} | ||||||||||
// It may be possible to construct a `&!` pointing to | ||||||||||
// uninitialized data (unsafe, but valid). | ||||||||||
let never_ref: &! = ...; | ||||||||||
let result: Result<u8, &!> = Err(never_ref); | ||||||||||
match result { | ||||||||||
Ok(x) => { ... } | ||||||||||
// This branch would then be reached without UB. | ||||||||||
Err(_) => println!("Reachable!"), | ||||||||||
} | ||||||||||
``` | ||||||||||
```rust | ||||||||||
enum Void {} | ||||||||||
let result: &Result<T, Void> = ...; | ||||||||||
let x = match result { | ||||||||||
Ok(x) => x, | ||||||||||
Err(!), // No need for an arm body | ||||||||||
}; | ||||||||||
// Or even | ||||||||||
let (Ok(x) | Err(!)) = result; | ||||||||||
``` | ||||||||||
--> | ||||||||||
|
||||||||||
|
||||||||||
# Guide-level explanation | ||||||||||
[guide-level-explanation]: #guide-level-explanation | ||||||||||
|
||||||||||
Patterns can be used on a partially uninitialized place; the basic rule is that a pattern only accesses data that is directly mentioned in the pattern: an `Ok(..)` pattern requires a discriminant read, a binding requires a full read, a wildcard `_` accesses nothing etc. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
There are also |
||||||||||
|
||||||||||
For uninhabited types (types with no valid values), normal patterns cannot be used to indicate an access since there is no data to be accessed. For this purpose, you can use the special `!` pattern. | ||||||||||
|
||||||||||
The `!` pattern is allowed on any uninhabited type (such as `enum Void {}` or `(Result<!, !>, u32)`) and does an access to the underlying data. Since there can be no such data (on pain of UB), the corresponding arm is known to be unreachable and does not need a body: | ||||||||||
```rust | ||||||||||
enum Void {} | ||||||||||
unsafe fn empty_tup<T>(tup: *const (T, Void)) -> ! { | ||||||||||
unsafe { | ||||||||||
match *tup { ! } | ||||||||||
} | ||||||||||
} | ||||||||||
``` | ||||||||||
|
||||||||||
|
||||||||||
# Reference-level explanation | ||||||||||
[reference-level-explanation]: #reference-level-explanation | ||||||||||
|
||||||||||
We add `!` to the syntax of patterns. A `!` pattern is accepted for any type that is visibly uninhabited (i.e. uninhabited taking into account private fields and `#[non_exhaustive]` annotations) from the item containing the match expression. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are many possible definitions of "uninhabited" so it'd be good to spell this out. E.g., is What I would expect is that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh yeah, I had not considered the safety version of uninhabitedness. This does feel like the appropriate notion to use, I'll change that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just to clarify, if we use this wider definition of "uninhabited" for when the Such syntax was used in this comment from Ralf so I assume this might be the intention? fn abc(value: Option<&!>) {
match value {
None => {},
!
}
} I rather like this, but just to flag that it is a change of behaviour from:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. More generally, would the To create an explicit example, would the following compile, on the justification that:
enum Empty {}
enum MultiNever<T> {
A(T),
B(Empty),
C(Empty),
}
fn abc(value: MultiNever<&MultiNever<u32>>) -> u32 {
match value {
MultiNever::A(MultiNever::A(value)) => *value,
!
}
} If this does compile, I'm not sure if this necessarily fits with the motivation that "accesses should be explicitly visible in the pattern." - but possibly I've got the wrong end of the stick? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just revisiting my musing in the last comment, I missed that the motivation in the RFC actually says more fully that "around unsafe code that interacts with possibly-uninitialized data, accesses should be explicitly visible in the pattern" - which seems right to me. The motivation only cares about accesses which could cause UB being explicitly visible in the pattern, or slightly more loosely that "accesses which could cause UB without breaking a safety invariant should be explicit". And so Ralf's justification is that, references have a safety invariant to point at valid data, and as such have a recursively-safe safety invariant. Therefore it should be okay to effectively view a never pattern as:
By contrast, e.g. pointers or union fields do not have such a safety invariant, so would not satisfy such a recursive safety-based definition of uninhabitedness, and would need to be handled explicitly. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When I said "I chose not to allow what you suggest", I meant specifically things like: let foo: *const Option<!> = ...;
match *foo {
None => ...,
!
} I prefer that the user write There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for the clarification! Apologies - I think my confusion stemmed from my incorrectly conflating (A) whether So just to clarify, I believe that under this RFC, the first 3 compile, but the fourth one does not: enum Empty {}
enum MyEnum<T, T2> {
A(T),
B(T2),
C(Empty),
}
fn fn1(value: MyEnum<&Empty, Empty>) -> u32 {
match value { ! }
}
fn fn2(value: MyEnum<MyEnum<Empty, &(u64, Empty)>, Empty>) -> u32 {
match value { ! }
}
fn fn3(value: MyEnum<&MyEnum<u32, Empty>, &MyEnum<Empty, Empty>>) -> u32 {
match value {
MultiNever::A(a) => match a {
MultiNever::A(out) => *out,
MultiNever::B(!),
MultiNever::C(!),
}
MultiNever::B(!),
MultiNever::C(!),
}
}
// DOES NOT COMPILE - no support for wildcard !
fn fn4(value: MyEnum<u32, Empty>) -> u32 {
match value {
MyEnum::A(a) => a,
!
}
} There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Good point! Well, in the spirit of discussion, let me propose for debate the "wildcard !" or more precisely the "safety-uninhabitable binding" definition of the never pattern. Such a pattern would:
If this is well defined (and, to be fair, that's a big if), it appears to me that it's a strictly more powerful extension to this proposal - in the sense that it is allowed in a superset of places. I was also wondering it it were too powerful, but I think on balance that it's likely OK. A It has benefits that it can mostly solve the ergonomics issues in this issue without auto-never, by only requiring a user to add a Users in unsafe land could be more careful, and use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
correct!
I believe this is well-defined, and agree that this would be allowed in strictly more places than the RFC currently proposes.
If that's the main motivation I'd much rather have auto-never, because I'd much rather avoid safe users needing to add I tried to separate this RFC from auto-never discussions but maybe they're too entangled for that 🤔. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I agree that we need an auto-never equivalent... As you've said in the comment below, maybe we need to explore that first. But I think a more flexible never pattern could theoretically be beneficial to people writing unsafe too. For example, in these quite hypothetical scenarios:
Whilst I'm naturally drawn to neat generalisations, I accept that the use case argument isn't super strong. That said, (ignoring the implementation difficulty argument for now) I feel a good argument against a more general / flexible pattern should be motivated in why a restriction makes sense. For example:
|
||||||||||
|
||||||||||
A pattern that contains a `!` is called a _never pattern_. For or-patterns, all alternatives must be never patterns for the whole pattern to be too. For example `Ok(x) | Err(!)` is not a never pattern but `(_, !)` and `Ok((_, !)) | Err(!)` are. | ||||||||||
|
||||||||||
A never pattern is accepted as an arm in a match expression, and that arm takes no body nor guards. | ||||||||||
|
||||||||||
```rust | ||||||||||
let x: *const Option<(u32, (u32, !))> = ...; | ||||||||||
match *x { | ||||||||||
None => { ... } | ||||||||||
Some((_, !)), | ||||||||||
} | ||||||||||
``` | ||||||||||
|
||||||||||
A never pattern is also accepted in other places a pattern would be (`if let`, destructuring `let`, etc.), and makes code that corresponds to that branch unreachable. | ||||||||||
|
||||||||||
```rust | ||||||||||
enum Void {} | ||||||||||
impl Void { | ||||||||||
fn ex_falso<T>(!: Self) -> T {} | ||||||||||
} | ||||||||||
``` | ||||||||||
|
||||||||||
In terms of both semantics and exhaustiveness checking, a `!` pattern behaves like a binding. E.g. a `Some(!),` arm (absent match ergonomics) is semantically equivalent to `Some(x) => match x {}`. Indeed, they both indicate a load at a place of an uninhabited type, followed by an unreachable body. | ||||||||||
|
||||||||||
Never patterns interact with match ergonomics as expected: a `!` pattern is allowed on e.g. `&(T, !)`. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would have expected There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If we do go with the "safety invariant" version of inhabitedness then indeed this sentence isn't useful anymore. Will update when I get a moment. |
||||||||||
|
||||||||||
A never pattern may be linted as "unreachable" if there is no well-behaved execution that can reach it. For example, if it was guaranteed that reading the discriminant of `Result<T, !>` could never return `Err`, the `Err(!)` pattern would be linted as unreachable. Never patterns are not otherwise linted as unreachable even when removing them would be accepted. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we view this as necessary? Under the psuedo-definition of ! in this comment, a never pattern against 0 types is trivially okay. Some benefits of permitting an uneeded
A drawback could be that in this case, if we see There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is purely a convenient lint for users and helps learnability, exactly like today's "unreachable arm" lint. Macros can (and often do) silence lints to get more flexibility, so there'd be no win here. Similarly, the auto-never desugaring you have in mind would work the same; we wouldn't emit a lint there since that would make no sense. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That makes sense. My comment was mostly applying to a world where we had wildcard All that said, I think it's fine for this RFC to not cover that, and to need to wait for something equivalent to auto-never for this to be a "nice" story for macro writers. |
||||||||||
|
||||||||||
# Drawbacks | ||||||||||
[drawbacks]: #drawbacks | ||||||||||
|
||||||||||
The main drawback is that this new pattern is unlike other patterns (e.g. it doesn't require an arm body) which adds some maintenance burden. It's otherwise a simple feature. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How bad would it be to this RFC if the "feature" of skipping the arm body be removed 🤷 Save some major changes to the syntax that messes with macros and Just type-check that the arm body always has type match x {
// note: using `const { unreachable!() }` instead of only `unreachable!()`
// to tell the compile actually enforce that the arm is never invoked
Some(!) => const { unreachable!() },
None => "ok",
} TBH it is very strange that only the if let Some(!) = x {
// you still need a body here! otherwise you get "E0308 mismatched types"!
const { unreachable!() }
} else {
"ok"
} (I don't think the "empty block" proposal in #3719 (comment) works because an empty block does not indicate the block is actually unreachable. The There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The other syntax that has an equivalent is or-patterns: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If we don't want the "bodiless arm" syntax, I'd propose that we allow this: match x {
Some(!) => {}
None => "ok",
} this would consider the whole arm body to diverge for the purposes of type-checking, similar to how we typecheck enum Void {}
fn foo(!: Void) -> u32 {} // typechecks because the body diverges There's one case where the "bodiless arm" is particularly nice though: the empty match. Today There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The specialty of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
how so? if you're trying to have a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @programmerjake sure we can always adjust the parser to disambiguate, but I think introducing a whole new syntax of expression with special handling (like
Hmm it does not typecheck? #![feature(never_type)]
fn f(a: Option<&!>) -> &str {
match a {
Some(_) => { return ""; () },
// error[E0308]: mismatched types
// expected `&str`, found `()`
None => "ok",
}
}
I suppose that means the #![feature(never_patterns)]
#![feature(never_type)]
fn g(a: Option<&!>) -> &str {
if let Some(!) = a {
// error[E0308]: mismatched types
// expected `&str`, found `()`
} else {
"ok"
}
}
I don't think the need for never-patterns is common enough to justify changing the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ( There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
ok, since There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Fair. I was hoping we could allow it just for this special case but idk if that's worth it.
Yep There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My slight concern with a syntax like I haven't thought too much about this, but would something like And would it possible to reject alternative arms somehow at compile time? e.g. require |
||||||||||
|
||||||||||
|
||||||||||
# Rationale and alternatives | ||||||||||
[rationale-and-alternatives]: #rationale-and-alternatives | ||||||||||
|
||||||||||
Status quo alternative: we could do nothing, and consider that writing `Err(_) => unreachable!()` or `Err(x) => match x {}` is good enough. The drawback is more noticeable for or-patterns, e.g. `let (Ok(x) | Err(!)) = ...;` would have to become a `let .. else` or a full `match`. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the first time the RFC mentions that |
||||||||||
|
||||||||||
Alternatively, we could allow omitting all empty arms; this is what the unstable `exhaustive_patterns` feature does today. A match like | ||||||||||
```rust | ||||||||||
match ... { | ||||||||||
Ok(_) => ..., | ||||||||||
} | ||||||||||
``` | ||||||||||
would implicitly contain a `Err(!)` arm, i.e. would trigger UB if the discriminant is `Err`. This was deemed a footgun and was the original motivation for this feature. | ||||||||||
|
||||||||||
|
||||||||||
# Prior art | ||||||||||
[prior-art]: #prior-art | ||||||||||
|
||||||||||
This proposal is entirely based on @nikomatsakis and @RalfJung's original idea, presented in Niko's [blog post](https://smallcultfollowing.com/babysteps/blog/2018/08/13/never-patterns-exhaustive-matching-and-uninhabited-types-oh-my/). | ||||||||||
|
||||||||||
The only other language I (Nadrieril) am aware of that has the intersection of features that would make this necessary is Zig. They are in the process of clarifying the semantics of their empty types (https://github.com/ziglang/zig/issues/15909); they may or may not end up encountering the same explicitness problem as us. | ||||||||||
|
||||||||||
Ocaml and Adga have respectively [refutation cases](https://ocaml.org/manual/5.2/gadts-tutorial.html#s:gadt-refutation-cases) and [absurd patterns](https://agda.readthedocs.io/en/latest/language/function-definitions.html#absurd-patterns), both of which give a way to say "this pattern is impossible and I shouldn't have to write an arm for it". Their flavor of impossibility is related to types however, not runtime validity. | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't understand this remark. Ours is also all about types? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm yes this is badly phrased. I'm trying to express something like "in their case, the absurd pattern is purely a type-system hint, whereas in our case changing |
||||||||||
|
||||||||||
Something that's suprisingly semantically (and syntactically!) close is Haskell's [bang patterns](https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/strict.html#bang-patterns-informal). Haskell is a lazy language and a pattern only evaluates the underlying value if needed to decide which branch to take. A `!x` instead pattern forces the evaluation of the underlying value. Rust's patterns are a bit lazy too: they lazily assert validity of the data they access. In Haskell one must be careful to evaluate the right things in the right order else you may diverge; in unsafe Rust similar care is needed to avoid UB. | ||||||||||
|
||||||||||
Never patterns could then be described as playing the role of both a bang pattern ("please evaluate this") and an absurd pattern ("this is an impossible case") at the same time. | ||||||||||
|
||||||||||
From [the lazy patterns literature](https://dl.acm.org/doi/abs/10.1145/3408989) we can also take the distinction between a "redundant" pattern (can be removed without changing semantics), and an "inaccessible" pattern (will never execute but cannot be removed without changing semantics). E.g., in rust: | ||||||||||
```rust | ||||||||||
let ptr: *const ! = ...; | ||||||||||
match *ptr { | ||||||||||
// This causes a read of the value, hence UB if reached. Removing it would remove UB. Therefore this is inaccessible and not redundant. | ||||||||||
_x => {} | ||||||||||
_ => {} | ||||||||||
} | ||||||||||
``` | ||||||||||
Never patterns are our way to notate inaccessible patterns. | ||||||||||
|
||||||||||
|
||||||||||
# Unresolved questions | ||||||||||
[unresolved-questions]: #unresolved-questions | ||||||||||
|
||||||||||
- Does this cause problems with macros-by-example? | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes. It also causes problems for derive macros. As I write,
This isn't just a problem for macro_rules macros. Derive macros can suffer from it too. (Users of derive-deftly or other approaches for making derive macros more easily are quite likely to run into this.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are two separate issues here: rust-lang/rust#131452 is entirely independent from this RFC, since this RFC is purely about adding a new pattern syntax (and not about changing anything else in the language). The other issue with macros is that a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think it would be a breaking change for libraries like syn, in the sense that all procedural macro crates would need to update to the latest Given that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good point, they'd have to majorly change their There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is already causing problems like: fn h(a: Option<&!>) -> bool {
matches!(a, Some(!))
// error: a never pattern is always unreachable
} |
||||||||||
- Should the MIR-lowering of a never pattern actually include a place read or is a branch to `Unreachable` good enough? | ||||||||||
- Out of scope: opsem and exhaustiveness decisions. | ||||||||||
|
||||||||||
|
||||||||||
# Future possibilities | ||||||||||
[future-possibilities]: #future-possibilities | ||||||||||
|
||||||||||
This is a pretty self-contained feature. | ||||||||||
|
||||||||||
This RFC is part of the [Patterns of Empty Types](https://rust-lang.github.io/rust-project-goals/2024h2/Patterns-of-empty-types.html) project goal. In parallel with this, I plan to propose that we allow omitting empty arms behind references (e.g. `let Ok(_) = expr;` should be allowed for `expr: Result<T, &!>`). That way we'd get: for safe places (places that can be accessed in safe code), you can omit an empty arm; for unsafe places, you must write an arm and never patterns make this convenient. | ||||||||||
|
||||||||||
This feature is closely related to the [`never_type`](https://github.com/rust-lang/rust/issues/35121) initiative, which aims to provide `!` as the default empty type in the language. These initiatives are independent and complement each other. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust's unsafe semantics are actually mostly not access-based. Constructing an invalid value is UB.
Only behind raw pointers and inside union fields do we use access-based semantics. So the text here is quite misleading as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Niko's blog post from year 2018, linked in the RFC text, says that
...mentioning about
&!
being possibly a valid value. Just to clarify, I presume this is now stale information and having a type&!
is considered UB? Also, the same article presents following code:Is that accepted by modern standards? Seems a bit sketchy that the union field is already accessed in the match expression, so the fields are not matched against inside the match. By extension, one should be able to do the following, which, by modern standards, is surely UB, right? (To be sure, the RFC presents code that matches against the union fields inside the match.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A match expression works on places, not values. So the two examples you show are not equivalent! The first never turns
x.value
into a value, it only ever accesses its first field.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But values don't exist in memory, they only exist while reading or writing, right? And matches don't access compound values wholesale, they only inspect memory piecewise. In particular I can mess up some bytes of a place and still match on it if the match only inspects the non-messed-up bytes.
Is that not "access-based"? Am I incorrect in the above understanding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In an access-based semantics, it is generally expected that e.g.
is fine if we never "access"
b
. So I think it is misleading to call Rust's semantics access-based.But what you say about places is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Is there a term of art for the particular flavor of "validity only matters when reading/writing" that rust has?
Otherwise I don't can do without this general statement. Would this be ok: "The semantics of pattern-matching around unsafe values are access-based: patterns only access the data (discriminants/values) they need to choose the right arm. The accessed data must be valid, but the rest may be invalid or uninitialized"?