Skip to content

Commit 717d0ca

Browse files
committed
Explain the important concepts of exhaustiveness checking
1 parent 904bb5a commit 717d0ca

File tree

1 file changed

+133
-5
lines changed

1 file changed

+133
-5
lines changed

Diff for: src/pat-exhaustive-checking.md

+133-5
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ are exhaustive.
77
## Pattern usefulness
88

99
The central question that usefulness checking answers is:
10-
"in this match expression, is that branch reachable?".
10+
"in this match expression, is that branch redundant?".
1111
More precisely, it boils down to computing whether,
1212
given a list of patterns we have already seen,
1313
a given new pattern might match any new value.
@@ -42,10 +42,8 @@ because a match expression can return a value).
4242

4343
## Where it happens
4444

45-
This check is done to any expression that desugars to a match expression in MIR.
46-
That includes actual `match` expressions,
47-
but also anything that looks like pattern matching,
48-
including `if let`, destructuring `let`, and similar expressions.
45+
This check is done anywhere you can write a pattern: `match` expressions, `if let`, `let else`,
46+
plain `let`, and function arguments.
4947

5048
```rust
5149
// `match`
@@ -84,5 +82,135 @@ Exhaustiveness checking is implemented in [`check_match`].
8482
The core of the algorithm is in [`usefulness`].
8583
That file contains a detailed description of the algorithm.
8684

85+
## Important concepts
86+
87+
### Constructors and fields
88+
89+
In the value `Pair(Some(0), true)`, `Pair` is called the constructor of the value, and `Some(0)` and
90+
`true` are its fields. Every matcheable value can be decomposed in this way. Examples of
91+
constructors are: `Some`, `None`, `(,)` (the 2-tuple constructor), `Foo {..}` (the constructor for
92+
a struct `Foo`), and `2` (the constructor for the number `2`).
93+
94+
Each constructor takes a fixed number of fields; this is called its arity. `Pair` and `(,)` have
95+
arity 2, `Some` has arity 1, `None` and `42` have arity 0. Each type has a known set of
96+
constructors. Some types have many constructors (like `u64`) or even an infinitely many (like `&str`
97+
and `&[T]`).
98+
99+
Patterns are similar: `Pair(Some(_), _)` has constructor `Pair` and two fields. The difference is
100+
that we get some extra pattern-only constructors, namely: the wildcard `_`, variable bindings,
101+
integer ranges like `0..=10`, and variable-length slices like `[_, .., _]`. We treat or-patterns
102+
separately.
103+
104+
Now to check if a value `v` matches a pattern `p`, we check if `v`'s constructor matches `p`'s
105+
constructor, then recursively compare their fields if necessary. A few representative examples:
106+
107+
- `matches!(v, _) := true`
108+
- `matches!((v0, v1), (p0, p1)) := matches!(v0, p0) && matches!(v1, p1)`
109+
- `matches!(Foo { a: v0, b: v1 }, Foo { a: p0, b: p1 }) := matches!(v0, p0) && matches!(v1, p1)`
110+
- `matches!(Ok(v0), Ok(p0)) := matches!(v0, p0)`
111+
- `matches!(Ok(v0), Err(p0)) := false` (incompatible variants)
112+
- `matches!(v, 1..=100) := matches!(v, 1) || ... || matches!(v, 100)`
113+
- `matches!([v0], [p0, .., p1]) := false` (incompatible lengths)
114+
- `matches!([v0, v1, v2], [p0, .., p1]) := matches!(v0, p0) && matches!(v2, p1)`
115+
116+
This concept is absolutely central to pattern analysis. The [`deconstruct_pat`] module provides
117+
functions to extract, list and manipulate constructors. This is a useful enough concept that
118+
variations of it can be found in other places of the compiler, like in the MIR-lowering of a match
119+
expression and in some clippy lints.
120+
121+
### Constructor grouping and splitting
122+
123+
The pattern-only constructors (`_`, ranges and variable-length slices) each stand for a set of
124+
normal constructors, e.g. `_: Option<T>` stands for the set {`None`, `Some`} and `[_, .., _]` stands
125+
for the infinite set {`[,]`, `[,,]`, `[,,,]`, ...} of the slice constructors of arity >= 2.
126+
127+
In order to manage these constructors, we keep them as grouped as possible. For example:
128+
129+
```rust
130+
match (0, false) {
131+
(0 ..=100, true) => {}
132+
(50..=150, false) => {}
133+
(0 ..=200, _) => {}
134+
}
135+
```
136+
137+
In this example, all of `0`, `1`, .., `49` match the same arms, and thus can be treated as a group.
138+
In fact, in this match, the only ranges we need to consider are: `0..50`, `50..=100`,
139+
`101..=150`,`151..=200` and `201..`. Similarly:
140+
141+
```rust
142+
enum Direction { North, South, East, West }
143+
# let wind = (Direction::North, 0u8);
144+
match wind {
145+
(Direction::North, 50..) => {}
146+
(_, _) => {}
147+
}
148+
```
149+
150+
Here we can treat all the non-`North` constructors as a group, giving us only two cases to handle:
151+
`North`, and everything else.
152+
153+
This is called "constructor splitting" and is crucial to having exhaustiveness run in reasonable
154+
time.
155+
156+
### Usefulness vs reachability in the presence of empty types
157+
158+
This is likely the subtlest aspect of exhaustiveness. To be fully precise, a match doesn't operate
159+
on a value, it operates on a place. In certain unsafe circumstances, it is possible for a place to
160+
not contain valid data for its type. This has subtle consequences for empty types. Take the
161+
following:
162+
163+
```rust
164+
enum Void {}
165+
let x: u8 = 0;
166+
let ptr: *const Void = &x as *const u8 as *const Void;
167+
unsafe {
168+
match *ptr {
169+
_ => println!("Reachable!"),
170+
}
171+
}
172+
```
173+
174+
In this example, `ptr` is a valid pointer pointing to a place with invalid data. The `_` pattern
175+
does not look at the contents of the place `*ptr`, so this code is ok and the arm is taken. In other
176+
words, despite the place we are inspecting being of type `Void`, there is a reachable arm. If the
177+
arm had a binding however:
178+
179+
```rust
180+
# #[derive(Copy, Clone)]
181+
# enum Void {}
182+
# let x: u8 = 0;
183+
# let ptr: *const Void = &x as *const u8 as *const Void;
184+
# unsafe {
185+
match *ptr {
186+
_a => println!("Unreachable!"),
187+
}
188+
# }
189+
```
190+
191+
Here the binding loads the value of type `Void` from the `*ptr` place. In this example, this causes
192+
UB since the data is not valid. In the general case, this asserts validity of the data at `*ptr`.
193+
Either way, this arm will never be taken.
194+
195+
Finally, let's consider the empty match `match *ptr {}`. If we consider this exhaustive, then
196+
having invalid data at `*ptr` is invalid. In other words, the empty match is semantically
197+
equivalent to the `_a => ...` match. In the interest of explicitness, we prefer the case with an
198+
arm, hence we won't tell the user to remove the `_a` arm. In other words, the `_a` arm is
199+
unreachable yet not redundant. This is why we lint on redundant arms rather than unreachable
200+
arms, despite the fact that the lint says "unreachable".
201+
202+
These considerations only affects certain places, namely those that can contain non-valid data
203+
without UB. These are: pointer dereferences, reference dereferences, and union field accesses. We
204+
track during exhaustiveness checking whether a given place is known to contain valid data.
205+
206+
Having said all that, the current implementation of exhaustiveness checking does not follow the
207+
above considerations. On stable, empty types are for the most part treated as non-empty. The
208+
[`exhaustive_patterns`] feature errs on the other end: it allows omitting arms that could be
209+
reachable in unsafe situations. The [`never_patterns`] experimental feature aims to fix this and
210+
permit the correct behavior of empty types in patterns.
211+
87212
[`check_match`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/thir/pattern/check_match/index.html
88213
[`usefulness`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/thir/pattern/usefulness/index.html
214+
[`deconstruct_pat`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/thir/pattern/deconstruct_pat/index.html
215+
[`never_patterns`]: https://github.com/rust-lang/rust/issues/118155
216+
[`exhaustive_patterns`]: https://github.com/rust-lang/rust/issues/51085

0 commit comments

Comments
 (0)