Commit efb04e6

Rework the explanation of relevancy
1 parent 71e8334 commit efb04e6

1 file changed: +138 -48 lines changed

compiler/rustc_pattern_analysis/src/usefulness.rs

@@ -300,71 +300,163 @@
 //!
 //!
 //!
-//! # `Missing` and relevant constructors
+//! # `Missing` and relevancy
+//!
+//! ## Relevant values
 //!
 //! Take the following example:
 //!
 //! ```compile_fail,E0004
+//! # let foo = (true, true);
+//! match foo {
+//!     (true, _) => 1,
+//!     (_, true) => 2,
+//! };
+//! ```
+//!
+//! Consider the value `(true, true)`:
+//! - Row 2 does not distinguish `(true, true)` and `(false, true)`;
+//! - `false` does not show up in the first column of the match, so without knowing anything else we
+//!   can deduce that `(false, true)` matches the same or fewer rows than `(true, true)`.
+//!
+//! Using those two facts together, we deduce that `(true, true)` will not give us more usefulness
+//! information about row 2 than `(false, true)` would. We say that "`(true, true)` is made
+//! irrelevant for row 2 by `(false, true)`". We will use this idea to prune the search tree.
+//!
+//!
+//! ## Computing relevancy
+//!
+//! We now generalize from the above example to approximate relevancy in a simple way. Note that we
+//! will only compute an approximation: we can sometimes determine when a case is irrelevant, but
+//! computing this precisely is at least as hard as computing usefulness.
+//!
+//! Our computation of relevancy relies on the `Missing` constructor. As explained in
+//! [`crate::constructor`], `Missing` represents the constructors not present in a given column. For
+//! example in the following:
+//!
+//! ```compile_fail,E0004
 //! enum Direction { North, South, East, West }
 //! # let wind = (Direction::North, 0u8);
 //! match wind {
-//!     (Direction::North, _) => {} // arm 1
-//!     (_, 50..) => {} // arm 2
-//! }
+//!     (Direction::North, _) => 1,
+//!     (_, 50..) => 2,
+//! };
 //! ```
 //!
-//! Remember that we represent the "everything else" cases with [`Constructor::Missing`]. When we
-//! specialize with `Missing` in the first column, we have one arm left:
+//! Here `South`, `East` and `West` are missing in the first column, and `0..50` is missing in the
+//! second. Both of these sets are represented by `Constructor::Missing` in their corresponding
+//! column.
 //!
-//! ```ignore(partial code)
-//!     (50..) => {} // arm 2
-//! ```
+//! We then compute relevancy as follows: during the course of the algorithm, for a row `r`:
+//! - if `r` has a wildcard in the first column;
+//! - and some constructors are missing in that column;
+//! - then any `c != Missing` is considered irrelevant for row `r`.
 //!
-//! We then conclude that arm 2 is useful, and that the match is non-exhaustive with witness
-//! `(Missing, 0..50)` (which we would display to the user as `(_, 0..50)`).
+//! By this we mean that continuing the algorithm by specializing with `c` is guaranteed not to
+//! contribute more information about the usefulness of row `r` than what we would get by
+//! specializing with `Missing`. The argument is the same as in the previous subsection.
 //!
-//! When we then specialize with `North`, we have two arms left:
+//! Once we've specialized by a constructor `c` that is irrelevant for row `r`, we're guaranteed to
+//! only explore values irrelevant for `r`. If we then ever reach a point where we're only exploring
+//! values that are irrelevant to all of the rows (including the virtual wildcard row used for
+//! exhaustiveness), we skip that case entirely.
 //!
-//! ```ignore(partial code)
-//!     (_) => {} // arm 1
-//!     (50..) => {} // arm 2
+//!
+//! ## Example
+//!
+//! Let's go through a variation on the first example:
+//!
+//! ```compile_fail,E0004
+//! # let foo = (true, true, true);
+//! match foo {
+//!     (true, _, true) => 1,
+//!     (_, true, _) => 2,
+//! };
 //! ```
 //!
-//! Because `Missing` only matches wildcard rows, specializing with `Missing` is guaranteed to
-//! result in a subset of the rows obtained from specializing with anything else. This means that
-//! any row with a wildcard found useful when specializing with anything else would also be found
-//! useful in the `Missing` case. In our example, after specializing with `North` here we will not
-//! gain new information regarding the usefulness of arm 2 or of the fake wildcard row used for
-//! exhaustiveness. This allows us to skip cases.
+//! ```text
+//! ┐ Patterns:
+//! │ 1. `[(true, _, true)]`
+//! │ 2. `[(_, true, _)]`
+//! │ 3. `[_]` // virtual extra wildcard row
+//! │
+//! │ Specialize with `(,,)`:
+//! ├─┐ Patterns:
+//! │ │ 1. `[true, _, true]`
+//! │ │ 2. `[_, true, _]`
+//! │ │ 3. `[_, _, _]`
+//! │ │
+//! │ │ There are missing constructors in the first column (namely `false`), hence
+//! │ │ `true` is irrelevant for rows 2 and 3.
+//! │ │
+//! │ │ Specialize with `true`:
+//! │ ├─┐ Patterns:
+//! │ │ │ 1. `[_, true]`
+//! │ │ │ 2. `[true, _]` // now exploring irrelevant cases
+//! │ │ │ 3. `[_, _]` // now exploring irrelevant cases
+//! │ │ │
+//! │ │ │ There are missing constructors in the first column (namely `false`), hence
+//! │ │ │ `true` is irrelevant for rows 1 and 3.
+//! │ │ │
+//! │ │ │ Specialize with `true`:
+//! │ │ ├─┐ Patterns:
+//! │ │ │ │ 1. `[true]` // now exploring irrelevant cases
+//! │ │ │ │ 2. `[_]` // now exploring irrelevant cases
+//! │ │ │ │ 3. `[_]` // now exploring irrelevant cases
+//! │ │ │ │
+//! │ │ │ │ The current case is irrelevant for all rows: we backtrack immediately.
+//! │ │ ├─┘
+//! │ │ │
+//! │ │ │ Specialize with `false`:
+//! │ │ ├─┐ Patterns:
+//! │ │ │ │ 1. `[true]`
+//! │ │ │ │ 3. `[_]` // now exploring irrelevant cases
+//! │ │ │ │
+//! │ │ │ │ Specialize with `true`:
+//! │ │ │ ├─┐ Patterns:
+//! │ │ │ │ │ 1. `[]`
+//! │ │ │ │ │ 3. `[]` // now exploring irrelevant cases
+//! │ │ │ │ │
+//! │ │ │ │ │ Row 1 is therefore useful.
+//! │ │ │ ├─┘
+//! <etc...>
+//! ```
+//!
+//! Relevancy allowed us to skip the case `(true, true, _)` entirely. In some cases this pruning can
+//! give drastic speedups. The case this was built for is the following (#118437):
 //!
-//! When specializing, if there is a `Missing` case we call the other constructors "irrelevant".
-//! When there is no `Missing` case there are no irrelevant constructors.
+//! ```ignore(illustrative)
+//! match foo {
+//!     (true, _, _, _, ..) => 1,
+//!     (_, true, _, _, ..) => 2,
+//!     (_, _, true, _, ..) => 3,
+//!     (_, _, _, true, ..) => 4,
+//!     ...
+//! }
+//! ```
 //!
-//! What happens then is: when we specialize a wildcard with an irrelevant constructor, we know we
-//! won't get new info for this row; we consider that row "irrelevant". Whenever all the rows are
-//! found irrelevant, we can safely skip the case entirely.
+//! Without considering relevancy, we would explore all 2^n combinations of the `true` and `Missing`
+//! constructors. Relevancy tells us that e.g. `(true, true, false, false, false, ...)` is
+//! irrelevant for all the rows. This allows us to skip all cases with more than one `true`
+//! constructor, changing the runtime from exponential to linear.
 //!
-//! In the example above, we will entirely skip the `(North, 50..)` case. This skipping was
-//! developed as a solution to #118437. It doesn't look like much but it can save us from
-//! exponential blowup.
 //!
-//! There's a subtlety regarding exhaustiveness: while this shortcutting doesn't affect correctness,
-//! it can affect which witnesses are reported. For example, in the following:
+//! ## Relevancy and exhaustiveness
 //!
-//! ```compile_fail,E0004
-//! # let foo = (true, true, true);
+//! For exhaustiveness, we do something slightly different w.r.t. relevancy: we do not report
+//! witnesses of non-exhaustiveness that are irrelevant for the virtual wildcard row. For example,
+//! in:
+//!
+//! ```ignore(illustrative)
 //! match foo {
-//!     (true, _, true) => {}
-//!     (_, true, _) => {}
+//!     (true, true) => {}
 //! }
 //! ```
 //!
-//! In this example we will skip the `(true, true, _)` case entirely. Thus `(true, true, false)`
-//! will not be reported as missing. In fact we go further than this: we deliberately do not report
-//! any cases that are irrelevant for the fake wildcard row. For example, in `match ... { (true,
-//! true) => {} }` we will not report `(true, false)` as missing. This was a deliberate choice made
-//! early in the development of Rust; it so happens that it is beneficial for performance reasons
-//! too.
+//! we only report `(false, _)` as missing. This was a deliberate choice made early in the
+//! development of Rust, for diagnostic and performance purposes. As shown in the previous section,
+//! ignoring irrelevant cases preserves usefulness, so this choice still correctly computes whether
+//! a match is exhaustive.
 //!
 //!
 //!
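Note on the hunk above: the relevancy rule in the new doc comment can be tried out on a toy model. The sketch below is not rustc's implementation. It handles only tuples of `bool`s whose patterns are `true`, `false` or `_`, and the names `Row`, `Ctor`, `count_cases` and `diagonal_match` are hypothetical, introduced only for illustration. It counts how many fully specialized cases are reached for the diagonal match from #118437, with and without skipping cases that are irrelevant for every row.

```rust
/// A pattern in one column: `Some(b)` is the literal `b`, `None` is a wildcard `_`.
type Pat = Option<bool>;

/// One row of the toy matrix together with its relevancy flag (hypothetical type).
struct Row {
    pats: Vec<Pat>,
    relevant: bool,
}

/// The constructor used to specialize the first column.
#[derive(Clone, Copy)]
enum Ctor {
    Bool(bool),
    Missing,
}

/// Counts the fully specialized cases reached. With `prune`, a case that is
/// irrelevant for every row (virtual wildcard row included) is skipped.
fn count_cases(rows: &[Row], prune: bool) -> usize {
    if prune && rows.iter().all(|r| !r.relevant) {
        return 0;
    }
    // The virtual wildcard row matches everything, so `rows` is never empty.
    if rows[0].pats.is_empty() {
        return 1;
    }
    // Constructors present in the first column, plus `Missing` if any is absent.
    let mut ctors: Vec<Ctor> = [true, false]
        .iter()
        .copied()
        .filter(|&b| rows.iter().any(|r| r.pats[0] == Some(b)))
        .map(Ctor::Bool)
        .collect();
    let any_missing = ctors.len() < 2;
    if any_missing {
        ctors.push(Ctor::Missing);
    }
    let mut total = 0;
    for ctor in ctors {
        let mut specialized = Vec::new();
        for row in rows {
            let head = row.pats[0];
            let kept = match (head, ctor) {
                (None, _) => true,
                (Some(a), Ctor::Bool(b)) => a == b,
                (Some(_), Ctor::Missing) => false,
            };
            if !kept {
                continue;
            }
            // The rule from the doc comment: wildcard head + missing constructors
            // means any constructor other than `Missing` is irrelevant for this row.
            let irrelevant = head.is_none() && any_missing && matches!(ctor, Ctor::Bool(_));
            specialized.push(Row {
                pats: row.pats[1..].to_vec(),
                relevant: row.relevant && !irrelevant,
            });
        }
        total += count_cases(&specialized, prune);
    }
    total
}

/// Builds the `n`-column match with `true` on the diagonal, followed by the
/// virtual wildcard row.
fn diagonal_match(n: usize) -> Vec<Row> {
    let mut rows: Vec<Row> = (0..n)
        .map(|i| {
            let mut pats = vec![None; n];
            pats[i] = Some(true);
            Row { pats, relevant: true }
        })
        .collect();
    rows.push(Row { pats: vec![None; n], relevant: true });
    rows
}

fn main() {
    for n in [4, 8, 12] {
        let rows = diagonal_match(n);
        println!(
            "n = {}: {} cases without pruning, {} with pruning",
            n,
            count_cases(&rows, false),
            count_cases(&rows, true)
        );
    }
}
```

In this toy model the diagonal match with `n` columns reaches 2^n cases without pruning and `n + 1` with it (16 versus 5 for `n = 4`), which is the exponential-to-linear change the doc comment describes.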
@@ -738,8 +830,8 @@ struct PatStack<'a, 'p, Cx: TypeCx> {
     // Rows of len 1 are very common, which is why `SmallVec[_; 2]` works well.
     pats: SmallVec<[&'a DeconstructedPat<'p, Cx>; 2]>,
     /// Sometimes we know that as far as this row is concerned, the current case is already handled
-    /// by a different, more general, case. When all rows are irrelevant this allows us to skip many
-    /// branches. This is purely an optimization. See at the top for details.
+    /// by a different, more general, case. When the case is irrelevant for all rows this allows us
+    /// to skip a case entirely. This is purely an optimization. See at the top for details.
     relevant: bool,
 }

@@ -1251,10 +1343,8 @@ fn compute_exhaustiveness_and_usefulness<'a, 'p, Cx: TypeCx>(

     if !matrix.wildcard_row.relevant && matrix.rows().all(|r| !r.pats.relevant) {
         // Here we know that nothing will contribute further to exhaustiveness or usefulness. This
-        // is purely an optimization: skipping this check doesn't affect correctness. This check
-        // does change runtime behavior from exponential to quadratic on some matches found in the
-        // wild, so it's pretty important. It also affects which missing patterns will be reported.
-        // See the top of the file for details.
+        // is purely an optimization: skipping this check doesn't affect correctness. See the top of
+        // the file for details.
         return WitnessMatrix::empty();
     }

@@ -1275,7 +1365,7 @@ fn compute_exhaustiveness_and_usefulness<'a, 'p, Cx: TypeCx>(
         return if matrix.wildcard_row.relevant {
             WitnessMatrix::unit_witness()
         } else {
-            // We choose to omit the witness without affecting correctness, so we do.
+            // We choose to not report anything here; see at the top for details.
             WitnessMatrix::empty()
         };
     };
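
Note on the last hunk: the decision not to report a witness when the virtual wildcard row is irrelevant can also be illustrated on a toy model. This is a sketch under simplifying assumptions, not the compiler's code. It covers only tuples of `bool`s, tracks relevancy for the wildcard row alone, and the `witnesses` function with its `skip_irrelevant` flag is hypothetical.

```rust
/// `Some(b)` is the literal pattern `b`; `None` is a wildcard `_`.
type Pat = Option<bool>;

/// Returns the missing cases of a toy match on tuples of `bool`s, one
/// `Vec<String>` per witness. `wild_relevant` tracks whether the current case is
/// still relevant for the virtual wildcard row. With `skip_irrelevant` set to
/// `false`, relevancy is ignored and every uncovered case is reported.
fn witnesses(
    rows: &[Vec<Pat>],
    width: usize,
    wild_relevant: bool,
    skip_irrelevant: bool,
) -> Vec<Vec<String>> {
    if width == 0 {
        // No columns left: the case is uncovered if no real row remains, and it
        // is reported only if it is relevant for the wildcard row.
        return if rows.is_empty() && wild_relevant { vec![Vec::new()] } else { Vec::new() };
    }
    let present: Vec<bool> = [true, false]
        .iter()
        .copied()
        .filter(|&b| rows.iter().any(|r| r[0] == Some(b)))
        .collect();
    let any_missing = present.len() < 2;
    let mut out = Vec::new();
    for &b in &present {
        let sub: Vec<Vec<Pat>> = rows
            .iter()
            .filter(|r| r[0].map_or(true, |a| a == b))
            .map(|r| r[1..].to_vec())
            .collect();
        // A present constructor is irrelevant for the wildcard row whenever
        // some constructor is missing in this column.
        let relevant = wild_relevant && !(skip_irrelevant && any_missing);
        for mut w in witnesses(&sub, width - 1, relevant, skip_irrelevant) {
            w.insert(0, b.to_string());
            out.push(w);
        }
    }
    if any_missing {
        let sub: Vec<Vec<Pat>> =
            rows.iter().filter(|r| r[0].is_none()).map(|r| r[1..].to_vec()).collect();
        // Display `Missing` as the single absent value, or `_` if nothing is present.
        let label = match present.as_slice() {
            [] => "_",
            [true] => "false",
            _ => "true",
        };
        for mut w in witnesses(&sub, width - 1, wild_relevant, skip_irrelevant) {
            w.insert(0, label.to_string());
            out.push(w);
        }
    }
    out
}

fn main() {
    // match foo { (true, true) => {} }
    let rows = vec![vec![Some(true), Some(true)]];
    println!("with relevancy:    {:?}", witnesses(&rows, 2, true, true));
    println!("without relevancy: {:?}", witnesses(&rows, 2, true, false));
}
```

For the `(true, true)` match from the doc comment, this toy model reports only `(false, _)` when relevancy is respected, and additionally `(true, false)` when it is ignored.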
