|
8 | 8 | // option. This file may not be copied, modified, or distributed
|
9 | 9 | // except according to those terms.
|
10 | 10 |
|
| 11 | +/// This file includes the logic for exhaustiveness and usefulness checking for |
| 12 | +/// pattern-matching. Specifically, given a list of patterns for a type, we can |
| 13 | +/// tell whether: |
| 14 | +/// (a) the patterns cover every possible constructor for the type [exhaustiveness] |
| 15 | +/// (b) each pattern is necessary [usefulness] |
| 16 | +/// |
| 17 | +/// The algorithm implemented here is a modified version of the one described in: |
| 18 | +/// http://moscova.inria.fr/~maranget/papers/warn/index.html |
| 19 | +/// However, to save future implementors from reading the original paper, I'm going |
| 20 | +/// to summarise the algorithm here, hopefully a little more clearly (if less |
| 21 | +/// rigorously). |
| 22 | +/// |
| 23 | +/// The core of the algorithm revolves around a "usefulness" check. In particular, we |
| 24 | +/// are trying to compute a predicate `U(P, p_{m + 1})` where `P` is a list of patterns |
| 25 | +/// of length `m` for a compound (product) type with `n` components (we refer to this as |
| 26 | +/// a matrix). `U(P, p_{m + 1})` represents whether, given an existing list of patterns |
| 27 | +/// `p_1 ..= p_m`, adding a new pattern will be "useful" (that is, cover previously- |
| 28 | +/// uncovered values of the type). |
| 29 | +/// |
| 30 | +/// If we have this predicate, then we can easily compute both exhaustiveness of an |
| 31 | +/// entire set of patterns and the individual usefulness of each one. |
| 32 | +/// (a) the set of patterns is exhaustive iff `U(P, _)` is false (i.e. adding a wildcard |
| 33 | +/// match doesn't increase the number of values we're matching) |
| 34 | +/// (b) a pattern `p_i` is not useful if `U(P[0..=(i-1)], p_i)` is false (i.e. adding a |
| 35 | +/// pattern to those that have come before it doesn't increase the number of values |
| 36 | +/// we're matching). |
| 37 | +/// |
| 38 | +/// For example, say we have the following: |
| 39 | +/// ``` |
| 40 | +/// // x: (Option<bool>, Result<(), ()>) |
| 41 | +/// match x { |
| 42 | +/// (Some(true), _) => {} |
| 43 | +/// (None, Err(())) => {} |
| 44 | +/// (None, Err(_)) => {} |
| 45 | +/// } |
| 46 | +/// ``` |
| 47 | +/// Here, the matrix `P` is 3 x 2 (rows x columns). |
| 48 | +/// [ |
| 49 | +/// [Some(true), _], |
| 50 | +/// [None, Err(())], |
| 51 | +/// [None, Err(_)], |
| 52 | +/// ] |
| 53 | +/// We can tell it's not exhaustive, because `U(P, _)` is true (we're not covering |
| 54 | +/// `[Some(false), _]`, for instance). In addition, row 3 is not useful, because |
| 55 | +/// all the values it covers are already covered by row 2. |
| 56 | +/// |
| 57 | +/// To compute `U`, we must have two other concepts. |
| 58 | +/// 1. `S(c, P)` is a "specialised matrix", where `c` is a constructor (like `Some` or |
| 59 | +/// `None`). You can think of it as filtering `P` to just the rows whose *first* pattern |
| 60 | +/// can cover `c` (and expanding OR-patterns into distinct patterns), and then expanding |
| 61 | +/// the constructor into all of its components. |
| 62 | +/// |
| 63 | +/// It is computed as follows. For each row `p_i` of P, we have four cases: |
| 64 | +/// 1.1. `p_(i,1) = c(r_1, .., r_a)`. Then `S(c, P)` has a corresponding row: |
| 65 | +/// r_1, .., r_a, p_(i,2), .., p_(i,n) |
| 66 | +/// 1.2. `p_(i,1) = c'(r_1, .., r_a')` where `c ≠ c'`. Then `S(c, P)` has no |
| 67 | +/// corresponding row. |
| 68 | +/// 1.3. `p_(i,1) = _`. Then `S(c, P)` has a corresponding row: |
| 69 | +/// _, .., _, p_(i,2), .., p_(i,n) |
| 70 | +/// 1.4. `p_(i,1) = r_1 | r_2`. Then `S(c, P)` has corresponding rows inlined from: |
| 71 | +/// S(c, (r_1, p_(i,2), .., p_(i,n))) |
| 72 | +/// S(c, (r_2, p_(i,2), .., p_(i,n))) |
| 73 | +/// |
| 74 | +/// 2. `D(P)` is a "default matrix". This is used when we know there are missing |
| 75 | +/// constructor cases, but there might be existing wildcard patterns, so to check the |
| 76 | +/// usefulness of the matrix, we have to check all its *other* components. |
| 77 | +/// |
| 78 | +/// It is computed as follows. For each row `p_i` of P, we have three cases: |
| 79 | +/// 2.1. `p_(i,1) = c(r_1, .., r_a)`. Then `D(P)` has no corresponding row. |
| 80 | +/// 2.2. `p_(i,1) = _`. Then `D(P)` has a corresponding row: |
| 81 | +/// p_(i,2), .., p_(i,n) |
| 82 | +/// 2.3. `p_(i,1) = r_1 | r_2`. Then `D(P)` has corresponding rows inlined from: |
| 83 | +/// D((r_1, p_(i,2), .., p_(i,n))) |
| 84 | +/// D((r_2, p_(i,2), .., p_(i,n))) |
| 85 | +/// |
| 86 | +/// The algorithm for computing `U` |
| 87 | +/// ------------------------------- |
| 88 | +/// The algorithm is inductive (on the number of columns: i.e. components of tuple patterns). |
| 89 | +/// That means we're going to check the components from left-to-right, so the algorithm |
| 90 | +/// operates principally on the first component of the matrix and new pattern `p_{m + 1}`. |
| 91 | +/// |
| 92 | +/// Base case. (`n = 0`, i.e. an empty tuple pattern) |
| 93 | +/// - If `P` already contains an empty pattern (i.e. if the number of patterns `m > 0`), |
| 94 | +/// then `U(P, p_{m + 1})` is false. |
| 95 | +/// - Otherwise, `P` must be empty, so `U(P, p_{m + 1})` is true. |
| 96 | +/// |
| 97 | +/// Inductive step. (`n > 0`, i.e. 1 or more tuple pattern components) |
| 98 | +/// We're going to match on the new pattern, `p_{m + 1}`. |
| 99 | +/// - If `p_{m + 1} == c(r_1, .., r_a)`, then we have a constructor pattern. |
| 100 | +/// Thus, the usefulness of `p_{m + 1}` can be reduced to whether it is useful when |
| 101 | +/// we ignore all the patterns in `P` that involve other constructors. This is where |
| 102 | +/// `S(c, P)` comes in: |
| 103 | +/// `U(P, p_{m + 1}) := U(S(c, P), S(c, p_{m + 1}))` |
| 104 | +/// - If `p_{m + 1} == _`, then we have two more cases: |
| 105 | +/// + All the constructors of the first component of the type exist within |
| 106 | +/// all the rows (after having expanded OR-patterns). In this case: |
| 107 | +/// `U(P, p_{m + 1}) := ∨(k ϵ constructors) U(S(k, P), S(k, p_{m + 1}))` |
| 108 | +/// I.e. when every constructor is already present, the pattern `p_{m + 1}` is |
| 109 | +/// useful only if its later components are useful for at least one of the constructors |
| 110 | +/// covered by `p_{m + 1}` (usually a single constructor, but all of them in the case of `_`). |
| 111 | +/// + Some constructors are not present in the existing rows (after having expanded |
| 112 | +/// OR-patterns). However, there might be wildcard patterns (`_`) present. Thus, we |
| 113 | +/// are only really concerned with the other patterns leading with wildcards. This is |
| 114 | +/// where `D` comes in: |
| 115 | +/// `U(P, p_{m + 1}) := U(D(P), (p_({m + 1},2), .., p_({m + 1},n)))` |
| 116 | +/// - If `p_{m + 1} == r_1 | r_2`, then the usefulness depends on each separately: |
| 117 | +/// `U(P, p_{m + 1}) := U(P, (r_1, p_({m + 1},2), .., p_({m + 1},n))) |
| 118 | +/// || U(P, (r_2, p_({m + 1},2), .., p_({m + 1},n)))` |
| 119 | +/// |
| 120 | +/// Modifications to the algorithm |
| 121 | +/// ------------------------------ |
| 122 | +/// The algorithm in the paper doesn't cover some of the special cases that arise in Rust, for |
| 123 | +/// example uninhabited types and variable-length slice patterns. These are pointed out |
| 124 | +/// throughout the code below. |
| 125 | +
|
11 | 126 | use self::Constructor::*;
|
12 | 127 | use self::Usefulness::*;
|
13 | 128 | use self::WitnessPreference::*;
|
|