Commit 2ed15c2

authoredMar 13, 2018
Merge pull request #85 from mark-i-m/typeck
Add the contents of the typeck READMEs
2 parents ed04741 + e745674 commit 2ed15c2

File tree

7 files changed

+467
-1
lines changed


‎src/SUMMARY.md

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -35,6 +35,8 @@
3535
- [The SLG solver](./traits-slg.md)
3636
- [Bibliography](./traits-bibliography.md)
3737
- [Type checking](./type-checking.md)
38+
- [Method Lookup](./method-lookup.md)
39+
- [Variance](./variance.md)
3840
- [The MIR (Mid-level IR)](./mir.md)
3941
- [MIR construction](./mir-construction.md)
4042
- [MIR visitor and traversal](./mir-visitor.md)

‎src/appendix-background.md

Lines changed: 3 additions & 0 deletions
@@ -91,6 +91,9 @@ cycle.
9191
Check out the subtyping chapter from the
9292
[Rust Nomicon](https://doc.rust-lang.org/nomicon/subtyping.html).
9393

94+
See the [variance](./variance.html) chapter of this guide for more info on how
95+
the type checker handles variance.
96+
9497
<a name=free-vs-bound>
9598

9699
## What is a "free region" or a "free variable"? What about "bound region"?

‎src/appendix-code-index.md

Lines changed: 2 additions & 0 deletions
@@ -14,9 +14,11 @@ Item | Kind | Short description | Chapter |
1414
`Session` | struct | The data associated with a compilation session | [the Parser], [The Rustc Driver] | [src/librustc/session/mod.html](https://github.com/rust-lang/rust/blob/master/src/librustc/session/mod.rs)
1515
`StringReader` | struct | This is the lexer used during parsing. It consumes characters from the raw source code being compiled and produces a series of tokens for use by the rest of the parser | [The parser] | [src/libsyntax/parse/lexer/mod.rs](https://github.com/rust-lang/rust/blob/master/src/libsyntax/parse/lexer/mod.rs)
1616
`TraitDef` | struct | This struct contains a trait's definition with type information | [The `ty` modules] | [src/librustc/ty/trait_def.rs](https://github.com/rust-lang/rust/blob/master/src/librustc/ty/trait_def.rs)
17+
`Ty<'tcx>` | struct | This is the internal representation of a type used for type checking | [Type checking] | [src/librustc/ty/mod.rs](https://github.com/rust-lang/rust/blob/master/src/librustc/ty/mod.rs)
1718
`TyCtxt<'cx, 'tcx, 'tcx>` | type | The "typing context". This is the central data structure in the compiler. It is the context that you use to perform all manner of queries. | [The `ty` modules] | [src/librustc/ty/context.rs](https://github.com/rust-lang/rust/blob/master/src/librustc/ty/context.rs)
1819

1920
[The HIR]: hir.html
2021
[The parser]: the-parser.html
2122
[The Rustc Driver]: rustc-driver.html
23+
[Type checking]: type-checking.html
2224
[The `ty` modules]: ty.html

‎src/appendix-glossary.md

Lines changed: 2 additions & 1 deletion
@@ -56,7 +56,8 @@ token | the smallest unit of parsing. Tokens are produced aft
5656
trans | the code to translate MIR into LLVM IR.
5757
trait reference | a trait and values for its type parameters ([see more](ty.html)).
5858
ty | the internal representation of a type ([see more](ty.html)).
59-
variance | variance determines how changes to a generic type/lifetime parameter affect subtyping; for example, if `T` is a subtype of `U`, then `Vec<T>` is a subtype `Vec<U>` because `Vec` is *covariant* in its generic parameter. See [the background chapter for more](./appendix-background.html#variance).
59+
UFCS | Universal Function Call Syntax. An unambiguous syntax for calling a method ([see more](type-checking.html)).
60+
variance | variance determines how changes to a generic type/lifetime parameter affect subtyping; for example, if `T` is a subtype of `U`, then `Vec<T>` is a subtype of `Vec<U>` because `Vec` is *covariant* in its generic parameter. See [the background chapter](./appendix-background.html#variance) for a more general explanation. See the [variance chapter](./variance.html) for an explanation of how type checking handles variance.
6061

6162
[LLVM]: https://llvm.org/
6263
[lto]: https://llvm.org/docs/LinkTimeOptimization.html

‎src/method-lookup.md

Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
1+
# Method lookup
2+
3+
Method lookup can be rather complex due to the interaction of a number
4+
of factors, such as self types, autoderef, trait lookup, etc. This
5+
file provides an overview of the process. More detailed notes are in
6+
the code itself, naturally.
7+
8+
One way to think of method lookup is that we convert an expression of
9+
the form:
10+
11+
```rust
12+
receiver.method(...)
13+
```
14+
15+
into a more explicit UFCS form:
16+
17+
```rust
18+
Trait::method(ADJ(receiver), ...) // for a trait call
19+
ReceiverType::method(ADJ(receiver), ...) // for an inherent method call
20+
```
21+
22+
Here `ADJ` is some kind of adjustment, which is typically a series of
23+
autoderefs and then possibly an autoref (e.g., `&**receiver`). However
24+
we sometimes do other adjustments and coercions along the way, in
25+
particular unsizing (e.g., converting from `[T; n]` to `[T]`).
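
As an illustration, the desugaring can be written out by hand in ordinary Rust. This is only a sketch of the idea, not compiler output: the `Point` type, the `Describe` trait, and the two helper functions are hypothetical names invented for the example, and `&**receiver` plays the role of `ADJ`.

```rust
use std::rc::Rc;

// Hypothetical type and trait, used only for illustration.
struct Point {
    x: i32,
}

impl Point {
    // Inherent method taking `&self`.
    fn get_x(&self) -> i32 {
        self.x
    }
}

trait Describe {
    fn describe(&self) -> String;
}

impl Describe for Point {
    fn describe(&self) -> String {
        format!("Point({})", self.x)
    }
}

// Method-call syntax: the compiler picks the adjustment.
fn sugared(receiver: Rc<Box<Point>>) -> (i32, String) {
    (receiver.get_x(), receiver.describe())
}

// The same two calls in explicit form. `&**receiver` derefs through
// the `Rc` and the `Box`, then takes a shared reference (autoref).
fn explicit(receiver: Rc<Box<Point>>) -> (i32, String) {
    (
        Point::get_x(&**receiver),        // inherent method call
        Describe::describe(&**receiver),  // trait method call
    )
}

fn main() {
    let a = sugared(Rc::new(Box::new(Point { x: 3 })));
    let b = explicit(Rc::new(Box::new(Point { x: 3 })));
    assert_eq!(a, b);
    println!("{:?}", a);
}
```

Both functions compute the same result; the second one only makes visible what the first leaves to method lookup.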
26+
27+
Method lookup is divided into two major phases:
28+
29+
1. Probing ([`probe.rs`][probe]). The probe phase is when we decide what method
30+
to call and how to adjust the receiver.
31+
2. Confirmation ([`confirm.rs`][confirm]). The confirmation phase "applies"
32+
this selection, updating the side-tables, unifying type variables, and
33+
otherwise doing side-effectful things.
34+
35+
One reason for this division is to be more amenable to caching. The
36+
probe phase produces a "pick" (`probe::Pick`), which is designed to be
37+
cacheable across method-call sites. Therefore, it does not include
38+
inference variables or other call-site-specific information.
39+
40+
[probe]: https://github.com/rust-lang/rust/blob/master/src/librustc_typeck/check/method/probe.rs
41+
[confirm]: https://github.com/rust-lang/rust/blob/master/src/librustc_typeck/check/method/confirm.rs
42+
43+
## The Probe phase
44+
45+
### Steps
46+
47+
The first thing that the probe phase does is to create a series of
48+
*steps*. This is done by progressively dereferencing the receiver type
49+
until it cannot be deref'd anymore, as well as applying an optional
50+
"unsize" step. So if the receiver has type `Rc<Box<[T; 3]>>`, this
51+
might yield:
52+
53+
```rust
54+
Rc<Box<[T; 3]>>
55+
Box<[T; 3]>
56+
[T; 3]
57+
[T]
58+
```
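
The effect of these steps is visible from user code. In the sketch below (function names are hypothetical, and `u8` stands in for `T`), `len` is an inherent method of the slice type `[u8]`, so a call on an `Rc<Box<[u8; 3]>>` only resolves once the final "unsize" step is reached. The second function walks the same chain by hand.

```rust
use std::rc::Rc;

// Method-call syntax: lookup walks the steps until `[u8]` provides `len`.
fn len_via_method_call(receiver: Rc<Box<[u8; 3]>>) -> usize {
    receiver.len()
}

// The same chain written out one step at a time.
fn len_spelled_out(receiver: Rc<Box<[u8; 3]>>) -> usize {
    let after_rc: &Box<[u8; 3]> = &*receiver; // Rc<Box<[u8; 3]>> -> Box<[u8; 3]>
    let after_box: &[u8; 3] = &**after_rc;    // Box<[u8; 3]>     -> [u8; 3]
    let as_slice: &[u8] = after_box;          // [u8; 3]          -> [u8] (unsize)
    <[u8]>::len(as_slice)
}

fn main() {
    let n = len_via_method_call(Rc::new(Box::new([1, 2, 3])));
    let m = len_spelled_out(Rc::new(Box::new([1, 2, 3])));
    assert_eq!(n, m);
    println!("{}", n);
}
```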
59+
60+
### Candidate assembly
61+
62+
We then search along those steps to create a list of *candidates*. A
63+
`Candidate` is a method item that might plausibly be the method being
64+
invoked. For each candidate, we'll derive a "transformed self type"
65+
that takes into account explicit self.
66+
67+
Candidates are grouped into two kinds, inherent and extension.
68+
69+
**Inherent candidates** are those that are derived from the
70+
type of the receiver itself. So, if you have a receiver of some
71+
nominal type `Foo` (e.g., a struct), any methods defined within an
72+
impl like `impl Foo` are inherent methods. Nothing needs to be
73+
imported to use an inherent method; they are associated with the type
74+
itself (note that inherent impls can only be defined in the same
75+
crate as the type itself).
76+
77+
FIXME: Inherent candidates are not always derived from impls. If you
78+
have a trait object, such as a value of type `Box<dyn ToString>`, then the
79+
trait methods (`to_string()`, in this case) are inherently associated
80+
with it. Another case is type parameters, in which case the methods of
81+
their bounds are inherent. However, this part of the rules is subject
82+
to change: when DST's "impl Trait for Trait" is complete, trait object
83+
dispatch could be subsumed into trait matching, and the type parameter
84+
behavior should be reconsidered in light of where clauses.
85+
86+
TODO: Is this FIXME still accurate?
87+
88+
**Extension candidates** are derived from imported traits. If I have
89+
the trait `ToString` imported, and I call `to_string()` on a value of
90+
type `T`, then we will go off to find out whether there is an impl of
91+
`ToString` for `T`. These kinds of method calls are called "extension
92+
methods". They can be defined in any module, not only the one that
93+
defined `T`. Furthermore, you must import the trait to call such a
94+
method.
95+
96+
So, let's continue our example. Imagine that we were calling a method
97+
`foo` with the receiver `Rc<Box<[T; 3]>>` and there is a trait `Foo`
98+
that defines it with `&self` for the type `Rc<U>` as well as a method
99+
on the type `Box` that is also named `foo` but takes `&mut self`. Then we
100+
might have two candidates:
101+
102+
&Rc<Box<[T; 3]>> from the impl of `Foo` for `Rc<U>` where `U=Box<[T; 3]>`
103+
&mut Box<[T; 3]> from the inherent impl on `Box<U>` where `U=[T; 3]`
104+
105+
### Candidate search
106+
107+
Finally, to actually pick the method, we will search down the steps,
108+
trying to match the receiver type against the candidate types. At
109+
each step, we also consider an auto-ref and auto-mut-ref to see whether
110+
that makes any of the candidates match. We pick the first step where
111+
we find a match.
112+
113+
In the case of our example, the first step is `Rc<Box<[T; 3]>>`,
114+
which does not itself match any candidate. But when we autoref it, we
115+
get the type `&Rc<Box<[T; 3]>>` which does match. We would then
116+
recursively consider all where-clauses that appear on the impl: if
117+
those match (or we cannot rule out that they do), then this is the
118+
method we would pick. Otherwise, we would continue down the series of
119+
steps.
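
The search order has an observable consequence, sketched below with hypothetical names (`Wrapper` stands in for the receiver type, since user code cannot add inherent methods to `Rc` or `Box`). At the first step the by-value receiver `Wrapper` matches neither candidate; the auto-ref `&Wrapper` is tried before the auto-mut-ref `&mut Wrapper`, so the trait method taking `&self` is picked even though an inherent method of the same name exists.

```rust
// Hypothetical receiver type for the illustration.
struct Wrapper;

trait Foo {
    fn foo(&self) -> &'static str;
}

impl Foo for Wrapper {
    fn foo(&self) -> &'static str {
        "trait method, &self"
    }
}

impl Wrapper {
    fn foo(&mut self) -> &'static str {
        "inherent method, &mut self"
    }
}

fn demo() -> (&'static str, &'static str) {
    let mut w = Wrapper;
    // Method-call syntax: the `&Wrapper` auto-ref matches the trait candidate first.
    let by_method_syntax = w.foo();
    // Path syntax with an explicit `&mut` reaches the inherent method.
    let by_path = Wrapper::foo(&mut w);
    (by_method_syntax, by_path)
}

fn main() {
    let (a, b) = demo();
    println!("{} / {}", a, b);
}
```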

‎src/type-checking.md

Lines changed: 43 additions & 0 deletions
@@ -1 +1,44 @@
11
# Type checking
2+
3+
The [`rustc_typeck`][typeck] crate contains the source for "type collection"
4+
and "type checking", as well as a few other bits of related functionality. (It
5+
draws heavily on the [type inference] and [trait solving].)
6+
7+
[typeck]: https://github.com/rust-lang/rust/tree/master/src/librustc_typeck
8+
[type inference]: type-inference.html
9+
[trait solving]: trait-resolution.html
10+
11+
## Type collection
12+
13+
Type "collection" is the process of converting the types found in the HIR
14+
(`hir::Ty`), which represent the syntactic things that the user wrote, into the
15+
**internal representation** used by the compiler (`Ty<'tcx>`) -- we also do
16+
similar conversions for where-clauses and other bits of the function signature.
17+
18+
To try and get a sense for the difference, consider this function:
19+
20+
```rust
21+
struct Foo { }
22+
fn foo(x: Foo, y: self::Foo) { .. }
23+
//        ^^^     ^^^^^^^^^
24+
```
25+
26+
Those two parameters `x` and `y` each have the same type, but they will have
27+
distinct `hir::Ty` nodes. Those nodes will have different spans, and of course
28+
they encode the path somewhat differently. But once they are "collected" into
29+
`Ty<'tcx>` nodes, they will be represented by the exact same internal type.
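
A user-visible consequence of this can be checked with `std::any::TypeId`. The sketch below does not touch the compiler's internal `Ty<'tcx>` representation; it only shows, as a loose analogy, that the two spellings denote one and the same type. The function name is hypothetical.

```rust
use std::any::TypeId;

struct Foo {}

// `Foo` and `self::Foo` are written differently but name the same type.
fn spellings_agree() -> bool {
    TypeId::of::<Foo>() == TypeId::of::<self::Foo>()
}

fn main() {
    assert!(spellings_agree());
}
```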
30+
31+
Collection is defined as a bundle of [queries] for computing information about
32+
the various functions, traits, and other items in the crate being compiled.
33+
Note that each of these queries is concerned with *interprocedural* things --
34+
for example, for a function definition, collection will figure out the type and
35+
signature of the function, but it will not visit the *body* of the function in
36+
any way, nor examine type annotations on local variables (that's the job of
37+
type *checking*).
38+
39+
For more details, see the [`collect`][collect] module.
40+
41+
[queries]: query.html
42+
[collect]: https://github.com/rust-lang/rust/blob/master/src/librustc_typeck/collect.rs
43+
44+
**TODO**: actually talk about type checking...

‎src/variance.md

Lines changed: 296 additions & 0 deletions
@@ -0,0 +1,296 @@
1+
# Variance of type and lifetime parameters
2+
3+
For a more general background on variance, see the [background] appendix.
4+
5+
[background]: ./appendix-background.html
6+
7+
During type checking we must infer the variance of type and lifetime
8+
parameters. The algorithm is taken from Section 4 of the paper ["Taming the
9+
Wildcards: Combining Definition- and Use-Site Variance"][pldi11] published in
10+
PLDI'11 and written by Altidor et al., and hereafter referred to as The Paper.
11+
12+
[pldi11]: https://people.cs.umass.edu/~yannis/variance-extended2011.pdf
13+
14+
This inference is explicitly designed *not* to consider the uses of
15+
types within code. To determine the variance of type parameters
16+
defined on type `X`, we only consider the definition of the type `X`
17+
and the definitions of any types it references.
18+
19+
We only infer variance for type parameters found on *data types*
20+
like structs and enums. In these cases, there is a fairly straightforward
21+
explanation for what variance means. The variance of the type
22+
or lifetime parameters defines whether `T<A>` is a subtype of `T<B>`
23+
(resp. `T<'a>` and `T<'b>`) based on the relationship of `A` and `B`
24+
(resp. `'a` and `'b`).
25+
26+
We do not infer variance for type parameters found on traits, functions,
27+
or impls. Variance on trait parameters can indeed make sense
28+
(and we used to compute it) but it is actually rather subtle in
29+
meaning and not that useful in practice, so we removed it. See the
30+
[addendum] for some details. Variance on function/impl parameters, on the
31+
other hand, doesn't make sense because these parameters are instantiated and
32+
then forgotten; they don't persist in types or compiled byproducts.
33+
34+
[addendum]: #addendum
35+
36+
> **Notation**
37+
>
38+
> We use the notation of The Paper throughout this chapter:
39+
>
40+
> - `+` is _covariance_.
41+
> - `-` is _contravariance_.
42+
> - `*` is _bivariance_.
43+
> - `o` is _invariance_.
44+
45+
## The algorithm
46+
47+
The basic idea is quite straightforward. We iterate over the types
48+
defined and, for each use of a type parameter `X`, accumulate a
49+
constraint indicating that the variance of `X` must be valid for the
50+
variance of that use site. We then iteratively refine the variance of
51+
`X` until all constraints are met. There is *always* a solution, because at
52+
the limit we can declare all type parameters to be invariant and all
53+
constraints will be satisfied.
54+
55+
As a simple example, consider:
56+
57+
```rust
58+
enum Option<A> { Some(A), None }
59+
enum OptionalFn<B> { Some(fn(B)), None }
60+
enum OptionalMap<C> { Some(fn(C) -> C), None }
61+
```
62+
63+
Here, we will generate the constraints:
64+
65+
1. V(A) <= +
66+
2. V(B) <= -
67+
3. V(C) <= +
68+
4. V(C) <= -
69+
70+
These indicate that (1) the variance of A must be at most covariant;
71+
(2) the variance of B must be at most contravariant; and (3, 4) the
72+
variance of C must be at most covariant *and* contravariant. All of these
73+
results are based on a variance lattice defined as follows:
74+
75+
       *      Top (bivariant)
76+
    -     +
77+
       o      Bottom (invariant)
78+
79+
Based on this lattice, the solution `V(A)=+`, `V(B)=-`, `V(C)=o` is the
80+
optimal solution. Note that there is always a naive solution which
81+
just declares all variables to be invariant.
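
For constraints with constant right-hand sides, solving amounts to starting every parameter at the top of the lattice and taking the greatest lower bound with each constraint. The sketch below is a toy model under that assumption; the `Variance` enum, `glb`, and `solve` are names invented here and are not the compiler's implementation.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Variance {
    Bivariant,     // `*`, top of the lattice
    Covariant,     // `+`
    Contravariant, // `-`
    Invariant,     // `o`, bottom of the lattice
}

use Variance::*;

// Greatest lower bound of two points in the lattice.
fn glb(a: Variance, b: Variance) -> Variance {
    match (a, b) {
        (Bivariant, v) | (v, Bivariant) => v,
        (Covariant, Covariant) => Covariant,
        (Contravariant, Contravariant) => Contravariant,
        // Any other combination meets at the bottom.
        _ => Invariant,
    }
}

// Each constraint `(X, bound)` stands for `V(X) <= bound`.
fn solve(constraints: &[(&'static str, Variance)]) -> HashMap<&'static str, Variance> {
    let mut solution = HashMap::new();
    for &(param, bound) in constraints {
        // Unconstrained parameters start at the top (bivariant).
        let current = solution.entry(param).or_insert(Bivariant);
        *current = glb(*current, bound);
    }
    solution
}

fn main() {
    let solution = solve(&[
        ("A", Covariant),
        ("B", Contravariant),
        ("C", Covariant),
        ("C", Contravariant),
    ]);
    println!("{:?}", solution);
}
```

Applied to the four constraints above, this yields `V(A)=+`, `V(B)=-`, `V(C)=o`. A single pass suffices here only because no constraint mentions another parameter's variance.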
82+
83+
You may be wondering why fixed-point iteration is required. The reason
84+
is that the variance of a use site may itself be a function of the
85+
variance of other type parameters. In full generality, our constraints
86+
take the form:
87+
88+
V(X) <= Term
89+
Term := + | - | * | o | V(X) | Term x Term
90+
91+
Here the notation `V(X)` indicates the variance of a type/region
92+
parameter `X` with respect to its defining class. `Term x Term`
93+
represents the "variance transform" as defined in the paper:
94+
95+
> If the variance of a type variable `X` in type expression `E` is `V2`
96+
and the definition-site variance of the [corresponding] type parameter
97+
of a class `C` is `V1`, then the variance of `X` in the type expression
98+
`C<E>` is `V3 = V1.xform(V2)`.
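
The transform can be sketched as a small lookup table. The entries below are an assumption based on my reading of the transform table in The Paper, so treat them as illustrative rather than authoritative; the `Variance` enum is a stand-in defined for this example.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Variance {
    Covariant,     // `+`
    Contravariant, // `-`
    Invariant,     // `o`
    Bivariant,     // `*`
}

impl Variance {
    // `self` is the definition-site variance `V1`; `v2` is the variance
    // of `X` within the type expression `E`. The result is `V3`.
    fn xform(self, v2: Variance) -> Variance {
        use Variance::*;
        match (self, v2) {
            // A covariant position leaves the inner variance unchanged.
            (Covariant, v) => v,
            // A contravariant position flips `+` and `-`.
            (Contravariant, Covariant) => Contravariant,
            (Contravariant, Contravariant) => Covariant,
            (Contravariant, v) => v,
            // Invariant and bivariant positions absorb the inner variance.
            (Invariant, _) => Invariant,
            (Bivariant, _) => Bivariant,
        }
    }
}

fn main() {
    use Variance::*;
    // `T` inside `fn(fn(T))`: contravariant within contravariant.
    println!("{:?}", Contravariant.xform(Contravariant));
}
```

For instance, a parameter that appears as the argument of a function which is itself a function argument ends up covariant, because the two contravariant positions cancel.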
99+
100+
## Constraints
101+
102+
If I have a struct or enum with where clauses:
103+
104+
```rust
105+
struct Foo<T: Bar> { ... }
106+
```
107+
108+
you might wonder whether the variance of `T` with respect to `Bar` affects the
109+
variance of `T` with respect to `Foo`. I claim no. The reason: assume that `T` is
110+
invariant with respect to `Bar` but covariant with respect to `Foo`. And then
111+
we have a `Foo<X>` that is upcast to `Foo<Y>`, where `X <: Y`. However, while
112+
`X : Bar`, `Y : Bar` does not hold. In that case, the upcast will be illegal,
113+
but not because of a variance failure, but rather because the target type
114+
`Foo<Y>` is itself just not well-formed. Basically we get to assume
115+
well-formedness of all types involved before considering variance.
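
A small check from user code, using lifetimes since that is where Rust has subtyping: in the sketch below the struct carries a bound `T: Bar`, yet `Foo<&'static str>` is still accepted where `Foo<&'a str>` is expected, which relies on `Foo` being covariant in `T`. The `impl` of `Bar` and the `shorten` function are hypothetical additions for the example.

```rust
trait Bar {}

// Every `&str`, of any lifetime, implements `Bar`, so both
// `Foo<&'static str>` and `Foo<&'a str>` are well-formed.
impl<'a> Bar for &'a str {}

struct Foo<T: Bar> {
    value: T,
}

// Accepted because `&'static str <: &'a str` and `Foo<T>` is covariant
// in `T`; the bound `T: Bar` plays no part in that inference.
fn shorten<'a>(long: Foo<&'static str>) -> Foo<&'a str> {
    long
}

fn main() {
    let short = shorten(Foo { value: "hello" });
    println!("{}", short.value);
}
```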
116+
117+
### Dependency graph management
118+
119+
Because variance is a whole-crate inference, its dependency graph
120+
can become quite muddled if we are not careful. To resolve this, we refactor
121+
into two queries:
122+
123+
- `crate_variances` computes the variance for all items in the current crate.
124+
- `variances_of` accesses the variance for an individual item; it
125+
works by requesting `crate_variances` and extracting the relevant data.
126+
127+
If you limit yourself to reading `variances_of`, your code will only
128+
depend on the inference of that particular item.
129+
130+
Ultimately, this setup relies on the [red-green algorithm][rga]. In particular,
131+
every variance query effectively depends on all type definitions in the entire
132+
crate (through `crate_variances`), but since most changes will not result in a
133+
change to the actual results from variance inference, the `variances_of` query
134+
will wind up being considered green after it is re-evaluated.
135+
136+
[rga]: ./incremental-compilation.html
137+
138+
<a name=addendum>
139+
140+
## Addendum: Variance on traits
141+
142+
As mentioned above, we used to permit variance on traits. This was
143+
computed based on the appearance of trait type parameters in
144+
method signatures and was used to represent the compatibility of
145+
vtables in trait objects (and also "virtual" vtables or dictionary
146+
in trait bounds). One complication was that variance for
147+
associated types is less obvious, since they can be projected out
148+
and put to myriad uses, so it's not clear when it is safe to allow
149+
`X<A>::Bar` to vary (or indeed just what that means). Moreover (as
150+
covered below) all inputs on any trait with an associated type had
151+
to be invariant, limiting the applicability. Finally, the
152+
annotations (`MarkerTrait`, `PhantomFn`) needed to ensure that all
153+
trait type parameters had a variance were confusing and annoying
154+
for little benefit.
155+
156+
Just for historical reference, I am going to preserve some text indicating how
157+
one could interpret variance and trait matching.
158+
159+
### Variance and object types
160+
161+
Just as with structs and enums, we can decide the subtyping
162+
relationship between two object types `&Trait<A>` and `&Trait<B>`
163+
based on the relationship of `A` and `B`. Note that for object
164+
types we ignore the `Self` type parameter -- it is unknown, and
165+
the nature of dynamic dispatch ensures that we will always call a
166+
function that is expecting the appropriate `Self` type. However, we
167+
must be careful with the other type parameters, or else we could
168+
end up calling a function that is expecting one type but provided
169+
another.
170+
171+
To see what I mean, consider a trait like so:
172+
173+
trait ConvertTo<A> {
174+
fn convertTo(&self) -> A;
175+
}
176+
177+
Intuitively, if we had one object `O=&ConvertTo<Object>` and another
178+
`S=&ConvertTo<String>`, then `S <: O` because `String <: Object`
179+
(presuming Java-like "string" and "object" types, my go-to examples
180+
for subtyping). The actual algorithm would be to compare the
181+
(explicit) type parameters pairwise respecting their variance: here,
182+
the type parameter A is covariant (it appears only in a return
183+
position), and hence we require that `String <: Object`.
184+
185+
You'll note though that we did not consider the binding for the
186+
(implicit) `Self` type parameter: in fact, it is unknown, so that's
187+
good. The reason we can ignore that parameter is precisely because we
188+
don't need to know its value until a call occurs, and at that time (as
189+
you said) the dynamic nature of virtual dispatch means the code we run
190+
will be correct for whatever value `Self` happens to be bound to for
191+
the particular object whose method we called. `Self` is thus different
192+
from `A`, because the caller requires that `A` be known in order to
193+
know the return type of the method `convertTo()`. (As an aside, we
194+
have rules preventing methods where `Self` appears outside of the
195+
receiver position from being called via an object.)
196+
197+
### Trait variance and vtable resolution
198+
199+
But traits aren't only used with objects. They're also used when
200+
deciding whether a given impl satisfies a given trait bound. To set the
201+
scene here, imagine I had a function:
202+
203+
fn convertAll<A,T:ConvertTo<A>>(v: &[T]) {
204+
...
205+
}
206+
207+
Now imagine that I have an implementation of `ConvertTo` for `Object`:
208+
209+
impl ConvertTo<i32> for Object { ... }
210+
211+
And I want to call `convertAll` on an array of strings. Suppose
212+
further that for whatever reason I specifically supply the value of
213+
`String` for the type parameter `T`:
214+
215+
let mut vector = vec!["string", ...];
216+
convertAll::<i32, String>(vector);
217+
218+
Is this legal? To put another way, can we apply the `impl` for
219+
`Object` to the type `String`? The answer is yes, but to see why
220+
we have to expand out what will happen:
221+
222+
- `convertAll` will create a pointer to one of the entries in the
223+
vector, which will have type `&String`
224+
- It will then call the impl of `convertTo()` that is intended
225+
for use with objects. This has the type:
226+
227+
fn(self: &Object) -> i32
228+
229+
It is ok to provide a value for `self` of type `&String` because
230+
`&String <: &Object`.
231+
232+
OK, so intuitively we want this to be legal, so let's bring this back
233+
to variance and see whether we are computing the correct result. We
234+
must first figure out how to phrase the question "is an impl for
235+
`Object,i32` usable where an impl for `String,i32` is expected?"
236+
237+
Maybe it's helpful to think of a dictionary-passing implementation of
238+
type classes. In that case, `convertAll()` takes an implicit parameter
239+
representing the impl. In short, we *have* an impl of type:
240+
241+
V_O = ConvertTo<i32> for Object
242+
243+
and the function prototype expects an impl of type:
244+
245+
V_S = ConvertTo<i32> for String
246+
247+
As with any argument, this is legal if the type of the value given
248+
(`V_O`) is a subtype of the type expected (`V_S`). So is `V_O <: V_S`?
249+
The answer will depend on the variance of the various parameters. In
250+
this case, because the `Self` parameter is contravariant and `A` is
251+
covariant, it means that:
252+
253+
V_O <: V_S iff
254+
i32 <: i32
255+
String <: Object
256+
257+
These conditions are satisfied and so we are happy.
258+
259+
### Variance and associated types
260+
261+
Traits with associated types -- or at minimum projection
262+
expressions -- must be invariant with respect to all of their
263+
inputs. To see why this makes sense, consider what subtyping for a
264+
trait reference means:
265+
266+
<T as Trait> <: <U as Trait>
267+
268+
means that if I know that `T as Trait`, I also know that `U as
269+
Trait`. Moreover, if you think of it as dictionary passing style,
270+
it means that a dictionary for `<T as Trait>` is safe to use where
271+
a dictionary for `<U as Trait>` is expected.
272+
273+
The problem is that when you can project types out from `<T as
274+
Trait>`, the relationship to types projected out of `<U as Trait>`
275+
is completely unknown unless `T==U` (see #21726 for more
276+
details). Making `Trait` invariant ensures that this is true.
277+
278+
Another related reason is that if we didn't make traits with
279+
associated types invariant, then projection is no longer a
280+
function with a single result. Consider:
281+
282+
```
283+
trait Identity { type Out; fn foo(&self); }
284+
impl<T> Identity for T { type Out = T; ... }
285+
```
286+
287+
Now if I have `<&'static () as Identity>::Out`, this can be
288+
validly derived as `&'a ()` for any `'a`:
289+
290+
<&'a () as Identity> <: <&'static () as Identity>
291+
if &'static () <: &'a () -- Identity is contravariant in Self
292+
if 'static : 'a -- Subtyping rules for relations
293+
294+
This change, on the other hand, means that `<&'static () as Identity>::Out` is
295+
always `&'static ()` (which might then be upcast to `&'a ()`,
296+
separately). This was helpful in solving #21750.
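
The two separate steps can be seen in a small sketch based on the `Identity` trait above (the method is omitted, and the functions `project` and `upcast` are hypothetical names for this example). The projection normalizes to exactly `&'static ()`; shortening the lifetime afterwards is ordinary subtyping on the resulting reference.

```rust
trait Identity {
    type Out;
}

impl<T> Identity for T {
    type Out = T;
}

// The projection has a single result: `&'static ()`.
fn project(x: <&'static () as Identity>::Out) -> &'static () {
    x
}

// Shortening the lifetime is a separate upcast of the projected value.
fn upcast<'a>(x: <&'static () as Identity>::Out) -> &'a () {
    x
}

fn main() {
    let a: &'static () = project(&());
    let b: &() = upcast(&());
    assert_eq!(a, b);
}
```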
