Commit ea6f6b5

blitzerr authored and mark-i-m committed
Notes about closure de-sugaring
1 parent 8dfb8c1 commit ea6f6b5

2 files changed

+160
-0
lines changed

src/SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@
 - [The HIR (High-level IR)](./hir.md)
 - [Lowering AST to HIR](./lowering.md)
 - [Debugging](./hir-debugging.md)
+- [Closure expansion](./closure.md)
 - [The `ty` module: representing types](./ty.md)
 - [Kinds](./kinds.md)
 - [Type inference](./type-inference.md)

src/closure.md

Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
# Closure Expansion in rustc

Let's start with a few examples.

### Example 1
```rust
fn closure(f: impl Fn()) {
    f();
}

fn main() {
    let x: i32 = 10;
    closure(|| println!("Hi {}", x)); // The closure just reads x.
    println!("Value of x after return {}", x);
}
```
Let's say the above is the content of a file called `immut.rs`. If we compile `immut.rs` using the
command
```
rustc +stage1 immut.rs -Zdump-mir=all
```
we will see a newly generated directory in our current working directory called `mir_dump`, which
will contain several files. If we look at the file `rustc.main.-------.mir_map.0.mir`, we will find
that, among other things, it contains these lines:

```rust,ignore
_4 = &_1; // bb0[6]: scope 1 at immut.rs:7:13: 7:36
_3 = [closure@immut.rs:7:13: 7:36] { x: move _4 }; // bb0[7]: scope 1 at immut.rs:7:13: 7:36
```
Here, in the first line `_4 = &_1;`, the MIR dump tells us that `x` was borrowed as an immutable
reference. This is what we would hope for, as our closure just reads `x`.

### Example 2
```rust
fn closure(mut f: impl FnMut()) {
    f();
}

fn main() {
    let mut x: i32 = 10;
    closure(|| {
        x += 10; // The closure mutates the value of x
        println!("Hi {}", x)
    });
    println!("Value of x after return {}", x);
}
```

```rust,ignore
_4 = &mut _1; // bb0[6]: scope 1 at mut.rs:7:13: 10:6
_3 = [closure@mut.rs:7:13: 10:6] { x: move _4 }; // bb0[7]: scope 1 at mut.rs:7:13: 10:6
```
This time, in the line `_4 = &mut _1;`, we see that the borrow has changed to a mutable borrow.
Fair enough, as the closure increments `x` by 10.

### Example 3
```rust
fn closure(f: impl FnOnce()) {
    f();
}

fn main() {
    let x = vec![21];
    closure(|| {
        drop(x); // Makes x unusable after the fact.
    });
    // println!("Value of x after return {:?}", x);
}
```

```rust,ignore
_6 = [closure@move.rs:7:13: 9:6] { x: move _1 }; // bb16[3]: scope 1 at move.rs:7:13: 9:6
```
Here, `x` is moved directly into the closure, and access to it is not permitted after the
closure.
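
One way to picture what the `[closure@...] { x: ... }` lines in the MIR dumps construct is a
struct with one field per captured variable. The following is a rough hand-written sketch under
that assumption. The struct names (`ReadX`, `MutateX`, `ConsumeX`) and their `call*` methods are
hypothetical stand-ins chosen for illustration; they are not the types or trait impls that rustc
actually generates.

```rust
// Hypothetical stand-in for the closure in Example 1: holds `x` by shared reference.
struct ReadX<'a> {
    x: &'a i32,
}

impl<'a> ReadX<'a> {
    // Takes `&self`, like `Fn::call`.
    fn call(&self) -> String {
        format!("Hi {}", self.x)
    }
}

// Hypothetical stand-in for the closure in Example 2: holds `x` by mutable reference.
struct MutateX<'a> {
    x: &'a mut i32,
}

impl<'a> MutateX<'a> {
    // Takes `&mut self`, like `FnMut::call_mut`.
    fn call_mut(&mut self) -> String {
        *self.x += 10;
        format!("Hi {}", self.x)
    }
}

// Hypothetical stand-in for the closure in Example 3: owns `x`.
struct ConsumeX {
    x: Vec<i32>,
}

impl ConsumeX {
    // Takes `self` by value, like `FnOnce::call_once`, so it can run only once.
    fn call_once(self) {
        drop(self.x);
    }
}

fn main() {
    let x: i32 = 10;
    let c1 = ReadX { x: &x };
    println!("{}", c1.call());
    println!("Value of x after return {}", x);

    let mut y: i32 = 10;
    let mut c2 = MutateX { x: &mut y };
    println!("{}", c2.call_mut());
    println!("Value of y after return {}", y);

    let v = vec![21];
    let c3 = ConsumeX { x: v };
    c3.call_once();
    // `v` was moved into `c3`, so it can no longer be used here.
}
```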

Now let's dive into the rustc code and see how the compiler makes all these inferences.

Let's start by defining a term that we will be using quite a bit in the rest of the discussion:
*upvar*. An **upvar** is a variable that is local to the function where the closure is defined. So,
in the above examples, **x** is an upvar of the closure. Upvars are also sometimes referred to as
*free variables*, meaning they are not bound within the closure itself.
`src/librustc/ty/query/mod.rs` defines a query called *freevars* for this purpose.
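
To make the term concrete, here is a small illustrative program. It is not taken from the rustc
sources, and all names in it (`add_to_base`, `base`, `add`, and so on) are made up for this
sketch. It contrasts an upvar with the variables that are bound inside the closure.

```rust
fn add_to_base(n: i32) -> i32 {
    let base = 10; // Local to `add_to_base` and used by the closure: an upvar.
    let _unused = 99; // Local to `add_to_base` but never used by the closure: not an upvar.

    let add = |m: i32| {
        // `m` (a parameter) and `doubled` (a closure local) are bound inside the closure.
        let doubled = m * 2;
        // `base` is the only free variable in the closure body.
        doubled + base
    };

    add(n)
}

fn main() {
    println!("{}", add_to_base(1));
}
```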

So, we know that other than lazy invocation, one other thing that distinguishes a closure from a
normal function is that it can use upvars. Because it borrows these upvars from its surrounding
context, the compiler has to determine each upvar's borrow type. The compiler starts by
assigning an immutable borrow type and changes it as needed (from
**immutable** to **mutable** to **move**), based on the usage. In Example 1 above, the
closure only uses the variable for printing and does not modify it in any way; therefore, in the
`mir_dump`, we find the borrow type for the upvar `x` to be immutable. In Example 2, however, the
closure modifies `x` by incrementing it. Because of this mutation, the compiler, which
started off assigning `x` an immutable reference type, has to adjust it to a mutable reference.
Likewise, in Example 3, the closure drops the vector, and this requires the variable
`x` to be moved into the closure. Depending on the borrow kind, the closure has to implement the
appropriate trait: `Fn` for an immutable borrow, `FnMut` for a mutable borrow, and `FnOnce` for move
semantics.
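
In the standard library these three traits form a hierarchy (`Fn` requires `FnMut`, which in turn
requires `FnOnce`), so a closure that only needs an immutable borrow is also accepted where
`FnMut` or `FnOnce` is expected, but not the other way around. The sketch below illustrates this
from the user's side. The helper functions `call_fn`, `call_fn_mut`, and `call_fn_once` are
hypothetical names introduced only for this example.

```rust
fn call_fn(f: impl Fn() -> i32) -> i32 {
    f()
}

fn call_fn_mut(mut f: impl FnMut() -> i32) -> i32 {
    f()
}

fn call_fn_once(f: impl FnOnce() -> i32) -> i32 {
    f()
}

fn main() {
    // Only reads `x`: usable as `Fn`, `FnMut`, and `FnOnce`.
    let x = 10;
    let read = || x + 1;
    println!("{}", call_fn(&read));
    println!("{}", call_fn_mut(&read));
    println!("{}", call_fn_once(&read));

    // Mutates `count`: usable as `FnMut` and `FnOnce`, but not as `Fn`.
    let mut count = 0;
    let mut bump = || {
        count += 1;
        count
    };
    println!("{}", call_fn_mut(&mut bump));
    println!("{}", call_fn_once(&mut bump));
    // println!("{}", call_fn(&mut bump)); // Would not compile: `bump` is not `Fn`.

    // Moves `v` out of its captured state: usable only as `FnOnce`.
    let v = vec![21];
    let consume = || {
        let first = v[0];
        drop(v);
        first
    };
    println!("{}", call_fn_once(consume));
    // println!("{}", call_fn_mut(consume)); // Would not compile: `consume` is only `FnOnce`.
}
```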

Most of the code related to closures is in the `src/librustc_typeck/check/upvar.rs` file, and the
data structures are declared in the file `src/librustc/ty/mod.rs`.

Before we go any further, let's discuss how we can examine the flow of control through the rustc
codebase. For the closure part specifically, I would set `RUST_LOG` as shown below and collect the
output in a file:

```
RUST_LOG=rustc_typeck::check::upvar rustc +stage1 -Zdump-mir=all <.rs file to compile> 2> <file where the output will be dumped>
```

This uses the stage1 compiler.

The other option is to step through the code using lldb or gdb.

```
1. rust-lldb build/x86_64-apple-darwin/stage1/bin/rustc test.rs
2. b upvar.rs:134 // Setting the breakpoint on a certain line in the upvar.rs file
3. r // Run the program until it hits the breakpoint
```

Let's start with the file `upvar.rs`. This file has something called the `euv::ExprUseVisitor`,
which walks the source of the closure and gets called back for each upvar that is borrowed,
mutated, or moved.

```rust
fn main() {
    let mut x = vec![21]; // `mut` is needed because the closure mutates x.
    let _cl = || {
        let y = x[0]; // 1.
        x[0] += 1; // 2.
    };
}
```

In the above example, our visitor will be called twice, for the lines marked 1 and 2: once for a
shared borrow and once for a mutable borrow. It will also tell us what was borrowed. The
callbacks are invoked on the delegate. The delegate is of type `struct InferBorrowKind`, which has a
few fields, but the one we are interested in is `adjust_upvar_captures`, which is of type
`FxHashMap<UpvarId, UpvarCapture<'tcx>>` and tells us, for each upvar, which mode of borrow we
require. The mode of borrow can be `ByValue` (moved) or `ByRef` (borrowed), and for `ByRef` borrows,
it can be one of shared, shallow, unique, or mut, as defined in `src/librustc/mir/mod.rs`.

The callbacks are the method implementations of the `euv::Delegate` trait for `InferBorrowKind`.
The **consume** callback is for a *move* of a variable, the **borrow** callback is for a *borrow* of
some kind (shared or mutable), and the **mutate** callback is for an *assignment*. We will see that
all these callbacks have a common argument, *cmt*, which stands for Category, Mutability, and Type
and is defined in `src/librustc/middle/mem_categorization.rs`. Borrowing from the code comments,
*cmt is a complete categorization of a value indicating where it originated and how it is located,
as well as the mutability of the memory in which the value is stored.* Based on the callback
(consume, borrow, etc.), we will call the relevant `adjust_upvar_borrow_kind_for_<something>` and
pass the cmt along. Once the borrow type is adjusted, we store it in the table, which basically
records, for this closure, which set of borrows was made.

```rust,ignore
self.tables
    .borrow_mut()
    .upvar_capture_map
    .extend(delegate.adjust_upvar_captures);
```
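
The flow described above (callbacks on a delegate, a per-upvar map whose entries only ever move
towards a stronger capture, and a final `extend` into a table) can be modelled in a few lines.
This is a simplified toy sketch, not rustc's code. `CaptureKind`, `ToyDelegate`, the string upvar
names, and the standard `HashMap` are all assumptions made for illustration; rustc itself uses
`UpvarId`, `UpvarCapture<'tcx>`, and `FxHashMap`, and its callbacks receive a `cmt` rather than a
name.

```rust
use std::collections::HashMap;

// Ordered from weakest to strongest capture; the derived `Ord` follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum CaptureKind {
    ImmBorrow,
    MutBorrow,
    ByValue,
}

// Toy counterpart of `InferBorrowKind`, keyed by a plain name instead of an `UpvarId`.
#[derive(Default)]
struct ToyDelegate {
    adjust_upvar_captures: HashMap<&'static str, CaptureKind>,
}

impl ToyDelegate {
    // Start from an immutable borrow and keep the strongest capture seen so far.
    fn adjust(&mut self, upvar: &'static str, needed: CaptureKind) {
        let entry = self
            .adjust_upvar_captures
            .entry(upvar)
            .or_insert(CaptureKind::ImmBorrow);
        *entry = (*entry).max(needed);
    }

    // Called when the upvar is moved.
    fn consume(&mut self, upvar: &'static str) {
        self.adjust(upvar, CaptureKind::ByValue);
    }

    // Called when the upvar is borrowed, either shared or mutable.
    fn borrow(&mut self, upvar: &'static str, mutable: bool) {
        let kind = if mutable {
            CaptureKind::MutBorrow
        } else {
            CaptureKind::ImmBorrow
        };
        self.adjust(upvar, kind);
    }

    // Called when the upvar is assigned to.
    fn mutate(&mut self, upvar: &'static str) {
        self.adjust(upvar, CaptureKind::MutBorrow);
    }
}

fn main() {
    let mut delegate = ToyDelegate::default();

    // The two uses from the `x[0]` example: a shared borrow, then a mutable borrow.
    delegate.borrow("x", false);
    delegate.borrow("x", true);

    // Toy counterpart of the final `extend` into `upvar_capture_map`.
    let mut upvar_capture_map: HashMap<&'static str, CaptureKind> = HashMap::new();
    upvar_capture_map.extend(delegate.adjust_upvar_captures);
    println!("{:?}", upvar_capture_map);
}
```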
