|
| 1 | +# Closure Expansion in rustc |
| 2 | + |
| 3 | +Let's start with a few examples.
| 4 | + |
| 5 | +### Example 1 |
| 6 | +```rust |
| 7 | +fn closure(f: impl Fn()) {
| 8 | +    f();
| 9 | +}
| 10 | +
| 11 | +fn main() {
| 12 | +    let x: i32 = 10;
| 13 | +    closure(|| println!("Hi {}", x)); // The closure just reads x.
| 14 | +    println!("Value of x after return {}", x);
| 15 | +}
| 16 | +``` |
| 17 | +Let's say the above is the content of a file called immut.rs. If we compile immut.rs using the command |
| 18 | +``` |
| 19 | +rustc +stage1 immut.rs -Zdump-mir=all |
| 20 | +``` |
| 21 | +we will see a newly generated directory called mir_dump in our current working directory, which will
| 22 | +contain several files. If we look at the file `rustc.main.-------.mir_map.0.mir`, we will find that,
| 23 | +among other things, it contains these lines:
| 24 | + |
| 25 | +```rust,ignore |
| 26 | +_4 = &_1; // bb0[6]: scope 1 at immut.rs:7:13: 7:36 |
| 27 | +_3 = [closure@immut.rs:7:13: 7:36] { x: move _4 }; // bb0[7]: scope 1 at immut.rs:7:13: 7:36
| 28 | +``` |
| 29 | +Here, in the first line `_4 = &_1;`, the mir_dump tells us that x was borrowed as an immutable reference.
| 30 | +This is what we would hope for, as our closure just reads x.
| 31 | + |
| 32 | +### Example 2 |
| 33 | +```rust |
| 34 | +fn closure(mut f: impl FnMut()) {
| 35 | +    f();
| 36 | +}
| 37 | +
| 38 | +fn main() {
| 39 | +    let mut x: i32 = 10;
| 40 | +    closure(|| {
| 41 | +        x += 10; // The closure mutates the value of x
| 42 | +        println!("Hi {}", x)
| 43 | +    });
| 44 | +    println!("Value of x after return {}", x);
| 45 | +}
| 46 | +``` |
| 47 | + |
| 48 | +```rust,ignore |
| 49 | +_4 = &mut _1; // bb0[6]: scope 1 at mut.rs:7:13: 10:6 |
| 50 | +_3 = [closure@mut.rs:7:13: 10:6] { x: move _4 }; // bb0[7]: scope 1 at mut.rs:7:13: 10:6
| 51 | +``` |
| 52 | +This time around, in the line `_4 = &mut _1;`, we see that the borrow has changed to a mutable borrow.
| 53 | +Fair enough, as the closure increments x by 10.
| 54 | + |
| 55 | +### Example 3 |
| 56 | +```rust |
| 57 | +fn closure(f: impl FnOnce()) {
| 58 | +    f();
| 59 | +}
| 60 | +
| 61 | +fn main() {
| 62 | +    let x = vec![21];
| 63 | +    closure(|| {
| 64 | +        drop(x); // Makes x unusable after the fact.
| 65 | +    });
| 66 | +    // println!("Value of x after return {:?}", x);
| 67 | +}
| 68 | +``` |
| 69 | + |
| 70 | +```rust,ignore |
| 71 | +_6 = [closure@move.rs:7:13: 9:6] { x: move _1 }; // bb16[3]: scope 1 at move.rs:7:13: 9:6
| 72 | +``` |
| 73 | +Here, x is directly moved into the closure and the access to it will not be permitted after the |
| 74 | +closure. |
| 75 | + |
| 76 | + |
| 77 | +Now let's dive into rustc code and see how all these inferences are done by the compiler. |
| 78 | + |
| 79 | +Let's start by defining a term that we will use quite a bit in the rest of the discussion:
| 80 | +*upvar*. An **upvar** is a variable that is local to the function in which the closure is defined. So,
| 81 | +in the above examples, **x** is an upvar of the closure. Upvars are also sometimes referred to as
| 82 | +*free variables*, meaning they are not bound to the context of the closure.
| 83 | +`src/librustc/ty/query/mod.rs` defines a query called *freevars* for this purpose.
| 84 | + |
| 85 | +So, we know that other than lazy invocation, one other thing that distinguishes a closure from a
| 86 | +normal function is that it can use upvars. Because it borrows these upvars from its surrounding
| 87 | +context, the compiler has to determine each upvar's borrow type. The compiler starts by
| 88 | +assigning an immutable borrow type and lowers the restriction (that is, changes it from
| 89 | +**immutable** to **mutable** to **move**) as needed, based on the usage. In Example 1 above, the
| 90 | +closure only uses the variable for printing and does not modify it in any way; therefore, in the
| 91 | +mir_dump, we find the borrow type for the upvar x to be immutable. In Example 2, however, the
| 92 | +closure modifies x by incrementing it. Because of this mutation, the compiler, which
| 93 | +started off treating x as an immutable reference, has to adjust it to a mutable reference.
| 94 | +Likewise, in Example 3, the closure drops the vector, which requires the variable
| 95 | +x to be moved into the closure. Depending on the borrow kind, the closure has to implement the
| 96 | +appropriate trait: Fn for an immutable borrow, FnMut for a mutable borrow and FnOnce for move
| 97 | +semantics.
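
The mapping between borrow kind and closure trait can be checked from ordinary user code. The following is a minimal sketch rather than anything taken from rustc itself: the helper functions `call_fn`, `call_fn_mut` and `call_fn_once` and the function `demo` are hypothetical names introduced here only to force a particular trait bound on each closure.

```rust
// Hypothetical helpers: each one only exists to require a specific closure trait.
fn call_fn(f: impl Fn() -> i32) -> i32 {
    f()
}

fn call_fn_mut(mut f: impl FnMut() -> i32) -> i32 {
    f()
}

fn call_fn_once(f: impl FnOnce() -> usize) -> usize {
    f()
}

fn demo() -> (i32, i32, usize) {
    let x = 10;
    // Only reads x, so an immutable borrow is enough and the closure is Fn.
    let read = call_fn(|| x + 1);

    let mut y = 10;
    // Mutates y, so the capture becomes a mutable borrow and the closure is FnMut.
    let mutated = call_fn_mut(|| {
        y += 10;
        y
    });

    let v = vec![21];
    // Moves v inside the body, so v is captured by value and the closure is only FnOnce.
    let moved = call_fn_once(|| {
        let owned = v;
        owned.len()
    });

    (read, mutated, moved)
}

fn main() {
    println!("{:?}", demo());
}
```

Passing the second or third closure to `call_fn` instead should be rejected by the compiler, since those closures do not implement `Fn`.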
| 98 | + |
| 99 | +Most of the code related to the closure is in the src/librustc_typeck/check/upvar.rs file and the |
| 100 | +data structures are declared in the file src/librustc/ty/mod.rs. |
| 101 | + |
| 102 | +Before we go any further, let's discuss how we can examine the flow of control through the rustc
| 103 | +codebase. For the closure part specifically, I would set RUST_LOG as shown below and collect the
| 104 | +output in a file:
| 105 | + |
| 106 | +``` |
| 107 | +RUST_LOG=rustc_typeck::check::upvar rustc +stage1 -Zdump-mir=all \
| 108 | +    <.rs file to compile> 2> <file where the output will be dumped>
| 109 | +``` |
| 110 | + |
| 111 | +This uses the stage1 compiler. |
| 112 | + |
| 113 | +The other option is to step through the code using lldb or gdb. |
| 114 | + |
| 115 | +``` |
| 116 | +1. rust-lldb build/x86_64-apple-darwin/stage1/bin/rustc test.rs |
| 117 | +2. b upvar.rs:134 // Setting the breakpoint on a certain line in the upvar.rs file |
| 118 | +3. r // Run the program until it hits the breakpoint |
| 119 | +``` |
| 120 | + |
| 121 | +Let's start with the file `upvar.rs`. This file uses the euv::ExprUseVisitor, which
| 122 | +walks the source of the closure and invokes a callback for each upvar that is borrowed, mutated or
| 123 | +moved.
| 124 | + |
| 125 | +```rust |
| 126 | +fn main() {
| 127 | +    let mut x = vec![21]; // `mut` is needed because the closure mutates x.
| 128 | +    let _cl = || {
| 129 | +        let y = x[0]; // 1.
| 130 | +        x[0] += 1; // 2.
| 131 | +    };
| 132 | +}
| 133 | +``` |
| 134 | + |
| 135 | +In the above example, our visitor will be called twice, for the lines marked 1 and 2, once for a
| 136 | +shared borrow and once for a mutable borrow. It will also tell us what was borrowed. The
| 137 | +callbacks are invoked on the delegate. The delegate is of type `struct InferBorrowKind`, which has a
| 138 | +few fields, but the one we are interested in is `adjust_upvar_captures`, which is of type
| 139 | +`FxHashMap<UpvarId, UpvarCapture<'tcx>>` and tells us, for each upvar, which mode of borrow we
| 140 | +required. The modes of borrow can be ByValue (moved) or ByRef (borrowed), and for ByRef borrows, it
| 141 | +can be one of shared, shallow, unique or mut, as defined in `src/librustc/mir/mod.rs`.
| 142 | + |
| 143 | +The callbacks are the method implementations of the euv::Delegate trait for InferBorrowKind.
| 144 | +The **consume** callback is for a *move* of a variable, the **borrow** callback is for a *borrow* of some
| 145 | +kind (shared or mutable), and the **mutate** callback is for an *assignment* to something. We will see that
| 146 | +all these callbacks have a common argument *cmt*, which stands for Category, Mutability and Type and
| 147 | +is defined in *src/librustc/middle/mem_categorization.rs*. Borrowing from the code comments, *cmt* is
| 148 | +"a complete categorization of a value indicating where it originated and how it is located, as well
| 149 | +as the mutability of the memory in which the value is stored". Based on the callback (consume,
| 150 | +borrow etc.), we will call the relevant *adjust_upvar_borrow_kind_for_<something>* and pass the cmt
| 151 | +along. Once the borrow type is adjusted, we store it in the table, which basically says that for this
| 152 | +closure, this set of borrows was made.
| 153 | + |
| 154 | +```rust,ignore
| 155 | +self.tables
| 156 | +    .borrow_mut()
| 157 | +    .upvar_capture_map
| 158 | +    .extend(delegate.adjust_upvar_captures);
| 159 | +``` |
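
To make the flow of callbacks, adjustment and final table update more concrete, here is a deliberately simplified toy model. It is not rustc's code: `CaptureKind`, `ToyInferBorrowKind`, `adjust` and `infer_example` are hypothetical names, upvars are identified by plain strings rather than `UpvarId`, and only three capture kinds are modeled. The one assumption it encodes is the rule described above: start from an immutable borrow and only ever move towards a mutable borrow or a move.

```rust
use std::collections::HashMap;

// Hypothetical, simplified capture kinds. The derived ordering gives
// ImmBorrow < MutBorrow < ByValue, matching the direction of adjustment.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum CaptureKind {
    ImmBorrow,
    MutBorrow,
    ByValue,
}

// Toy stand-in for the delegate: one entry per upvar, keyed by name.
#[derive(Default)]
struct ToyInferBorrowKind {
    adjust_upvar_captures: HashMap<&'static str, CaptureKind>,
}

impl ToyInferBorrowKind {
    // Keep the strongest capture kind requested so far for this upvar.
    fn adjust(&mut self, upvar: &'static str, needed: CaptureKind) {
        let entry = self
            .adjust_upvar_captures
            .entry(upvar)
            .or_insert(CaptureKind::ImmBorrow);
        *entry = (*entry).max(needed);
    }

    // Called for a move of the upvar.
    fn consume(&mut self, upvar: &'static str) {
        self.adjust(upvar, CaptureKind::ByValue);
    }

    // Called for a shared or mutable borrow of the upvar.
    fn borrow(&mut self, upvar: &'static str, mutable: bool) {
        let kind = if mutable {
            CaptureKind::MutBorrow
        } else {
            CaptureKind::ImmBorrow
        };
        self.adjust(upvar, kind);
    }
}

fn infer_example() -> HashMap<&'static str, CaptureKind> {
    let mut delegate = ToyInferBorrowKind::default();
    delegate.borrow("x", false); // models `let y = x[0];`
    delegate.borrow("x", true); // models `x[0] += 1;`
    delegate.consume("v"); // models a hypothetical `drop(v);`
    delegate.borrow("v", false); // a later, weaker use does not downgrade the capture

    // Analogous to extending upvar_capture_map with the delegate's results.
    let mut upvar_capture_map = HashMap::new();
    upvar_capture_map.extend(delegate.adjust_upvar_captures);
    upvar_capture_map
}

fn main() {
    println!("{:?}", infer_example());
}
```

Using an ordered enum and `max` keeps the adjustment monotonic, so the order in which the visitor reports the uses does not affect the final capture kind in this sketch.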