Commit 4ee4a2a

borrow from @[] vectors (cc #2797)
1 parent bb5e2ba commit 4ee4a2a

2 files changed

+224
-150
lines changed

src/rustc/middle/borrowck.rs

Lines changed: 213 additions & 145 deletions
@@ -1,149 +1,217 @@
11
/*!
2-
* # Borrow check
3-
*
4-
* This pass is in charge of enforcing *memory safety* and *purity*. As
5-
* memory safety is by far the more complex topic, I'll focus on that in
6-
* this description, but purity will be covered later on. In the context
7-
* of Rust, memory safety means three basic things:
8-
*
9-
* - no writes to immutable memory;
10-
* - all pointers point to non-freed memory;
11-
* - all pointers point to memory of the same type as the pointer.
12-
*
13-
* The last point might seem confusing: after all, for the most part,
14-
* this condition is guaranteed by the type check. However, there are
15-
* two cases where the type check effectively delegates to borrow check.
16-
*
17-
* The first case has to do with enums. If there is a pointer to the
18-
* interior of an enum, and the enum is in a mutable location (such as a
19-
* local variable or field declared to be mutable), it is possible that
20-
* the user will overwrite the enum with a new value of a different
21-
* variant, and thus effectively change the type of the memory that the
22-
* pointer is pointing at.
23-
*
24-
* The second case has to do with mutability. Basically, the type
25-
* checker has only a limited understanding of mutability. It will allow
26-
* (for example) the user to get an immutable pointer with the address of
27-
* a mutable local variable. It will also allow a `@mut T` or `~mut T`
28-
* pointer to be borrowed as a `&r.T` pointer. These seeming oversights
29-
* are in fact intentional; they allow the user to temporarily treat a
30-
* mutable value as immutable. It is up to the borrow check to guarantee
31-
* that the value in question is not in fact mutated during the lifetime
32-
* `r` of the reference.
33-
*
34-
* # Summary of the safety check
35-
*
36-
* In order to enforce mutability, the borrow check has three tricks up
37-
* its sleeve.
38-
*
39-
* First, data which is uniquely tied to the current stack frame (that'll
40-
* be defined shortly) is tracked very precisely. This means that, for
41-
* example, if an immutable pointer to a mutable local variable is
42-
* created, the borrowck will simply check for assignments to that
43-
* particular local variable: no other memory is affected.
44-
*
45-
* Second, if the data is not uniquely tied to the stack frame, it may
46-
* still be possible to ensure its validity by rooting garbage collected
47-
* pointers at runtime. For example, if there is a mutable local
48-
* variable `x` of type `@T`, and its contents are borrowed with an
49-
* expression like `&*x`, then the value of `x` will be rooted (today,
50-
* that means its ref count will be temporarily increased) for the lifetime
51-
* of the reference that is created. This means that the pointer remains
52-
* valid even if `x` is reassigned.
53-
*
54-
* Finally, if neither of these two solutions are applicable, then we
55-
* require that all operations within the scope of the reference be
56-
* *pure*. A pure operation is effectively one that does not write to
57-
* any aliasable memory. This means that it is still possible to write
58-
* to local variables or other data that is uniquely tied to the stack
59-
* frame (there's that term again; formal definition still pending) but
60-
* not to data reached via a `&T` or `@T` pointer. Such writes could
61-
* possibly have the side-effect of causing the data which must remain
62-
* valid to be overwritten.
63-
*
64-
* # Possible future directions
65-
*
66-
* There are numerous ways that the `borrowck` could be strengthened, but
67-
* these are the two most likely:
68-
*
69-
* - flow-sensitivity: we do not currently consider flow at all but only
70-
* block-scoping. This means that innocent code like the following is
71-
* rejected:
72-
*
73-
* let mut x: int;
74-
* ...
75-
* x = 5;
76-
* let y: &int = &x; // immutable ptr created
77-
* ...
78-
*
79-
* The reason is that the scope of the pointer `y` is the entire
80-
* enclosing block, and the assignment `x = 5` occurs within that
81-
* block. The analysis is not smart enough to see that `x = 5` always
82-
* happens before the immutable pointer is created. This is relatively
83-
* easy to fix and will surely be fixed at some point.
84-
*
85-
* - finer-grained purity checks: currently, our fallback for
86-
* guaranteeing random references into mutable, aliasable memory is to
87-
* require *total purity*. This is rather strong. We could use local
88-
* type-based alias analysis to distinguish writes that could not
89-
* possibly invalidate the references which must be guaranteed. This
90-
* would only work within the function boundaries; function calls would
91-
* still require total purity. This seems less likely to be
92-
* implemented in the short term as it would make the code
93-
* significantly more complex; there is currently no code to analyze
94-
* the types and determine the possible impacts of a write.
95-
*
96-
* # Terminology
97-
*
98-
* A **loan** is .
99-
*
100-
* # How the code works
101-
*
102-
* The borrow check code is divided into several major modules, each of
103-
* which is documented in its own file.
104-
*
105-
* The `gather_loans` and `check_loans` are the two major passes of the
106-
* analysis. The `gather_loans` pass runs over the IR once to determine
107-
* what memory must remain valid and for how long. Its name is a bit of
108-
* a misnomer; it does in fact gather up the set of loans which are
109-
* granted, but it also determines when @T pointers must be rooted and
110-
* for which scopes purity must be required.
111-
*
112-
* The `check_loans` pass walks the IR and examines the loans and purity
113-
* requirements computed in `gather_loans`. It checks to ensure that (a)
114-
* the conditions of all loans are honored; (b) no contradictory loans
115-
* were granted (for example, loaning out the same memory as mutable and
116-
* immutable simultaneously); and (c) any purity requirements are
117-
* honored.
118-
*
119-
* The remaining modules are helper modules used by `gather_loans` and
120-
* `check_loans`:
121-
*
122-
* - `categorization` has the job of analyzing an expression to determine
123-
* what kind of memory is used in evaluating it (for example, where
124-
* dereferences occur and what kind of pointer is dereferenced; whether
125-
* the memory is mutable; etc)
126-
* - `loan` determines when data uniquely tied to the stack frame can be
127-
* loaned out.
128-
* - `preserve` determines what actions (if any) must be taken to preserve
129-
* aliasable data. This is the code which decides when to root
130-
* an @T pointer or to require purity.
131-
*
132-
* # Maps that are created
133-
*
134-
* Borrowck results in two maps.
135-
*
136-
* - `root_map`: identifies those expressions or patterns whose result
137-
* needs to be rooted. Conceptually the root_map maps from an
138-
* expression or pattern node to a `node_id` identifying the scope for
139-
* which the expression must be rooted (this `node_id` should identify
140-
* a block or call). The actual key to the map is not an expression id,
141-
* however, but a `root_map_key`, which combines an expression id with a
142-
* deref count and is used to cope with auto-deref.
143-
*
144-
* - `mutbl_map`: identifies those local variables which are modified or
145-
* moved. This is used by trans to guarantee that such variables are
146-
* given a memory location and not used as immediates.
2+
# Borrow check
3+
4+
This pass is in charge of enforcing *memory safety* and *purity*. As
5+
memory safety is by far the more complex topic, I'll focus on that in
6+
this description, but purity will be covered later on. In the context
7+
of Rust, memory safety means three basic things:
8+
9+
- no writes to immutable memory;
10+
- all pointers point to non-freed memory;
11+
- all pointers point to memory of the same type as the pointer.
12+
13+
The last point might seem confusing: after all, for the most part,
14+
this condition is guaranteed by the type check. However, there are
15+
two cases where the type check effectively delegates to borrow check.
16+
17+
The first case has to do with enums. If there is a pointer to the
18+
interior of an enum, and the enum is in a mutable location (such as a
19+
local variable or field declared to be mutable), it is possible that
20+
the user will overwrite the enum with a new value of a different
21+
variant, and thus effectively change the type of the memory that the
22+
pointer is pointing at.
23+
24+
The second case has to do with mutability. Basically, the type
25+
checker has only a limited understanding of mutability. It will allow
26+
(for example) the user to get an immutable pointer with the address of
27+
a mutable local variable. It will also allow a `@mut T` or `~mut T`
28+
pointer to be borrowed as a `&r.T` pointer. These seeming oversights
29+
are in fact intentional; they allow the user to temporarily treat a
30+
mutable value as immutable. It is up to the borrow check to guarantee
31+
that the value in question is not in fact mutated during the lifetime
32+
`r` of the reference.
33+
34+
# Definition of unstable memory
35+
36+
The primary danger to safety arises due to *unstable memory*.
37+
Unstable memory is memory whose validity or type may change as a
38+
result of an assignment, move, or a variable going out of scope.
39+
There are two cases in Rust where memory is unstable: the contents of
40+
unique boxes and enums.
41+
42+
Unique boxes are unstable because when the variable containing the
43+
unique box is re-assigned, moves, or goes out of scope, the unique box
44+
is freed or---in the case of a move---potentially given to another
45+
task. In either case, if there is an extant and usable pointer into
46+
the box, then safety guarantees would be compromised.
47+
48+
Enum values are unstable because when they are reassigned, the types of
49+
their contents may change if they are assigned with a different
50+
variant than they had previously.
51+
52+
# Safety criteria that must be enforced
53+
54+
Whenever a piece of memory is borrowed for lifetime L, there are two
55+
things which the borrow checker must guarantee. First, it must
56+
guarantee that the memory address will remain allocated (and owned by
57+
the current task) for the entirety of the lifetime L. Second, it must
58+
guarantee that the type of the data will not change for the entirety
59+
of the lifetime L. In exchange, the region-based type system will
60+
guarantee that the pointer is not used outside the lifetime L. These
61+
guarantees are to some extent independent but are also inter-related.
62+
63+
In some cases, the type of a pointer cannot be invalidated but the
64+
lifetime can. For example, imagine a pointer to the interior of
65+
a shared box like:
66+
67+
let mut x = @mut {f: 5, g: 6};
68+
let y = &mut x.f;
69+
70+
Here, a pointer was created to the interior of a shared box which
71+
contains a record. Suppose `*x` were to be mutated like so:
72+
73+
*x = {f: 6, g: 7};
74+
75+
This would cause `*y` to change from 5 to 6, but the pointer
76+
`y` remains valid. It still points at an integer even if that integer
77+
has been overwritten.
78+
79+
However, if we were to reassign `x` itself, like so:
80+
81+
x = @{f: 6, g: 7};
82+
83+
This could potentially invalidate `y`, because if `x` were the final
84+
reference to the shared box, then that memory would be released and
85+
now `y` points at freed memory. (We will see that to prevent this
86+
scenario we will *root* shared boxes that reside in mutable memory
87+
whose contents are borrowed; rooting means that we create a temporary
88+
to ensure that the box is not collected).
89+
90+
In other cases, like an enum on the stack, the memory cannot be freed
91+
but its type can change:
92+
93+
let mut x = some(5);
94+
alt x {
95+
some(ref y) => { ... }
96+
none => { ... }
97+
}
98+
99+
Here as before, the pointer `y` would be invalidated if we were to
100+
reassign `x` to `none`. (We will see that this case is prevented
101+
because borrowck tracks data which resides on the stack and prevents
102+
variables from being reassigned if there may be pointers to their interior.)
103+
104+
Finally, in some cases, both dangers can arise. For example, something
105+
like the following:
106+
107+
let mut x = ~some(5);
108+
alt x {
109+
~some(ref y) => { ... }
110+
~none => { ... }
111+
}
112+
113+
In this case, if `x` were to be reassigned or `*x` were to be mutated, then
114+
the pointer `y` would be invalidated. (This case is also prevented by
115+
borrowck tracking data which is owned by the current stack frame)
116+
117+
# Summary of the safety check
118+
119+
In order to enforce mutability, the borrow check has a few tricks up
120+
its sleeve:
121+
122+
- When data is owned by the current stack frame, we can identify every
123+
possible assignment to a local variable and simply prevent
124+
potentially dangerous assignments directly.
125+
126+
- If data is owned by a shared box, we can root the box to increase
127+
its lifetime.
128+
129+
- If data is found within a borrowed pointer, we can assume that the
130+
data will remain live for the entirety of the borrowed pointer.
131+
132+
- We can rely on the fact that pure actions (such as calling pure
133+
functions) do not mutate data which is not owned by the current
134+
stack frame.
135+
136+
# Possible future directions
137+
138+
There are numerous ways that the `borrowck` could be strengthened, but
139+
these are the two most likely:
140+
141+
- flow-sensitivity: we do not currently consider flow at all but only
142+
block-scoping. This means that innocent code like the following is
143+
rejected:
144+
145+
let mut x: int;
146+
...
147+
x = 5;
148+
let y: &int = &x; // immutable ptr created
149+
...
150+
151+
The reason is that the scope of the pointer `y` is the entire
152+
enclosing block, and the assignment `x = 5` occurs within that
153+
block. The analysis is not smart enough to see that `x = 5` always
154+
happens before the immutable pointer is created. This is relatively
155+
easy to fix and will surely be fixed at some point.
156+
157+
- finer-grained purity checks: currently, our fallback for
158+
guaranteeing random references into mutable, aliasable memory is to
159+
require *total purity*. This is rather strong. We could use local
160+
type-based alias analysis to distinguish writes that could not
161+
possibly invalidate the references which must be guaranteed. This
162+
would only work within the function boundaries; function calls would
163+
still require total purity. This seems less likely to be
164+
implemented in the short term as it would make the code
165+
significantly more complex; there is currently no code to analyze
166+
the types and determine the possible impacts of a write.
167+
168+
# How the code works
169+
170+
The borrow check code is divided into several major modules, each of
171+
which is documented in its own file.
172+
173+
The `gather_loans` and `check_loans` are the two major passes of the
174+
analysis. The `gather_loans` pass runs over the IR once to determine
175+
what memory must remain valid and for how long. Its name is a bit of
176+
a misnomer; it does in fact gather up the set of loans which are
177+
granted, but it also determines when @T pointers must be rooted and
178+
for which scopes purity must be required.
179+
180+
The `check_loans` pass walks the IR and examines the loans and purity
181+
requirements computed in `gather_loans`. It checks to ensure that (a)
182+
the conditions of all loans are honored; (b) no contradictory loans
183+
were granted (for example, loaning out the same memory as mutable and
184+
immutable simultaneously); and (c) any purity requirements are
185+
honored.
186+
187+
The remaining modules are helper modules used by `gather_loans` and
188+
`check_loans`:
189+
190+
- `categorization` has the job of analyzing an expression to determine
191+
what kind of memory is used in evaluating it (for example, where
192+
dereferences occur and what kind of pointer is dereferenced; whether
193+
the memory is mutable; etc)
194+
- `loan` determines when data uniquely tied to the stack frame can be
195+
loaned out.
196+
- `preserve` determines what actions (if any) must be taken to preserve
197+
aliasable data. This is the code which decides when to root
198+
an @T pointer or to require purity.
199+
200+
# Maps that are created
201+
202+
Borrowck results in two maps.
203+
204+
- `root_map`: identifies those expressions or patterns whose result
205+
needs to be rooted. Conceptually the root_map maps from an
206+
expression or pattern node to a `node_id` identifying the scope for
207+
which the expression must be rooted (this `node_id` should identify
208+
a block or call). The actual key to the map is not an expression id,
209+
however, but a `root_map_key`, which combines an expression id with a
210+
deref count and is used to cope with auto-deref.
211+
212+
- `mutbl_map`: identifies those local variables which are modified or
213+
moved. This is used by trans to guarantee that such variables are
214+
given a memory location and not used as immediates.
147215
*/
148216

149217
import syntax::ast;
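
Note on the enum case described in the new doc comment: the following is a minimal sketch in present-day Rust, not the 2012 dialect used in this commit. It assumes `Option<i32>` as a stand-in for the `some(5)` / `none` enum, and the function name `read_then_reset` is hypothetical. It shows a pointer into the interior of an enum and a reassignment that is only accepted once that pointer is no longer in use.

```rust
/// Reads the interior of `x` through a borrowed pointer, then changes the
/// variant only after that borrow has ended.
fn read_then_reset(x: &mut Option<i32>) -> i32 {
    let seen = match x {
        Some(y) => {
            // `y` points into the interior of `*x`. Writing `*x = None;`
            // before the last use of `y` would change the variant underneath
            // it, so the compiler rejects that ordering.
            *y
        }
        None => 0,
    };
    // `y` is no longer in use, so overwriting the enum is safe here.
    *x = None;
    seen
}

fn main() {
    let mut x = Some(5);
    let seen = read_then_reset(&mut x);
    println!("saw {seen}, x is now {x:?}");
}
```

The design point matches the doc comment: the memory holding `x` is never freed, but its type (variant) can change, so the reassignment has to be kept away from live interior pointers.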

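Note on rooting: the doc comment says that a shared box in mutable memory is *rooted* by creating a temporary that keeps the box from being collected. A rough present-day analogue, assuming `std::rc::Rc` as a stand-in for `@T` and using hypothetical names (`Record`, `read_f_across_reassignment`), might look like the sketch below. It is an illustration of the idea only, not what the compiler emits.

```rust
use std::rc::Rc;

struct Record {
    f: i32,
    g: i32,
}

/// Reassigns `x` while an extra reference (the "root") keeps the old box alive.
fn read_f_across_reassignment(x: &mut Rc<Record>) -> i32 {
    // The root: a temporary extra reference, which raises the ref count.
    let root = Rc::clone(&*x);
    // Borrow the interior through the root rather than through `x`.
    let y: &i32 = &root.f;
    // `x` gives up its reference to the old box, but the root still holds one,
    // so `y` keeps pointing at allocated memory.
    *x = Rc::new(Record { f: 6, g: 7 });
    *y
}

fn main() {
    let mut x = Rc::new(Record { f: 5, g: 6 });
    let old_f = read_f_across_reassignment(&mut x);
    println!("old f = {old_f}, new f = {}, new g = {}", x.f, x.g);
}
```

Without the root, reassigning `x` could drop the last reference to the old box while `y` still pointed into it, which is the scenario the doc comment warns about.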
src/rustc/middle/borrowck/categorization.rs

Lines changed: 11 additions & 5 deletions
@@ -361,12 +361,18 @@ impl public_methods for borrowck_ctxt {

     ret alt deref_kind(self.tcx, base_cmt.ty) {
       deref_ptr(ptr) {
-        // make deref of vectors explicit, as explained in the comment at
-        // the head of this section
-        let deref_lp = base_cmt.lp.map(|lp| @lp_deref(lp, ptr) );
+        // (a) the contents are loanable if the base is loanable
+        // and this is a *unique* vector
+        let deref_lp = alt ptr {
+          uniq_ptr => {base_cmt.lp.map(|lp| @lp_deref(lp, uniq_ptr))}
+          _ => {none}
+        };
+
+        // (b) the deref is explicit in the resulting cmt
         let deref_cmt = @{id:expr.id, span:expr.span,
-                          cat:cat_deref(base_cmt, 0u, ptr), lp:deref_lp,
-                          mutbl:m_imm, ty:mt.ty};
+                          cat:cat_deref(base_cmt, 0u, ptr), lp:deref_lp,
+                          mutbl:m_imm, ty:mt.ty};
+
         comp(expr, deref_cmt, base_cmt.ty, mt)
       }

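Note on the hunk above: the change makes the dereferenced contents loanable only when the base is loanable and the pointer is unique; every other pointer kind gets no loan path. A minimal sketch of that rule in present-day Rust follows. The types `PtrKind` and `LoanPath`, the variants `Shared` and `Borrowed`, and the function `deref_loan_path` are hypothetical stand-ins (the diff itself only shows `uniq_ptr`, `lp_deref`, and a wildcard arm).

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PtrKind {
    Uniq,     // stands in for `uniq_ptr`
    Shared,   // assumed: a garbage-collected `@` pointer
    Borrowed, // assumed: a region (`&`) pointer
}

#[derive(Clone, Debug, PartialEq, Eq)]
enum LoanPath {
    Local(u32),                    // assumed: a local variable id
    Deref(Box<LoanPath>, PtrKind), // stands in for `lp_deref(lp, ptr)`
}

/// Mirrors step (a): contents are loanable only if the base is loanable
/// and the pointer being dereferenced is unique.
fn deref_loan_path(base: Option<LoanPath>, ptr: PtrKind) -> Option<LoanPath> {
    match ptr {
        PtrKind::Uniq => base.map(|lp| LoanPath::Deref(Box::new(lp), PtrKind::Uniq)),
        _ => None,
    }
}

fn main() {
    let base = Some(LoanPath::Local(1));
    for ptr in [PtrKind::Uniq, PtrKind::Shared, PtrKind::Borrowed] {
        println!("{ptr:?}: {:?}", deref_loan_path(base.clone(), ptr));
    }
}
```

Under this reading, a non-unique vector yields no loan path, so keeping its contents valid falls to the other mechanisms the doc comment lists (rooting or purity) rather than to a loan.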