Commit c9a1c37

Added fragments.rs: compute drop obligations remaining post moves.
Includes differentiation between assigned_fragments and moved_fragments, support for all-but-one array fragments, and instrumentation to print out the moved/assigned/unmoved/parents for each function, factored out into a separate submodule.
1 parent 21fe017 commit c9a1c37

9 files changed

+924
-8
lines changed

src/librustc/middle/borrowck/doc.rs

Lines changed: 170 additions & 0 deletions
@@ -27,6 +27,7 @@ These docs are long. Search for the section you are interested in.
2727
- Formal model
2828
- Borrowing and loans
2929
- Moves and initialization
30+
- Drop flags and structural fragments
3031
- Future work
3132
3233
# Overview
@@ -1019,6 +1020,175 @@ walk back over, identify all uses, assignments, and captures, and
10191020
check that they are legal given the set of dataflow bits we have
10201021
computed for that program point.
10211022
1023+
# Drop flags and structural fragments
1024+
1025+
In addition to the job of enforcing memory safety, the borrow checker
1026+
code is also responsible for identifying the *structural fragments* of
1027+
data in the function, to support out-of-band dynamic drop flags
1028+
allocated on the stack. (For background, see [RFC PR #320].)
1029+
1030+
[RFC PR #320]: https://github.com/rust-lang/rfcs/pull/320
1031+
1032+
Semantically, each piece of data that has a destructor may need a
1033+
boolean flag to indicate whether or not its destructor has been run
1034+
yet. However, in many cases there is no need to actually maintain such
1035+
a flag: It can be apparent from the code itself that a given path is
1036+
always initialized (or always deinitialized) when control reaches the
1037+
end of its owner's scope, and thus we can unconditionally emit (or
1038+
not) the destructor invocation for that path.
1039+
1040+
A simple example of this is the following:
1041+
1042+
```rust
1043+
struct D { p: int }
1044+
impl D { fn new(x: int) -> D { ... } }
1045+
impl Drop for D { ... }
1046+
1047+
fn foo(a: D, b: D, t: || -> bool) {
1048+
let c: D;
1049+
let d: D;
1050+
if t() { c = b; }
1051+
}
1052+
```
1053+
1054+
At the end of the body of `foo`, the compiler knows that `a` is
1055+
initialized, introducing a drop obligation (running the destructor of `D`)
1056+
for the end of `a`'s scope that is run unconditionally.
1057+
Likewise the compiler knows that `d` is not initialized, and thus it
1058+
leaves out the drop code for `d`.
1059+
1060+
The compiler cannot statically know the drop-state of `b` or `c` at
1061+
the end of their scope, since that depends on the value of
1062+
`t`. Therefore, we need to insert boolean flags to track whether we
1063+
need to drop `b` and `c`.
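The first example can be approximated in present-day Rust syntax, as a rough sketch rather than the code this document was written against. The `name` and `log` fields, the `Log` alias, and the `run` helper are hypothetical additions used only to observe which destructors run; the closure type `impl Fn() -> bool` stands in for the older `|| -> bool`.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared log of destructor runs; a hypothetical helper for this sketch only.
type Log = Rc<RefCell<Vec<&'static str>>>;

struct D {
    name: &'static str,
    log: Log,
}

impl Drop for D {
    fn drop(&mut self) {
        // Record which value was dropped.
        self.log.borrow_mut().push(self.name);
    }
}

#[allow(unused)]
fn foo(a: D, b: D, t: impl Fn() -> bool) {
    let c: D;
    let d: D; // never initialized, so no drop code is needed for `d`
    if t() {
        c = b; // `b` is moved into `c` only on this branch
    }
    // Scope ends here: `a` is always dropped; the value that started in `b`
    // is dropped exactly once, through whichever of `b` or `c` holds it.
}

// Runs `foo` with a fixed answer for `t` and returns the sorted drop log.
fn run(answer: bool) -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    foo(
        D { name: "a", log: log.clone() },
        D { name: "b", log: log.clone() },
        move || answer,
    );
    let mut seen = log.borrow().clone();
    seen.sort();
    seen
}
```

Calling `run(true)` and `run(false)` should both report that the values named `a` and `b` were each dropped exactly once, which is the property the dynamic flags for `b` and `c` exist to guarantee.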
1064+
1065+
However, the matter is not as simple as just mapping local variables
1066+
to their corresponding drop flags when necessary. In particular, in
1067+
addition to being able to move data out of local variables, Rust
1068+
allows one to move values in and out of structured data.
1069+
1070+
Consider the following:
1071+
1072+
```rust
1073+
struct S { x: D, y: D, z: D }
1074+
1075+
fn foo(a: S, mut b: S, t: || -> bool) {
1076+
let mut c: S;
1077+
let d: S;
1078+
let e: S = a.clone();
1079+
if t() {
1080+
c = b;
1081+
b.x = e.y;
1082+
}
1083+
if t() { c.y = D::new(4); }
1084+
}
1085+
```
1086+
1087+
As before, the drop obligations of `a` and `d` can be statically
1088+
determined, and again the state of `b` and `c` depends on dynamic
1089+
state. But additionally, the dynamic drop obligations introduced by
1090+
`b` and `c` are not just per-local boolean flags. For example, if the
1091+
first call to `t` returns `false` and the second call `true`, then at
1092+
the end of their scope, `b` will be completely initialized, but only
1093+
`c.y` in `c` will be initialized. If both calls to `t` return `true`,
1094+
then at the end of their scope, `c` will be completely initialized,
1095+
but only `b.x` will be initialized in `b`, and only `e.x` and `e.z`
1096+
will be initialized in `e`.
1097+
1098+
Note that we need to cover the `z` field in each case in some way,
1099+
since it may (or may not) need to be dropped, even though `z` is never
1100+
directly mentioned in the body of the `foo` function. We call a path
1101+
like `b.z` a *fragment sibling* of `b.x`, since the field `z` comes
1102+
from the same structure `S` that declared the field `x` in `b.x`.
1103+
1104+
In general we need to maintain boolean flags that match the
1105+
`S`-structure of both `b` and `c`. In addition, we need to consult
1106+
such a flag when doing an assignment (such as `c.y = D::new(4);`
1107+
above), in order to know whether or not there is a previous value that
1108+
needs to be dropped before we do the assignment.
1109+
1110+
So for any given function, we need to determine what flags are needed
1111+
to track its drop obligations. Our strategy for determining the set of
1112+
flags is to represent the fragmentation of the structure explicitly:
1113+
by starting initially from the paths that are explicitly mentioned in
1114+
moves and assignments (such as `b.x` and `c.y` above), and then
1115+
traversing the structure of the path's type to identify leftover
1116+
*unmoved fragments*: assigning into `c.y` means that `c.x` and `c.z`
1117+
are leftover unmoved fragments. Each fragment represents a drop
1118+
obligation that may need to be tracked. Paths that are only moved or
1119+
assigned in their entirety (like `a` and `d`) are treated as a single
1120+
drop obligation.
1121+
1122+
The fragment construction process works by piggy-backing on the
1123+
existing `move_data` module. We already have callbacks that visit each
1124+
direct move and assignment; these form the basis for the sets of
1125+
`moved_leaf_paths` and `assigned_leaf_paths`. From these leaves, we can
1126+
walk up their parent chain to identify all of their parent paths.
1127+
We need to identify the parents because of cases like the following:
1128+
1129+
```rust
1130+
struct Pair<X,Y>{ x: X, y: Y }
1131+
fn foo(dd_d_d: Pair<Pair<Pair<D, D>, D>, D>) {
1132+
other_function(dd_d_d.x.y);
1133+
}
1134+
```
1135+
1136+
In this code, the move of the path `dd_d_d.x.y` leaves behind not only
1137+
the fragment drop-obligation `dd_d_d.x.x` but also `dd_d_d.y`.
1138+
1139+
Once we have identified the directly-referenced leaves and their
1140+
parents, we compute the left-over fragments, in the function
1141+
`fragments::add_fragment_siblings`. As of this writing this works by
1142+
looking at each directly-moved or assigned path P, and blindly
1143+
gathering all sibling fields of P (as well as siblings for the parents
1144+
of P, etc). After accumulating all such siblings, we filter out the
1145+
entries added as siblings of P that turned out to be
1146+
directly-referenced paths (or parents of directly referenced paths)
1147+
themselves, thus leaving the never-referenced "left-overs" as the only
1148+
thing left from the gathering step.
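The gather-then-filter step described above can be sketched in present-day Rust on a toy model. This is not the code of `fragments::add_fragment_siblings`: the `Path` alias, the `fields_of` map (from each struct-typed path to its field names), and the `pair_example` helper are all assumptions made for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

// A path such as `dd_d_d.x.y`, modeled as ["dd_d_d", "x", "y"] (toy model).
type Path = Vec<&'static str>;

// Returns the never-referenced sibling paths left over after the moves
// and assignments listed in `referenced`.
fn leftover_fragments(
    referenced: &[Path],
    fields_of: &BTreeMap<Path, Vec<&'static str>>,
) -> BTreeSet<Path> {
    // Step 1: the referenced leaves together with all of their parents.
    let mut touched: BTreeSet<Path> = BTreeSet::new();
    for leaf in referenced {
        for len in 1..=leaf.len() {
            touched.insert(leaf[..len].to_vec());
        }
    }

    // Step 2: blindly gather every sibling field of every touched path.
    let mut gathered: BTreeSet<Path> = BTreeSet::new();
    for path in &touched {
        if path.len() < 2 {
            continue; // a bare local variable has no sibling fields
        }
        let parent: Path = path[..path.len() - 1].to_vec();
        if let Some(fields) = fields_of.get(&parent) {
            for field in fields {
                let mut sibling = parent.clone();
                sibling.push(*field);
                gathered.insert(sibling);
            }
        }
    }

    // Step 3: filter out anything that was itself referenced (or is a parent
    // of something referenced), leaving only the left-overs.
    gathered.difference(&touched).cloned().collect()
}

// The `Pair<Pair<Pair<D, D>, D>, D>` example: only `dd_d_d.x.y` is moved.
fn pair_example() -> Vec<Path> {
    let mut fields_of = BTreeMap::new();
    fields_of.insert(vec!["dd_d_d"], vec!["x", "y"]);
    fields_of.insert(vec!["dd_d_d", "x"], vec!["x", "y"]);
    fields_of.insert(vec!["dd_d_d", "x", "x"], vec!["x", "y"]);
    let moved = vec![vec!["dd_d_d", "x", "y"]];
    leftover_fragments(&moved, &fields_of).into_iter().collect()
}
```

On the `Pair` example, the sketch yields the two left-over fragments `dd_d_d.x.x` and `dd_d_d.y`, matching the discussion of why parents must be walked.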
1149+
1150+
## Array structural fragments
1151+
1152+
A special case of the structural fragments discussed above are
1153+
the elements of an array that has been passed by value, such as
1154+
the following:
1155+
1156+
```rust
1157+
fn foo(a: [D, ..10], i: uint) -> D {
1158+
a[i]
1159+
}
1160+
```
1161+
1162+
The above code moves a single element out of the input array `a`.
1163+
The remainder of the array still needs to be dropped; i.e., it
1164+
is a structural fragment. Note that after performing such a move,
1165+
it is not legal to read from the array `a`. There are a number of
1166+
ways to deal with this, but the important thing to note is that
1167+
the semantics needs to distinguish in some manner between a
1168+
fragment that is the *entire* array versus a fragment that represents
1169+
all-but-one element of the array. A place where that distinction
1170+
would arise is the following:
1171+
1172+
```rust
1173+
fn foo(a: [D, ..10], b: [D, ..10], i: uint, t: bool) -> D {
1174+
if t {
1175+
a[i]
1176+
} else {
1177+
b[i]
1178+
}
1179+
1180+
// When control exits, we will need either to drop all of `a`
1181+
// and all-but-one of `b`, or to drop all of `b` and all-but-one
1182+
// of `a`.
1183+
}
1184+
```
1185+
1186+
There are a number of ways that the trans backend could choose to
1187+
compile this (e.g. a `[bool, ..10]` array for each such moved array;
1188+
or an `Option<uint>` for each moved array). From the viewpoint of the
1189+
borrow-checker, the important thing is to record what kind of fragment
1190+
is implied by the relevant moves.
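As a rough illustration of the second strategy mentioned above, the following present-day Rust sketch models the runtime flag for one array as an `Option<usize>` and computes which element indices would still need dropping. The function name `elements_to_drop` is hypothetical, and nothing here reflects what the trans backend actually emits.

```rust
// Runtime flag for one array under the `Option<usize>` strategy:
// `None` means nothing was moved out (drop the entire array);
// `Some(i)` means element `i` was moved out (drop all-but-one).
fn elements_to_drop(len: usize, moved: Option<usize>) -> Vec<usize> {
    (0..len).filter(|i| Some(*i) != moved).collect()
}
```

For instance, `elements_to_drop(4, Some(2))` skips only index 2, while `elements_to_drop(3, None)` covers every index, which is the entire-array versus all-but-one distinction the borrow checker has to record.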
1191+
10221192
# Future work
10231193
10241194
While writing up these docs, I encountered some rules I believe to be
