|
1 | 1 | ## CBMC Assigns Clause
|
2 | 2 |
|
3 |
| - |
4 | 3 | ### Introduction
|
5 |
| -The _assigns_ clause allows the user to specify a list of memory |
6 |
| -locations which may be written within a particular scope. While an assigns |
7 |
| -clause may, in general, be interpreted with either _writes_ or _modifies_ |
8 |
| -semantics, this design is based on the former. This means that memory not |
9 |
| -captured by the assigns clause must not be written within the given scope, even |
10 |
| -if the value(s) therein are not modified. |
11 |
| - |
| 4 | +Here is described the syntax and semantics of the "assigns" clause as implemented in CBMC. |
| 5 | +This construct allows the user to specify a list of memory locations which may be written within a function. |
| 6 | +While an assigns clause may, in general, be interpreted with either "writes" or "modifies" semantics, this design is based on the former. |
| 7 | +This means that memory not captured by the assigns clause must not be written within the given scope, even if the value(s) are restored before the function returns. |
12 | 8 |
|
13 |
| -### Scalar Variables |
14 |
| -The initial iteration of this design focuses on specifying an assigns clause for |
15 |
| -primitive types and their pointers. Arrays, structured data, and pointers are |
16 |
| -left to future contributions. |
| 9 | +The latest iteration of this design focuses on specifying assigns clauses for simple variable types and their pointers, structs, and arrays. |
| 10 | +Recursive pointer structures are left to future work, as the current memory model does not allow them to be supported by the assigns clause. |
17 | 11 |
|
18 | 12 |
|
19 | 13 | ##### Syntax
|
20 |
| -A special construct is introduced to specify assigns clauses. Its syntax is |
21 |
| -defined as follows. |
| 14 | +A special construct is introduced to specify assigns clauses. At the time of writing, its syntax is defined as follows. |
22 | 15 |
|
23 | 16 | ```
|
24 | 17 | <assigns_clause> := __CPROVER_assigns ( <target_list> )
|
25 | 18 | ```
|
26 | 19 | ```
|
27 |
| - <target_list> := <target> |
28 |
| - | <target> , <target_list> |
29 |
| -``` |
| 20 | + <target_list> := __CPROVER_nothing |
| 21 | + | <target> |
| 22 | + | <target_list> , <target> |
30 | 23 | ```
|
31 |
| - <target> := <identifier> |
32 |
| - | * <target> |
33 | 24 | ```
|
| 25 | + <target> := <deref_target> |
| 26 | + | <target> [ array_index ] |
| 27 | + | <target> [ array_range ] |
34 | 28 |
|
| 29 | + <array_index> := <identifier> | <integer> |
| 30 | + <array_range> := <array_index> , <array_index> |
| 31 | +``` |
| 32 | +``` |
| 33 | + <deref_target> := <member_target> |
| 34 | + | * <deref_target> |
| 35 | +``` |
| 36 | +``` |
| 37 | + <member_target> := <primary_target> |
| 38 | + | <member_target> . <member_name> |
| 39 | + | <member_target> -> <member_name> |
| 40 | +``` |
| 41 | +``` |
| 42 | + <primary_target> := <identifier> |
| 43 | + | ( <target> ) |
| 44 | +``` |
35 | 45 |
|
36 |
| -The `<assigns_clause>` states that only the memory identified by the dereference |
37 |
| -expressions and identifiers listed in the contained `<target_list>` may be |
38 |
| -written within the associated scope, as follows. |
| 46 | +The assigns clause identifies a list of targets, which may be an identifier (e.g. `x`), a member of a struct (e.g. `myStruct.someMember` or `myStructPtr->otherMember`) , a dereferenced pointer (e.g. `*myPtr`), or an array range (e.g. `myArr[5, lastIdx]`). |
| 47 | +The `<assigns_clause>` states that only memory identified in the target list may be written within the function, as described in the next section. |
39 | 48 |
|
| 49 | +For example, consider the following declarations. |
| 50 | +``` |
| 51 | +struct pair |
| 52 | +{ |
| 53 | + int x; |
| 54 | + int y; |
| 55 | +}; |
| 56 | +
|
| 57 | +void foo(int *myInt1, int *myInt2, struct pair *myPair, int myArr[], int lastIdx) |
| 58 | +__CPROVER_assigns(*myInt1, myPair->y, myArr[5, lastIdx]); |
| 59 | +``` |
| 60 | +This code declares a struct type, called `pair` with 2 members `x` and `y`, as well as a function `foo` which takes two integer pointer arguments `myInt1` and `myInt2`, one pair struct pointer `myPair`, and an array argument `myArr` with an associated last index `lastIdx`. |
| 61 | +The assigns clause attached to this definition states that most of the arguments can be assigned. However `myInt2`, `myPair->x`, and the first four elements of `myArr` should not be assigned in the body of `foo`. |
40 | 62 |
|
41 | 63 | ##### Semantics
|
42 |
| -The semantics of an assigns clause *c* of some function *f* can be understood in |
43 |
| -two contexts. First, one may consider how the expressions in *c* are treated |
44 |
| -when a call to *f* is replaced by its contract specification, assuming the |
45 |
| -contract specification is a sound characterization of the behavior of *f*. |
46 |
| -Second, one may consider how *c* is applied to the function body of *f* in order |
47 |
| -to determine whether *c* is a sound characterization of the behavior of *f*. We |
48 |
| -begin by exploring these two perspectives for assigns clauses which contain only |
49 |
| -scalar expressions. |
50 |
| - |
51 |
| -Let the i<sup>th</sup> expression in some assigns clause *c* be denoted |
52 |
| -*exprs*<sub>*c*</sub>[i], the j<sup>th</sup> formal parameter of some function |
53 |
| -*f* be denoted *params*<sub>*f*</sub>[j], and the k<sup>th</sup> argument passed |
54 |
| -in a call to function *f* be denoted *args*<sub>*f*</sub>[k] (an identifying |
55 |
| -index may be added to distinguish a *particular* call to *f*, but for simplicity |
56 |
| -it is omitted here). |
| 64 | +The semantics of an assigns clause *c* of some function *f* can be understood in two contexts. |
| 65 | +First, one may consider how the expressions in *c* are treated when a call to *f* is replaced by its contract specification, assuming the contract specification is a sound characterization of the behavior of *f*. |
| 66 | +Second, one may consider how *c* is applied to the function body of *f* in order to determine whether *c* is a sound characterization of the behavior of *f*. |
| 67 | +To illustrate these two perspectives, consider the following definition of `foo`. |
| 68 | + |
| 69 | +``` |
| 70 | +void foo(int *myInt1, int *myInt2, struct pair *myPair, int myArr[], int lastIdx) |
| 71 | +{ |
| 72 | + *myInt1 = 34; |
| 73 | + myPair->y = 55; |
| 74 | + myArr[5] = 89; |
| 75 | +} |
| 76 | +``` |
57 | 77 |
|
| 78 | +We begin by exploring these two perspectives with a formal definition, and then an example. |
| 79 | + |
| 80 | +Let the i<sup>th</sup> expression in some assigns clause *c* be denoted *exprs*<sub>*c*</sub>[i], the j<sup>th</sup> formal parameter of some function *f* be denoted *params*<sub>*f*</sub>[j], and the k<sup>th</sup> argument passed in a call to function *f* be denoted *args*<sub>*f*</sub>[k] (an identifying index may be added to distinguish a *particular* call to *f*, but for simplicity it is omitted here). |
58 | 81 |
|
59 | 82 | ###### Replacement
|
60 |
| -Assuming an assigns clause *c* is a sound characterization of the behavior of |
61 |
| -the function *f*, a call to *f* may be replaced a series of non-deterministic |
62 |
| -assignments. For each expression *e* ∈ *exprs*<sub>*c*</sub>, let there be |
63 |
| -an assignment ɸ := ∗, where ɸ is *args*<sub>*f*</sub>[i] if *e* |
64 |
| -is identical to some *params*<sub>*f*</sub>[i], and *e* otherwise. |
| 83 | +Assuming an assigns clause *c* is a sound characterization of the behavior of the function *f*, a call to *f* may be replaced a series of non-deterministic assignments. For each expression *e* ∈ *exprs*<sub>*c*</sub>, let there be an assignment ɸ := ∗, where ɸ is *args*<sub>*f*</sub>[i] if *e* is identical to some *params*<sub>*f*</sub>[i], and *e* otherwise. |
| 84 | +For array ranges, create an assignment for each element in the array range. |
| 85 | + |
| 86 | +In the case where *f* is `foo`, and some function `bar` calls `foo` |
| 87 | +``` |
| 88 | +void bar() |
| 89 | +{ |
| 90 | + int a, b; |
| 91 | + int arr[8]; |
| 92 | + struct pair p; |
| 93 | +
|
| 94 | + foo(&a, &b, &p, arr, 7); |
| 95 | +} |
| 96 | +``` |
| 97 | +the call to `foo` is replaced by the following series of assignments. |
| 98 | +``` |
| 99 | +void bar() |
| 100 | +{ |
| 101 | + int a, b; |
| 102 | + int arr[10]; |
| 103 | + struct pair p; |
| 104 | +
|
| 105 | + a = *; |
| 106 | + p.y = *; |
| 107 | + arr[5] = *; |
| 108 | + arr[6] = *; |
| 109 | + arr[7] = *; |
| 110 | +} |
| 111 | +``` |
65 | 112 |
|
66 | 113 |
|
67 | 114 | ###### Enforcement
|
68 |
| -In order to determine whether an assigns clause *c* is a sound characterization |
69 |
| -of the behavior of a function *f*, the body of the function should be |
70 |
| -instrumented with additional statements as follows. |
71 |
| - |
72 |
| -- For each expression *e* ∈ *exprs*<sub>*c*</sub>, create a temporary |
73 |
| - variable *tmp*<sub>*e*</sub> to store \&(*e*), the address of *e*, at the |
74 |
| -start of *f*. |
75 |
| -- Before each assignment statement, *lhs* := *rhs*, add an assertion (structured |
76 |
| - as a disjunction) |
77 |
| -assert(∃ *tmp*<sub>*e*</sub>. \_\_CPROVER\_same\_object(\&(*lhs*), |
78 |
| -*tmp*<sub>*e*</sub>) |
79 |
| -- Before each function call with an assigns clause *a*, add an assertion for |
80 |
| - each *e*<sub>*a*</sub> ∈ *exprs*<sub>*a*</sub> (also formulated as a |
81 |
| -disjunction) |
82 |
| -assert(∃ *tmp*<sub>*e*</sub>. |
83 |
| -\_\_CPROVER\_same\_object(\&(*e*<sub>*a*</sub>), *tmp*<sub>*e*</sub>) |
84 |
| - |
85 |
| -Here, \_\_CPROVER\_same\_object returns true if two pointers |
86 |
| -reference the same object in the CBMC memory model. Additionally, for each |
87 |
| -function call without an assigns clause, CBMC adds an assertion assert(*false*) |
88 |
| -to ensure that the assigns contract will only hold if all called functions |
89 |
| -have an assigns clause which is compatible with that of the calling function. |
| 115 | +In order to determine whether an assigns clause *c* is a sound characterization of the behavior of a function *f*, |
| 116 | +the body of the function is instrumented with assertion statements before each statement which may write to memory (e.g. an assignment). |
| 117 | +These assertions are based on the assignable targets *exprs*<sub>*c*</sub> identified in the assigns clause. |
| 118 | + |
| 119 | +While sequentially traversing the function body, statements which may modify memory are instrumented with a preceeding assertion to ensure that they only modify memory assignable memory. |
| 120 | +These assertions consist of a disjunction where each disjunct depends on the type of the assignable target. |
| 121 | + |
| 122 | +- For an assignable target expression *e* of scalar type: |
| 123 | +``` |
| 124 | +__CPROVER_POINTER_OBJECT(&lhs) == __CPROVER_POINTER_OBJECT(&e) && |
| 125 | +__CPROVER_POINTER_OFFSET(&lhs) == __CPROVER_POINTER_OFFSET(&e) |
| 126 | +``` |
| 127 | + |
| 128 | +- For an assignable target expression *e* of struct type, a similar disjunct is created, with an additional size comparison: |
| 129 | +``` |
| 130 | +__CPROVER_POINTER_OBJECT(&lhs) == __CPROVER_POINTER_OBJECT(&e) && |
| 131 | +__CPROVER_POINTER_OFFSET(&lhs) == __CPROVER_POINTER_OFFSET(&e) && |
| 132 | +sizeof(lhs) == sizeof(e) |
| 133 | +``` |
| 134 | +Additionally, for each member m of *e*, we add an extra disjunct: |
| 135 | +``` |
| 136 | +__CPROVER_POINTER_OBJECT(&lhs) == __CPROVER_POINTER_OBJECT(&(e->m)) && |
| 137 | +__CPROVER_POINTER_OFFSET(&lhs) == __CPROVER_POINTER_OFFSET(&(e->m)) && |
| 138 | +sizeof(lhs) == sizeof(e->m) |
| 139 | +``` |
| 140 | +This extension is performed recursively in the case that some member of a struct is also a struct. |
| 141 | + |
| 142 | +- For an assignable target expression *e* of array range type, `a[lo, hi]`, a disjunct is created using the array pointer `a` and the range bounds `lo` and `hi`: |
| 143 | +``` |
| 144 | +__CPROVER_POINTER_OBJECT(&lhs) == __CPROVER_POINTER_OBJECT(a) && |
| 145 | +__CPROVER_POINTER_OFFSET(&lhs) >= lo * sizeof(a[0]) && |
| 146 | +__CPROVER_POINTER_OFFSET(&lhs) <= hi * sizeof(a[0]) |
| 147 | +``` |
| 148 | + |
| 149 | +This effectively creates an assertion that the memory location being assigned falls within the assignable memory identified by the assigns clause. |
| 150 | +In practice, temporary variables are used to ensure that the assignable memory doesn't change during function execution, but here we have used the assignable targets directly for readability. |
| 151 | +Function calls are handled similarly, based on the memory their assigns clause indicates that they can modify. |
| 152 | +Additionally, for each function call without an assigns clause, we add an assertion assert(*false*) to ensure that the assigns contract will only hold if all called functions have an assigns clause which is compatible with that of the calling function. |
| 153 | + |
| 154 | +For the earlier example, `foo` would be instrumented as follows: |
| 155 | + |
| 156 | +``` |
| 157 | +void foo(int *myInt1, int *myInt2, struct pair *myPair, int myArr[], int lastIdx) |
| 158 | +{ |
| 159 | + int *temp1 = myInt1; |
| 160 | + int *temp2 = &(myPair->y); |
| 161 | + int *temp3 = myArr; |
| 162 | + int temp4 = 5 * sizeof(iint); |
| 163 | + int temp5 = lastIdx * sizeof(int); |
| 164 | +
|
| 165 | + assert((__CPROVER_POINTER_OBJECT(myInt1) == __CPROVER_POINTER_OBJECT(temp1) && |
| 166 | + __CPROVER_POINTER_OFFSET(myInt1) == __CPROVER_POINTER_OFFSET(temp1)) || |
| 167 | + (__CPROVER_POINTER_OBJECT(myInt1) == __CPROVER_POINTER_OBJECT(temp2) && |
| 168 | + __CPROVER_POINTER_OFFSET(myInt1) == __CPROVER_POINTER_OFFSET(temp2)) || |
| 169 | + (__CPROVER_POINTER_OBJECT(myInt1) == __CPROVER_POINTER_OBJECT(temp3) && |
| 170 | + __CPROVER_POINTER_OFFSET(myInt1) >= temp4 && |
| 171 | + __CPROVER_POINTER_OFFSET(myInt1) <= temp5) |
| 172 | + ); |
| 173 | + *myInt1 = 34; |
| 174 | +
|
| 175 | + assert((__CPROVER_POINTER_OBJECT(&(myPair->y)) == __CPROVER_POINTER_OBJECT(temp1) && |
| 176 | + __CPROVER_POINTER_OFFSET(&(myPair->y)) == __CPROVER_POINTER_OFFSET(temp1)) || |
| 177 | + (__CPROVER_POINTER_OBJECT(&(myPair->y)) == __CPROVER_POINTER_OBJECT(temp2) && |
| 178 | + __CPROVER_POINTER_OFFSET(&(myPair->y)) == __CPROVER_POINTER_OFFSET(temp2)) || |
| 179 | + (__CPROVER_POINTER_OBJECT(&(myPair->y)) == __CPROVER_POINTER_OBJECT(temp3) && |
| 180 | + __CPROVER_POINTER_OFFSET(&(myPair->y)) >= temp4 && |
| 181 | + __CPROVER_POINTER_OFFSET(&(myPair->y)) <= temp5) |
| 182 | + ); |
| 183 | + myPair->y = 55; |
| 184 | +
|
| 185 | + assert((__CPROVER_POINTER_OBJECT(&(myArr[5])) == __CPROVER_POINTER_OBJECT(temp1) && |
| 186 | + __CPROVER_POINTER_OFFSET(&(myArr[5])) == __CPROVER_POINTER_OFFSET(temp1)) || |
| 187 | + (__CPROVER_POINTER_OBJECT(&(myArr[5])) == __CPROVER_POINTER_OBJECT(temp2) && |
| 188 | + __CPROVER_POINTER_OFFSET(&(myArr[5])) == __CPROVER_POINTER_OFFSET(temp2)) || |
| 189 | + (__CPROVER_POINTER_OBJECT(&(myArr[5])) == __CPROVER_POINTER_OBJECT(temp3) && |
| 190 | + __CPROVER_POINTER_OFFSET(&(myArr[5])) >= temp4 && |
| 191 | + __CPROVER_POINTER_OFFSET(&(myArr[5])) <= temp5) |
| 192 | + ); |
| 193 | + myArr[5] = 89; |
| 194 | +} |
| 195 | +``` |
| 196 | + |
| 197 | +Additionally, the set of assignable target expressions is updated while traversing the function body when new memory is allocated. |
| 198 | +For example, the statement `x = (int *)malloc(sizeof(int))` would create a pointer, stored in `x`, to assignable memory. Since the memory is allocated within the current function, there is no way an assignment to this memory can affect the memory of the calling context. |
| 199 | +If memory is allocated for a struct, the subcomponents are considered assignable as well. |
| 200 | + |
| 201 | +Finally, a set of freely-assignable symbols *free* is tracked during the traversal of the function body. These are locally-defined variables and formal parameters without dereferences. |
| 202 | +For example, in a variable declaration `<type> x = <initial_value>`, `x` would be added to *free*. Assignment statements where the left-hand-side is in *free* are not instrumented with the above assertions. |
0 commit comments