Skip to content

Commit ab95f8d

Browse files
committed
Updates assigns clause description in CPROVER manual
Signed-off-by: Felipe R. Monteiro <[email protected]>
1 parent 84c7e7e commit ab95f8d

File tree

1 file changed

+232
-55
lines changed

1 file changed

+232
-55
lines changed

doc/cprover-manual/assigns-clause.md

Lines changed: 232 additions & 55 deletions
Original file line numberDiff line numberDiff line change
@@ -1,52 +1,104 @@
11
## CBMC Assigns Clause
22

3-
43
### Introduction
5-
The _assigns_ clause allows the user to specify a list of memory
6-
locations which may be written within a particular scope. While an assigns
7-
clause may, in general, be interpreted with either _writes_ or _modifies_
4+
Here is described the syntax and semantics of the "assigns"
5+
clause as implemented in CBMC. This construct allows the user to specify a list
6+
of memory locations which may be written within a function. While an assigns
7+
clause may, in general, be interpreted with either "writes" or "modifies"
88
semantics, this design is based on the former. This means that memory not
99
captured by the assigns clause must not be written within the given scope, even
10-
if the value(s) therein are not modified.
11-
10+
if the value(s) are restored before the function returns.
1211

13-
### Scalar Variables
14-
The initial iteration of this design focuses on specifying an assigns clause for
15-
primitive types and their pointers. Arrays, structured data, and pointers are
16-
left to future contributions.
12+
The latest iteration of this design focuses on specifying assigns clauses for
13+
simple variable types and their pointers, structs, and arrays. Recursive
14+
pointer structures are left to future work, as the current memory model does not
15+
allow them to be supported by the assigns clause.
1716

1817

1918
##### Syntax
20-
A special construct is introduced to specify assigns clauses. Its syntax is
21-
defined as follows.
19+
A special construct is introduced to specify assigns clauses.
20+
At the time of writing, its syntax is defined as follows.
2221

2322
```
2423
<assigns_clause> := __CPROVER_assigns ( <target_list> )
2524
```
2625
```
27-
<target_list> := <target>
28-
| <target> , <target_list>
26+
<target_list> := __CPROVER_nothing
27+
| <target>
28+
| <target_list> , <target>
29+
```
30+
```
31+
<target> := <deref_target>
32+
| <target> [ array_index ]
33+
| <target> [ array_range ]
34+
35+
<array_index> := <identifier> | <integer>
36+
<array_range> := <array_index> , <array_index>
37+
```
38+
```
39+
<deref_target> := <member_target>
40+
| * <deref_target>
41+
```
2942
```
43+
<member_target> := <primary_target>
44+
| <member_target> . <member_name>
45+
| <member_target> -> <member_name>
3046
```
31-
<target> := <identifier>
32-
| * <target>
3347
```
48+
<primary_target> := <identifier>
49+
| ( <target> )
50+
```
51+
52+
The assigns clause identifies a list of targets, which may be an identifier
53+
(e.g. `x`), a member of a struct (e.g. `myStruct.someMember` or
54+
`myStructPtr->otherMember`) , a dereferenced pointer (e.g. `*myPtr`), or an
55+
array range (e.g. `myArr[5, lastIdx]`). The `<assigns_clause>` states that only
56+
memory identified in the target list may be written within the function, as
57+
described in the next section.
3458

59+
For example, consider the following declarations.
3560

36-
The `<assigns_clause>` states that only the memory identified by the dereference
37-
expressions and identifiers listed in the contained `<target_list>` may be
38-
written within the associated scope, as follows.
61+
```
62+
struct pair
63+
{
64+
int x;
65+
int y;
66+
};
3967
68+
void foo(int *myInt1, int *myInt2, struct pair *myPair, int myArr[], int lastIdx)
69+
__CPROVER_assigns(*myInt1, myPair->y, myArr[5, lastIdx]);
70+
```
71+
72+
This code declares a struct type, called `pair` with 2 members `x` and `y`,
73+
as well as a function `foo` which takes two integer pointer arguments
74+
`myInt1` and `myInt2`, one pair struct pointer `myPair`, and an array
75+
argument `myArr` with an associated last index `lastIdx`. The assigns
76+
clause attached to this definition states that most of the arguments
77+
can be assigned. However, `myInt2`, `myPair->x`, and the first four
78+
elements of `myArr` should not be assigned in the body of `foo`.
4079

4180
##### Semantics
42-
The semantics of an assigns clause *c* of some function *f* can be understood in
43-
two contexts. First, one may consider how the expressions in *c* are treated
44-
when a call to *f* is replaced by its contract specification, assuming the
45-
contract specification is a sound characterization of the behavior of *f*.
46-
Second, one may consider how *c* is applied to the function body of *f* in order
47-
to determine whether *c* is a sound characterization of the behavior of *f*. We
48-
begin by exploring these two perspectives for assigns clauses which contain only
49-
scalar expressions.
81+
The semantics of an assigns clause *c* of some function *f* can
82+
be understood in two contexts. First, one may consider how the expressions in
83+
*c* are treated when a call to *f* is replaced by its contract specification,
84+
assuming the contract specification is a sound characterization of the behavior
85+
of *f*. Second, one may consider how *c* is applied to the function body of *f*
86+
in order to determine whether *c* is a sound characterization of the behavior of
87+
*f*. To illustrate these two perspectives, consider the following definition of
88+
`foo`.
89+
90+
91+
```
92+
void foo(int *myInt1, int *myInt2, struct pair *myPair, int myArr[], int lastIdx)
93+
{
94+
*myInt1 = 34;
95+
myPair->y = 55;
96+
myArr[5] = 89;
97+
}
98+
```
99+
100+
We begin by exploring these two perspectives with a formal definition, and then
101+
an example.
50102

51103
Let the i<sup>th</sup> expression in some assigns clause *c* be denoted
52104
*exprs*<sub>*c*</sub>[i], the j<sup>th</sup> formal parameter of some function
@@ -55,35 +107,160 @@ in a call to function *f* be denoted *args*<sub>*f*</sub>[k] (an identifying
55107
index may be added to distinguish a *particular* call to *f*, but for simplicity
56108
it is omitted here).
57109

58-
59110
###### Replacement
60-
Assuming an assigns clause *c* is a sound characterization of the behavior of
61-
the function *f*, a call to *f* may be replaced a series of non-deterministic
62-
assignments. For each expression *e* &#8712; *exprs*<sub>*c*</sub>, let there be
63-
an assignment &#632; := &#8727;, where &#632; is *args*<sub>*f*</sub>[i] if *e*
64-
is identical to some *params*<sub>*f*</sub>[i], and *e* otherwise.
111+
Assuming an assigns clause *c* is a sound characterization of
112+
the behavior of the function *f*, a call to *f* may be replaced a series of
113+
non-deterministic assignments. For each expression *e* &#8712;
114+
*exprs*<sub>*c*</sub>, let there be an assignment &#632; := &#8727;, where
115+
&#632; is *args*<sub>*f*</sub>[i] if *e* is identical to some
116+
*params*<sub>*f*</sub>[i], and *e* otherwise. For array ranges, create an
117+
assignment for each element in the array range.
118+
119+
In the case where *f* is `foo`, and some function `bar` calls `foo`
120+
121+
```
122+
void bar()
123+
{
124+
int a, b;
125+
int arr[8];
126+
struct pair p;
65127
128+
foo(&a, &b, &p, arr, 7);
129+
}
130+
```
131+
132+
the call to `foo` is replaced by the following series of assignments.
133+
134+
```
135+
void bar()
136+
{
137+
int a, b;
138+
int arr[10];
139+
struct pair p;
140+
141+
a = *;
142+
p.y = *;
143+
arr[5] = *;
144+
arr[6] = *;
145+
arr[7] = *;
146+
}
147+
```
66148

67149
###### Enforcement
68-
In order to determine whether an assigns clause *c* is a sound characterization
69-
of the behavior of a function *f*, the body of the function should be
70-
instrumented with additional statements as follows.
71-
72-
- For each expression *e* &#8712; *exprs*<sub>*c*</sub>, create a temporary
73-
variable *tmp*<sub>*e*</sub> to store \&(*e*), the address of *e*, at the
74-
start of *f*.
75-
- Before each assignment statement, *lhs* := *rhs*, add an assertion (structured
76-
as a disjunction)
77-
assert(&#8707; *tmp*<sub>*e*</sub>. \_\_CPROVER\_same\_object(\&(*lhs*),
78-
*tmp*<sub>*e*</sub>)
79-
- Before each function call with an assigns clause *a*, add an assertion for
80-
each *e*<sub>*a*</sub> &#8712; *exprs*<sub>*a*</sub> (also formulated as a
81-
disjunction)
82-
assert(&#8707; *tmp*<sub>*e*</sub>.
83-
\_\_CPROVER\_same\_object(\&(*e*<sub>*a*</sub>), *tmp*<sub>*e*</sub>)
84-
85-
Here, \_\_CPROVER\_same\_object returns true if two pointers
86-
reference the same object in the CBMC memory model. Additionally, for each
87-
function call without an assigns clause, CBMC adds an assertion assert(*false*)
88-
to ensure that the assigns contract will only hold if all called functions
89-
have an assigns clause which is compatible with that of the calling function.
150+
In order to determine whether an assigns clause *c* is a
151+
sound characterization of the behavior of a function *f*, the body of the
152+
function is instrumented with assertion statements before each statement which
153+
may write to memory (e.g. an assignment). These assertions are based on the
154+
assignable targets *exprs*<sub>*c*</sub> identified in the assigns clause.
155+
156+
While sequentially traversing the function body, CBMC instruments statements
157+
that might modify memory a preceding assertion to ensure that they only
158+
modify assignable memory. These assertions consist of a disjunction
159+
where each disjunct depends on the type of the assignable target.
160+
161+
- For an assignable target expression *e* of scalar type:
162+
163+
```
164+
__CPROVER_POINTER_OBJECT(&lhs) == __CPROVER_POINTER_OBJECT(&e) &&
165+
__CPROVER_POINTER_OFFSET(&lhs) == __CPROVER_POINTER_OFFSET(&e)
166+
```
167+
168+
- For an assignable target expression *e* of struct type, a similar disjunct is
169+
created, with an additional size comparison:
170+
171+
```
172+
__CPROVER_POINTER_OBJECT(&lhs) == __CPROVER_POINTER_OBJECT(&e) &&
173+
__CPROVER_POINTER_OFFSET(&lhs) == __CPROVER_POINTER_OFFSET(&e) &&
174+
sizeof(lhs) == sizeof(e)
175+
```
176+
177+
Additionally, for each member m of *e*, we add an extra disjunct:
178+
179+
```
180+
__CPROVER_POINTER_OBJECT(&lhs) == __CPROVER_POINTER_OBJECT(&(e->m)) &&
181+
__CPROVER_POINTER_OFFSET(&lhs) == __CPROVER_POINTER_OFFSET(&(e->m)) &&
182+
sizeof(lhs) == sizeof(e->m)
183+
```
184+
185+
This extension is performed recursively in the
186+
case that some member of a struct is also a struct.
187+
188+
- For an assignable target expression *e* of array range type, `a[lo, hi]`, a
189+
disjunct is created using the array pointer `a` and the range bounds `lo` and
190+
`hi`:
191+
192+
```
193+
__CPROVER_POINTER_OBJECT(&lhs) == __CPROVER_POINTER_OBJECT(a) &&
194+
__CPROVER_POINTER_OFFSET(&lhs) >= lo * sizeof(a[0]) &&
195+
__CPROVER_POINTER_OFFSET(&lhs) <= hi * sizeof(a[0])
196+
```
197+
198+
This effectively creates an assertion that the memory location being assigned
199+
falls within the assignable memory identified by the assigns clause. In
200+
practice, temporary variables are used to ensure that the assignable memory
201+
doesn't change during function execution, but here we have used the assignable
202+
targets directly for readability. Function calls are handled similarly, based
203+
on the memory their assigns clause indicates that they can modify.
204+
Additionally, for each function call without an assigns clause, we add an
205+
assertion assert(*false*) to ensure that the assigns contract will only hold if
206+
all called functions have an assigns clause which is compatible with that of the
207+
calling function.
208+
209+
For the earlier example, `foo` would be instrumented as follows:
210+
211+
212+
```
213+
void foo(int *myInt1, int *myInt2, struct pair *myPair, int myArr[], int lastIdx)
214+
{
215+
int *temp1 = myInt1;
216+
int *temp2 = &(myPair->y);
217+
int *temp3 = myArr;
218+
int temp4 = 5 * sizeof(iint);
219+
int temp5 = lastIdx * sizeof(int);
220+
221+
assert((__CPROVER_POINTER_OBJECT(myInt1) == __CPROVER_POINTER_OBJECT(temp1) &&
222+
__CPROVER_POINTER_OFFSET(myInt1) == __CPROVER_POINTER_OFFSET(temp1)) ||
223+
(__CPROVER_POINTER_OBJECT(myInt1) == __CPROVER_POINTER_OBJECT(temp2) &&
224+
__CPROVER_POINTER_OFFSET(myInt1) == __CPROVER_POINTER_OFFSET(temp2)) ||
225+
(__CPROVER_POINTER_OBJECT(myInt1) == __CPROVER_POINTER_OBJECT(temp3) &&
226+
__CPROVER_POINTER_OFFSET(myInt1) >= temp4 &&
227+
__CPROVER_POINTER_OFFSET(myInt1) <= temp5)
228+
);
229+
*myInt1 = 34;
230+
231+
assert((__CPROVER_POINTER_OBJECT(&(myPair->y)) == __CPROVER_POINTER_OBJECT(temp1) &&
232+
__CPROVER_POINTER_OFFSET(&(myPair->y)) == __CPROVER_POINTER_OFFSET(temp1)) ||
233+
(__CPROVER_POINTER_OBJECT(&(myPair->y)) == __CPROVER_POINTER_OBJECT(temp2) &&
234+
__CPROVER_POINTER_OFFSET(&(myPair->y)) == __CPROVER_POINTER_OFFSET(temp2)) ||
235+
(__CPROVER_POINTER_OBJECT(&(myPair->y)) == __CPROVER_POINTER_OBJECT(temp3) &&
236+
__CPROVER_POINTER_OFFSET(&(myPair->y)) >= temp4 &&
237+
__CPROVER_POINTER_OFFSET(&(myPair->y)) <= temp5)
238+
);
239+
myPair->y = 55;
240+
241+
assert((__CPROVER_POINTER_OBJECT(&(myArr[5])) == __CPROVER_POINTER_OBJECT(temp1) &&
242+
__CPROVER_POINTER_OFFSET(&(myArr[5])) == __CPROVER_POINTER_OFFSET(temp1)) ||
243+
(__CPROVER_POINTER_OBJECT(&(myArr[5])) == __CPROVER_POINTER_OBJECT(temp2) &&
244+
__CPROVER_POINTER_OFFSET(&(myArr[5])) == __CPROVER_POINTER_OFFSET(temp2)) ||
245+
(__CPROVER_POINTER_OBJECT(&(myArr[5])) == __CPROVER_POINTER_OBJECT(temp3) &&
246+
__CPROVER_POINTER_OFFSET(&(myArr[5])) >= temp4 &&
247+
__CPROVER_POINTER_OFFSET(&(myArr[5])) <= temp5)
248+
);
249+
myArr[5] = 89;
250+
}
251+
```
252+
253+
Additionally, the set of assignable target expressions is updated while
254+
traversing the function body when new memory is allocated. For example, the
255+
statement `x = (int *)malloc(sizeof(int))` would create a pointer, stored in
256+
`x`, to assignable memory. Since the memory is allocated within the current
257+
function, there is no way an assignment to this memory can affect the memory of
258+
the calling context. If memory is allocated for a struct, the subcomponents are
259+
considered assignable as well.
260+
261+
Finally, a set of freely-assignable symbols *free* is tracked during the
262+
traversal of the function body. These are locally-defined variables and formal
263+
parameters without dereferences. For example, in a variable declaration `<type>
264+
x = <initial_value>`, `x` would be added to *free*. Assignment statements (i.e.,
265+
where the left-hand-side is in *free*) are not instrumented with
266+
the above assertions.

0 commit comments

Comments
 (0)