Skip to content

Commit 0ea5c31

Browse files
committed
Add goto-harness documentation to cprover-manual
This adds some user documentation to goto-harness. Currently only includes documentation for the call-function harness, not the memory-snapshot harness
1 parent bdcc2b1 commit 0ea5c31

File tree

1 file changed

+263
-0
lines changed

1 file changed

+263
-0
lines changed

doc/cprover-manual/goto-harness.md

Lines changed: 263 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,263 @@
1+
[CPROVER Manual TOC](../../)
2+
3+
## Goto Harness
4+
5+
This is a tool that can generate harnesses, that is functions that instrument
6+
another function under analysis, by setting up an appropriate environment before
7+
calling them.
8+
9+
This is useful when trying to analyse an isolated unit of a program without
10+
having to manually construct an appropriate environment.
11+
12+
### Usage
13+
14+
We have two types of harnesses we can generate for now:
15+
16+
* The `memory-snapshot` harness. TODO: Daniel can document this.
17+
* The `function-call` harness, which automatically synthesises an environment
18+
without having any information about the program state.
19+
20+
It is used similarly to how `goto-instrument` is used by modifying an existing
21+
GOTO binary, as produced by `goto-cc`.
22+
23+
### The function call harness generator
24+
25+
For the most basic use of the `function-call` harness generator, imagine that we
26+
have the following C program:
27+
28+
``` C
29+
// main.c
30+
31+
#include <assert.h>
32+
33+
int function_to_test(int some_argument)
34+
{
35+
assert(some_argument == 0);
36+
return some_argument;
37+
}
38+
```
39+
40+
We first need to generate a GOTO binary.
41+
42+
```sh
43+
$ goto-cc -o main.gb main.c
44+
```
45+
46+
Then we can call `goto-harness` on the generated GOTO binary
47+
to get a modified one that contains the harness function:
48+
49+
```sh
50+
$ goto-harness \
51+
--harness-function-name harness \
52+
--harness-type call-function \
53+
--function function_to_test \
54+
main.gb \
55+
main-with-harness.gb
56+
```
57+
58+
The options we pass to `goto-harness` are:
59+
60+
* `harness-function-name` is the name of the function generated by the harness
61+
generator (this needs to be specified for all harness generators).
62+
* `harness-type` is the type of harness to generate (`call-function` is the
63+
function-call harness generator)
64+
* `function` is the name of the function that gets instrumented
65+
* then we pass the input GOTO-binary and a name for the output GOTO binary.
66+
67+
What comes of this is a GOTO binary roughly corresponding to the following C
68+
pseudocode:
69+
70+
```C
71+
// modified_goto_binary.c
72+
73+
#include <assert.h>
74+
75+
int function_to_test(int some_argument)
76+
{
77+
assert(some_argument == 0);
78+
return some_argument;
79+
}
80+
81+
void harness(void)
82+
{
83+
int argument = nondet_int();
84+
function_to_test(argument);
85+
}
86+
```
87+
88+
You can pass that modified GOTO binary to `cbmc` and select `harness` as the
89+
entry function for analysis, or further modify it with `goto-harness` or
90+
`goto-instrument` or other tools.
91+
92+
The example above demonstrates the most simple case, which is roughly the same
93+
as the entry point `cbmc` automically generates for functions. However, the
94+
`function-call` harness can also non-deterministically initialise global
95+
variables, array and struct elements. Consider this more complicated example:
96+
97+
```C
98+
// list_example.c
99+
#include <assert.h>
100+
#include <stdlib.h>
101+
102+
typedef struct linked_list linked_list;
103+
struct linked_list {
104+
int value;
105+
linked_list *next;
106+
};
107+
108+
linked_list *global_linked_list;
109+
110+
/// initialize all values in the global
111+
/// list to 0
112+
void initialize_global(void)
113+
{
114+
for(linked_list *ll = global_linked_list;
115+
ll != NULL;
116+
ll = ll->next)
117+
{
118+
ll->value = 0;
119+
}
120+
}
121+
122+
/// try to initialize all values in the linked list
123+
/// to 0 - but this version contains two bugs,
124+
/// as it won't work with nullpointer arguments
125+
/// and it will also not initialize the last element
126+
void initialize_other(linked_list *ll)
127+
{
128+
do {
129+
ll->value = 0;
130+
ll = ll->next;
131+
} while(ll->next != NULL);
132+
}
133+
134+
void check_list(linked_list *list_parameter)
135+
{
136+
assert(list_parameter != global_linked_list);
137+
initialize_global();
138+
initialize_other(list_parameter);
139+
linked_list *global_cursor = global_linked_list;
140+
linked_list *parameter_cursor = list_parameter;
141+
142+
// global list should be a prefix of the parameter now,
143+
// or the other way round
144+
while(global_cursor != NULL && parameter_cursor != NULL)
145+
{
146+
// this assertion should fail since we didn't
147+
// initialize the last element of of the
148+
// list parameter correctly
149+
assert(global_cursor->value
150+
== parameter_cursor->value);
151+
global_cursor = global_cursor->next;
152+
parameter_cursor = parameter_cursor->next;
153+
}
154+
}
155+
```
156+
157+
Now we'll try to find the bug in `check_list` by generating a harness for it.
158+
159+
```sh
160+
$ goto-cc -o list_example.gb list_example.c
161+
$ goto-harness \
162+
--harness-function-name harness \
163+
--harness-type call-function \
164+
--function check_list \
165+
--max-nondet-tree-depth 4 \
166+
--min-null-tree-depth 1 \
167+
--nondet-globals \
168+
list_example.gb \
169+
list_example_with_harness.gb
170+
$ cbmc --function harness list_example_with_harness.gb --unwind 20 --unwinding-assertions
171+
```
172+
173+
We have 3 new options here:
174+
175+
* `max-nondet-tree-depth` is the maximum extent to which we will generate and
176+
initialize non-null pointers - in this case, this means generating lists up to
177+
length 2
178+
* `min-null-tree-depth` this is the minimum depth at which pointers can be
179+
nullpointers for generated values - in this case, this sets the _minimum_
180+
length for our linked lists to one.
181+
* `nondet-globals` is non-deterministically initialising global variables.
182+
183+
```
184+
CBMC version 5.11 (cbmc-5.11-1523-g419a958-dirty) 64-bit x86_64 linux
185+
[...]
186+
187+
** Results:
188+
example.c function initialize_global
189+
[initialize_global.unwind.0] line 17 unwinding assertion loop 0: SUCCESS
190+
191+
example.c function initialize_other
192+
[initialize_other.unwind.0] line 32 unwinding assertion loop 0: SUCCESS
193+
194+
example.c function check_list
195+
[check_list.assertion.1] line 42 assertion list_parameter != global_linked_list: SUCCESS
196+
[check_list.unwind.0] line 50 unwinding assertion loop 0: SUCCESS
197+
[check_list.assertion.2] line 55 assertion global_cursor->value == parameter_cursor->value: FAILURE
198+
199+
** 1 of 5 failed (2 iterations)
200+
VERIFICATION FAILED
201+
```
202+
203+
We also have support for arrays (currently only for array function parameters,
204+
globals and struct members are considered for the future).
205+
206+
Take this example of an implementation of an `is_prefix_of` function that
207+
should return true if the first string parameter `string` is a prefix of the
208+
second one `prefix`.
209+
210+
```c
211+
// array_example.c
212+
213+
int is_prefix_of(
214+
const char *string,
215+
int string_length,
216+
const char *prefix,
217+
int prefix_length
218+
)
219+
{
220+
if(prefix_length > string_length) { return 0; }
221+
// oops, we should have used prefix_length here
222+
for(int i = 0; i < string_length; ++i)
223+
{
224+
// we'll get an out of bounds error here!
225+
if(prefix[i] != string[i]) { return 0; }
226+
}
227+
return 1;
228+
}
229+
```
230+
231+
We can compile and run it like this:
232+
233+
```sh
234+
$ goto-cc -o array_example.gb array_example.c
235+
$ goto-harness \
236+
--harness-function-name harness \
237+
--harness-type call-function \
238+
--function is_prefix_of \
239+
--associated_array-size string:string_length \
240+
--associated-array-size prefix:prefix_length \
241+
array_example.gb array_example-mod.gb
242+
$ cbmc --function harness --pointer-check array_example-mod.gb
243+
```
244+
245+
We have the additional option `associated-array-size` here. This is in the format
246+
`<array-parameter-name>:<array-size-parameter-name>` and will cause the array
247+
parameter with name`array-parameter-name` to be initialised as an array, with
248+
the parameter `array-size-parameter-name` holding its size (it should have
249+
some integral type like `int` or `size_t`).
250+
251+
Running the example shows the bug highlighted in the comments:
252+
```
253+
[...]
254+
[is_prefix_of.pointer_dereference.6] line 14 dereference failure: pointer outside object bounds in prefix[(signed long int)i]: FAILURE
255+
```
256+
257+
By default, arrays are created with a minimum size of 1 and a maximum size of 2.
258+
These limits can be set with the `--min-array-size` and `--max-array-size`
259+
options.
260+
261+
If you have a function that takes an array parameter, but without an associated
262+
size parameter, you can also use the `--treat-pointer-as-array <parameter-name>`
263+
option.

0 commit comments

Comments
 (0)