Skip to content

Commit 151dfd2

Browse files
committed
Add goto-harness documentation to cprover-manual
This adds some user documentation to goto-harness. Currently only includes documentation for the call-function harness, not the memory-snapshot harness
1 parent 6f20646 commit 151dfd2

File tree

1 file changed

+245
-0
lines changed

1 file changed

+245
-0
lines changed

doc/cprover-manual/goto-harness.md

Lines changed: 245 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,245 @@
1+
[CPROVER Manual TOC](../../)
2+
3+
## Goto Harness
4+
5+
This is a tool that can generate harnesses, that is functions
6+
that instrument another function under analysis, by setting
7+
up an appropriate environment before calling them.
8+
9+
This is useful when trying to analyse an isolated unit of a
10+
program without having to manually construct an appropriate
11+
environment.
12+
13+
### Usage
14+
15+
We have two types of harnesses we can generate for now:
16+
17+
* The `memory-snapshot` harness. TODO: Daniel can document
18+
this.
19+
* The `function-call` harness, which automatically synthesises
20+
an environment without having any information about the program state.
21+
22+
It is used similar to how `goto-instrument` is used by modifying
23+
an existing GOTO binary, as produced by `goto-cc`.
24+
25+
### The function call harness generator
26+
27+
For the most basic use of the `function-call` harness generator,
28+
imagine that we have the following C program:
29+
30+
``` C
31+
// main.c
32+
33+
#include <assert.h>
34+
35+
int function_to_test(int some_argument)
36+
{
37+
assert(some_argument == 0);
38+
return some_argument;
39+
}
40+
```
41+
42+
We first need to generate a GOTO binary.
43+
44+
```sh
45+
$ goto-cc -o main.gb main.c
46+
```
47+
48+
Then we can call `goto-harness` on the generated GOTO binary
49+
to get a modified one that contains the harness function:
50+
51+
```sh
52+
$ goto-harness \
53+
--harness-function-name harness \
54+
--harness-type call-function \
55+
--function function_to_test \
56+
main.gb \
57+
main-with-harness.gb
58+
```
59+
60+
The options we pass to `goto-harness` are:
61+
62+
* `harness-function-name` is the name of the function generated by the harness generator (this needs to be specified for all harness generators).
63+
* `harness-type` is the type of harness to generate (`call-function` is the function-call harness generator)
64+
* `function` is the name of the function that gets instrumented
65+
* then we pass the input GOTO-binary and a name for the output GOTO binary.
66+
67+
What comes of this is a GOTO binary the following C pseudocode:
68+
69+
```C
70+
// modified_goto_binary.c
71+
72+
#include <assert.h>
73+
74+
int function_to_test(int some_argument)
75+
{
76+
assert(some_argument == 0);
77+
return some_argument;
78+
}
79+
80+
void harness(void)
81+
{
82+
int argument = nondet_int();
83+
function_to_test(argument);
84+
}
85+
```
86+
87+
You can pass that modified GOTO binary to `cbmc` and select
88+
`harness` as the entry function for analysis, or further modify
89+
it with `goto-harness` or `goto-instrument` or other tools.
90+
91+
The example above demonstrates the most simple case, which is roughly the same as the entry point `cbmc` automically generates for functions. However, the `function-call` harness can also non-deterministically
92+
initialise global variables, array and struct elements, consider this more complicated example:
93+
94+
```C
95+
// list_example.c
96+
#include <assert.h>
97+
#include <stdlib.h>
98+
99+
typedef struct linked_list linked_list;
100+
struct linked_list {
101+
int value;
102+
linked_list *next;
103+
};
104+
105+
linked_list *global_linked_list;
106+
107+
/// initialize all values in the global
108+
/// list to 0
109+
void initialize_global(void)
110+
{
111+
for(linked_list *ll = global_linked_list;
112+
ll != NULL;
113+
ll = ll->next)
114+
{
115+
ll->value = 0;
116+
}
117+
}
118+
119+
/// try to initialize all values in the linked list
120+
/// to 0 - but this version contains two bugs,
121+
/// as it won't work with nullpointer arguments
122+
/// and it will also not initialize the last element
123+
void initialize_other(linked_list *ll)
124+
{
125+
do {
126+
ll->value = 0;
127+
ll = ll->next;
128+
} while(ll->next != NULL);
129+
}
130+
131+
void check_list(linked_list *list_parameter)
132+
{
133+
assert(list_parameter != global_linked_list);
134+
initialize_global();
135+
initialize_other(list_parameter);
136+
linked_list *global_cursor = global_linked_list;
137+
linked_list *parameter_cursor = list_parameter;
138+
139+
// global list should be a prefix of the parameter now,
140+
// or the other way round
141+
while(global_cursor != NULL && parameter_cursor != NULL)
142+
{
143+
// this assertion should fail since we didn't
144+
// initialize the last element of of the
145+
// list parameter correctly
146+
assert(global_cursor->value
147+
== parameter_cursor->value);
148+
global_cursor = global_cursor->next;
149+
parameter_cursor = parameter_cursor->next;
150+
}
151+
}
152+
```
153+
154+
Now we'll try to find the bug in `check_list` by generating a harness for it.
155+
156+
```sh
157+
$ goto-cc -o list_example.gb list_example.c
158+
$ goto-harness \
159+
--harness-function-name harness \
160+
--harness-type call-function \
161+
--function check_list
162+
--max-nondet-tree-depth 4 \
163+
--min-null-tree-depth 1 \
164+
--nondet-globals \
165+
list_example.gb \
166+
list_example_with_harness.gb
167+
$ cbmc --function harness list_example_with_harness.gb --unwind 20 --unwinding-assertions
168+
```
169+
170+
We have 3 new options here:
171+
172+
* `max-nondet-tree-depth` is the maximum extent to which we will generate and initialize non-null pointers - in this case, this means generating lists up to length 2
173+
* `min-null-tree-depth` this is the minimum depth at which pointers can be nullpointers for generated values - in this case, this sets the _minimum_ length for our linked lists to one.
174+
* `nondet-globals` is non-deterministically initialising global variables.
175+
176+
```
177+
CBMC version 5.11 (cbmc-5.11-1523-g419a958-dirty) 64-bit x86_64 linux
178+
[...]
179+
180+
** Results:
181+
example.c function initialize_global
182+
[initialize_global.unwind.0] line 17 unwinding assertion loop 0: SUCCESS
183+
184+
example.c function initialize_other
185+
[initialize_other.unwind.0] line 32 unwinding assertion loop 0: SUCCESS
186+
187+
example.c function check_list
188+
[check_list.assertion.1] line 42 assertion list_parameter != global_linked_list: SUCCESS
189+
[check_list.unwind.0] line 50 unwinding assertion loop 0: SUCCESS
190+
[check_list.assertion.2] line 55 assertion global_cursor->value == parameter_cursor->value: FAILURE
191+
192+
** 1 of 5 failed (2 iterations)
193+
VERIFICATION FAILED
194+
```
195+
196+
We also have support for arrays (currently only for array function parameters, globals and struct members are considered for the future).
197+
198+
Take this example of a an implementation of an `is_prefix_of` function that should return true if the first string parameter `string` is a prefix of the second one `prefix`.
199+
200+
```c
201+
// array_example.c
202+
203+
int is_prefix_of(
204+
const char *string,
205+
int string_length,
206+
const char *prefix,
207+
int prefix_length
208+
)
209+
{
210+
if(prefix_length > string_length) { return 0; }
211+
// oops, we should have used prefix_length here
212+
for(int i = 0; i < string_length; ++i)
213+
{
214+
// we'll get an out of bounds error here!
215+
if(prefix[i] != string[i]) { return 0; }
216+
}
217+
return 1;
218+
}
219+
```
220+
221+
We can compile and run it like this:
222+
223+
```sh
224+
$ goto-cc -o array_example.gb array_example.c
225+
$ goto-harness \
226+
--harness-function-name harness \
227+
--harness-type call-function \
228+
--function is_prefix_of \
229+
--associated_array-size string:string_length \
230+
--associated-array-size prefix:prefix_length \
231+
array_example.gb array_example-mod.gb
232+
$ cbmc --function harness --pointer-check array_example-mod.gb
233+
```
234+
235+
We have the new option `associated-array-size` here. This is in the format `<array-parameter-name>:<array-size-parameter-name>` and will cause the array parameter with name`array-parameter-name` to be initialised as an array, with the parameter `array-size-parameter-name` should hold its size (it should have some integral type like `int` or `size_t`).
236+
237+
Running the example shows the bug highlighted in the comments:
238+
```
239+
[...]
240+
[is_prefix_of.pointer_dereference.6] line 14 dereference failure: pointer outside object bounds in prefix[(signed long int)i]: FAILURE
241+
```
242+
243+
By default, arrays are created with a minimum size of 1 and a maximum size of 2. These limits can be set with the `--min-array-size` and `--max-array-size` options.
244+
245+
If you have a function that takes and array parameter, but without an associated size parameter, you can also use the `--treat-pointer-as-array <parameter-name>` option.

0 commit comments

Comments
 (0)