Skip to content

Publish documentation #7149

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Sep 26, 2022
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/architectural/background-concepts.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
\ingroup module_hidden

\page background-concepts Background Concepts

\author Martin Brain, Peter Schrammel, Johannes Kloos
Expand Down
1 change: 1 addition & 0 deletions doc/architectural/cbmc-architecture.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
\ingroup module_hidden

\page cbmc-architecture CBMC Architecture

\author Martin Brain, Peter Schrammel
Expand Down
1 change: 1 addition & 0 deletions doc/architectural/code-walkthrough.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
\ingroup module_hidden

\page code-walkthrough Code Walkthrough

\author Cesar Rodriguez, Owen Jones
Expand Down
1 change: 1 addition & 0 deletions doc/architectural/compilation-and-development.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
\ingroup module_hidden

\page compilation-and-development Compilation and Development

\author Martin Brain, Peter Schrammel, Owen Jones
Expand Down
1 change: 1 addition & 0 deletions doc/architectural/folder-walkthrough.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
\ingroup module_hidden

\page folder-walkthrough Folder Walkthrough

\author Martin Brain, Peter Schrammel
Expand Down
10 changes: 8 additions & 2 deletions doc/architectural/front-page.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
CProver Developer Documentation
=====================

\page cprover_documentation CProver documentation

These pages contain user tutorials, automatically-generated API
documentation, and higher-level architectural overviews for the
Expand Down Expand Up @@ -66,6 +66,12 @@ Overview of Documentation
you can access it <a href=
"https://svn.cprover.org/wiki/doku.php?id=cprover_tutorial">here</a>.

* \subpage memory-analyzer
* \subpage memory-bounds-checking
* \subpage nondet-volatile
* \subpage restricting-function-pointers
* \subpage satabs

### For contributors:

The following pages attempt to provide the information that a developer needs to
Expand Down
1 change: 1 addition & 0 deletions doc/architectural/howto.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
\ingroup module_hidden

\page tutorial Tutorials

\section cbmc_tutorial CBMC Developer Tutorial
Expand Down
3 changes: 1 addition & 2 deletions doc/architectural/memory-analyzer.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
[CPROVER Manual TOC](../../)

## Memory Analyzer
\page memory-analyzer Memory Analyzer

The memory analyzer is a front-end that runs and queries GDB in order to obtain
a snapshot of the state of the input program at a particular program location.
Expand Down
1 change: 1 addition & 0 deletions doc/architectural/memory-bounds-checking.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
\ingroup module_hidden

\page memory-bounds-checking Memory Bounds Checking

cbmc provides means to verify that pointers are within the bounds of (statically
Expand Down
136 changes: 136 additions & 0 deletions doc/architectural/nondet-volatile.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,136 @@
\ingroup module_hidden

\page nondet-volatile Modelling of Volatile Variables

This document describes the options available in goto-instrument for modelling
the behaviour of volatile variables. The following options are provided:

- `--nondet-volatile`
- `--nondet-volatile-variable <variable>`
- `--nondet-volatile-model <variable>:<model>`

By default, cbmc treats volatile variables the same as non-volatile variables.
That is, it assumes that a volatile variable does not change between subsequent
reads, unless it was written to by the program. With the above options,
goto-instrument can be instructed to instrument the given goto program such as
to (1) make reads from all volatile expressions non-deterministic
(`--nondet-volatile`), (2) make reads from specific variables non-deterministic
(`--nondet-volatile-variable`), or (3) model reads from specific variables by
given models (`--nondet-volatile-model`).

Below we give two usage examples for the options. Consider the following test,
for function `get_celsius()` and with harness `test_get_celsius()`:

```C
#include <assert.h>
#include <limits.h>
#include <stdint.h>

// hardware sensor for temperature in kelvin
extern volatile uint16_t temperature;

int get_celsius()
{
if(temperature > (1000 + 273))
{
return INT_MIN; // value indicating error
}

return temperature - 273;
}

void test_get_celsius()
{
int t = get_celsius();

assert(t == INT_MIN || t <= 1000);
assert(t == INT_MIN || t >= -273);
}
```

Here the variable `temperature` corresponds to a hardware sensor. It returns
the current temperature on each read. The `get_celsius()` function converts the
value in Kelvin to degrees Celsius, given the value is in the expected range.
However, it has a bug where it reads `temperature` a second time after the
check, which may yield a value for which the check would not succeed. Verifying
this program as is with cbmc would yield a verification success. We can use
goto-instrument to make reads from `temperature` non-deterministic:

```
goto-cc -o get_celsius_test.gb get_celsius_test.c
goto-instrument --nondet-volatile-variable temperature get_celsius_test.gb get_celsius_test-mod.gb
cbmc --function test_get_celsius get_celsius_test-mod.gb
```

Here the final invocation of cbmc correctly reports a verification failure.

However, simply treating volatile variables as non-deterministic may for some
use cases be too inaccurate. Consider the following test, for function
`get_message()` and with harness `test_get_message()`:

```C
#include <assert.h>
#include <stdint.h>

extern volatile uint32_t clock;

typedef struct message {
uint32_t timestamp;
void *data;
} message_t;

void *read_data();

message_t get_message()
{
message_t msg;

msg.timestamp = clock;
msg.data = read_data();

return msg;
}

void test_get_message()
{
message_t msg1 = get_message();
message_t msg2 = get_message();

assert(msg1.timestamp <= msg2.timestamp);
}
```

The harness verifies that `get_message()` assigns non-decreasing timestamps to
the returned messages. However, simply treating `clock` as non-deterministic
would not allow to prove this property. Thus, we can supply a model for reads
from `clock`:

```C
// model for reads of the variable clock
uint32_t clock_read_model()
{
static uint32_t clock_value = 0;

uint32_t increment;
__CPROVER_assume(increment <= 100);

clock_value += increment;

return clock_value;
}
```

The model is stateful in that it keeps the current clock value between
invocations in the variable `clock_value`. On each invocation, it increments
the clock by a non-determinstic value in the range 0 to 100. We can tell
goto-instrument to use the model `clock_read_model()` for reads from the
variable `clock` as follows:

```
goto-cc -o get_message_test.gb get_message_test.c
goto-instrument --nondet-volatile-model clock:clock_read_model get_message_test.gb get_message_test-mod.gb
cbmc --function get_message_test get_message_test-mod.gb
```

Now the final invocation of cbmc reports verification success.

1 change: 1 addition & 0 deletions doc/architectural/other-tools.md
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
\ingroup module_hidden

\page other-tools Other Tools

\author Martin Brain, Peter Schrammel
Expand Down
182 changes: 182 additions & 0 deletions doc/architectural/restrict-function-pointer.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,182 @@

\page restricting-function-pointers Restricting function pointers

In this document, we describe the `goto-instrument` feature to replace calls
through function pointers by case distinctions over calls to given sets of
functions.

### Motivation

The CPROVER framework includes a goto program transformation pass
`remove_function_pointers()` to resolve calls to function pointers to direct
function calls. The pass is needed by `cbmc`, as symbolic execution itself can't
handle calls to function pointers. In practice, the transformation pass works as
follows:

Given that there are functions with these signatures available in the program:

```
int f(int x);
int g(int x);
int h(int x);
```

And we have a call site like this:

```
typedef int(*fptr_t)(int x);
void call(fptr_t fptr) {
int r = fptr(10);
assert(r > 0);
}
```

Function pointer removal will turn this into code similar to this:

```
void call(fptr_t fptr) {
int r;
if(fptr == &f) {
r = f(10);
} else if(fptr == &g) {
r = g(10);
} else if(fptr == &h) {
r = h(10);
} else {
// sanity check
assert(false);
assume(false);
}
return r;
}
```

This works well enough for simple cases. However, this is a very simple
replacement only based on the signature of the function (and whether or not they
have their address taken somewhere in the program), so if there are many
functions matching a particular signature, or if some of these functions are
expensive in symex (e.g. functions with lots of loops or recursion), then this
can be a bit cumbersome - especially if we, as a user, already know that a
particular function pointer will only resolve to a single function or a small
set of functions. The `goto-instrument` option `--restrict-function-pointer`
allows to manually specify this set of functions.

### Example

Take the motivating example above. Let us assume that we know for a fact that
`call` will always receive pointers to either `f` or `g` during actual
executions of the program, and symbolic execution for `h` is too expensive to
simply ignore the cost of its branch. For this, we will label the places in each
function where function pointers are being called, to this pattern:

```
<function-name>.function_pointer_call.<N>
```
where `N` is referring to which function call it is - so the first call to a
function pointer in a function will have `N=1`, the 5th `N=5` etc, or
```
<function-name>.<label>
```
when the function call is labelled.

We can call `goto-instrument --restrict-function-pointer
call.function_pointer_call.1/f,g in.gb out.gb`. This can be read as

> For the first call to a function pointer in the function `call`, assume that
> it can only be a call to `f` or `g`

The resulting output (written to goto binary `out.gb`) looks similar to the
original example, except now there will not be a call to `h`:

```
void call(fptr_t fptr) {
int r;
if(fptr == &f) {
r = f(10);
} else if(fptr == &g) {
r = g(10);
} else {
// sanity check
assert(false);
assume(false);
}
return r;
}
```

Another example: Imagine we have a simple virtual filesystem API and implementation
like this:

```
typedef struct filesystem_t filesystem_t;
struct filesystem_t {
int (*open)(filesystem_t *filesystem, const char* file_name);
};

int fs_open(filesystem_t *filesystem, const char* file_name) {
filesystem->open(filesystem, file_name);
}

int nullfs_open(filesystem_t *filesystem, const char* file_name) {
return -1;
}

filesystem_t nullfs_val = {.open = nullfs_open};
filesystem *const nullfs = &nullfs_val;

filesystem_t *get_fs_impl() {
// some fancy logic to determine
// which filesystem we're getting -
// in-memory, backed by a database, OS file system
// - but in our case, we know that
// it always ends up being nullfs
// for the cases we care about
return nullfs;
}
int main(void) {
filesystem_t *fs = get_fs_impl();
assert(fs_open(fs, "hello.txt") != -1);
}
```

In this case, the assumption is that *we* know that in our `main`, `fs` can be
nothing other than `nullfs`; But perhaps due to the logic being too complicated
symex ends up being unable to figure this out, so in the call to `fs_open()` we
end up branching on all functions matching the signature of
`filesystem_t::open`, which could be quite a few functions within the program.
Worst of all, if its address is ever taken in the program, as far as the "dumb"
function pointer removal is concerned it could be `fs_open()` itself due to it
having a matching signature, leading to symex being forced to follow a
potentially infinite recursion until its unwind limit.

In this case we can again restrict the function pointer to the value which we
know it will have:

```
--restrict-function-pointer fs_open.function_pointer_call.1/nullfs_open
```

### Loading from file

If you have many places where you want to restrict function pointers, it'd be a
nuisance to have to specify them all on the command line. In these cases, you
can specify a file to load the restrictions from instead, via the
`--function-pointer-restrictions-file` option, which you can give the name of a
JSON file with this format:

```
{
"function_call_site_name": ["function1", "function2", ...],
...
}
```

**Note:** If you pass in multiple files, or a mix of files and command line
restrictions, the final restrictions will be a set union of all specified
restrictions.

**Note:** as of now, if something goes wrong during type checking (i.e. making
sure that all function pointer replacements refer to functions in the symbol
table that have the correct type), the error message will refer to the command
line option `--restrict-function-pointer` regardless of whether the restriction
in question came from the command line or a file.