diffblue · NlightNFotis · Sep 26, 2022 · Sep 22, 2022 · Sep 22, 2022 · Sep 22, 2022
@@ -1,4 +1,5 @@
 \ingroup module_hidden 
+
 \page background-concepts Background Concepts
 
 \author Martin Brain, Peter Schrammel, Johannes Kloos

@@ -1,4 +1,5 @@
 \ingroup module_hidden
+
 \page cbmc-architecture CBMC Architecture
 
 \author Martin Brain, Peter Schrammel

@@ -1,4 +1,5 @@
 \ingroup module_hidden 
+
 \page code-walkthrough Code Walkthrough
 
 \author Cesar Rodriguez, Owen Jones

@@ -1,4 +1,5 @@
 \ingroup module_hidden 
+
 \page compilation-and-development Compilation and Development
 
 \author Martin Brain, Peter Schrammel, Owen Jones

@@ -1,4 +1,5 @@
 \ingroup module_hidden 
+
 \page folder-walkthrough Folder Walkthrough
 
 \author Martin Brain, Peter Schrammel

@@ -1,5 +1,5 @@
-CProver Developer Documentation
-=====================
+
+\page cprover_documentation CProver documentation
 
 These pages contain user tutorials, automatically-generated API
 documentation, and higher-level architectural overviews for the
@@ -66,6 +66,12 @@ Overview of Documentation
 you can access it <a href=
 "https://svn.cprover.org/wiki/doku.php?id=cprover_tutorial">here</a>.
 
+* \subpage memory-analyzer
+* \subpage memory-bounds-checking
+* \subpage nondet-volatile
+* \subpage restricting-function-pointers
+* \subpage satabs
+
 ### For contributors:
 
 The following pages attempt to provide the information that a developer needs to

@@ -1,4 +1,5 @@
 \ingroup module_hidden
+
 \page tutorial Tutorials
 
 \section cbmc_tutorial CBMC Developer Tutorial

@@ -1,6 +1,5 @@
-[CPROVER Manual TOC](../../)
 
-## Memory Analyzer
+\page memory-analyzer Memory Analyzer
 
 The memory analyzer is a front-end that runs and queries GDB in order to obtain
 a snapshot of the state of the input program at a particular program location.

@@ -1,4 +1,5 @@
 \ingroup module_hidden
+
 \page memory-bounds-checking Memory Bounds Checking
 
 cbmc provides means to verify that pointers are within the bounds of (statically

diff --git a/doc/architectural/nondet-volatile.md b/doc/architectural/nondet-volatile.md
@@ -0,0 +1,136 @@
+\ingroup module_hidden 
+
+\page nondet-volatile Modelling of Volatile Variables
+
+This document describes the options available in goto-instrument for modelling
+the behaviour of volatile variables. The following options are provided:
+
+- `--nondet-volatile`
+- `--nondet-volatile-variable <variable>`
+- `--nondet-volatile-model <variable>:<model>`
+
+By default, cbmc treats volatile variables the same as non-volatile variables.
+That is, it assumes that a volatile variable does not change between subsequent
+reads, unless it was written to by the program. With the above options,
+goto-instrument can be instructed to instrument the given goto program such as
+to (1) make reads from all volatile expressions non-deterministic
+(`--nondet-volatile`), (2) make reads from specific variables non-deterministic
+(`--nondet-volatile-variable`), or (3) model reads from specific variables by
+given models (`--nondet-volatile-model`).
+
+Below we give two usage examples for the options. Consider the following test,
+for function `get_celsius()` and with harness `test_get_celsius()`:
+
+```C
+#include <assert.h>
+#include <limits.h>
+#include <stdint.h>
+
+// hardware sensor for temperature in kelvin
+extern volatile uint16_t temperature;
+
+int get_celsius()
+{
+  if(temperature > (1000 + 273))
+  {
+    return INT_MIN; // value indicating error
+  }
+
+  return temperature - 273;
+}
+
+void test_get_celsius()
+{
+  int t = get_celsius();
+
+  assert(t == INT_MIN || t <= 1000);
+  assert(t == INT_MIN || t >= -273);
+}
+```
+
+Here the variable `temperature` corresponds to a hardware sensor. It returns
+the current temperature on each read. The `get_celsius()` function converts the
+value in Kelvin to degrees Celsius, given the value is in the expected range.
+However, it has a bug where it reads `temperature` a second time after the
+check, which may yield a value for which the check would not succeed. Verifying
+this program as is with cbmc would yield a verification success. We can use
+goto-instrument to make reads from `temperature` non-deterministic:
+
+```
+goto-cc -o get_celsius_test.gb get_celsius_test.c
+goto-instrument --nondet-volatile-variable temperature get_celsius_test.gb get_celsius_test-mod.gb
+cbmc --function test_get_celsius get_celsius_test-mod.gb
+```
+
+Here the final invocation of cbmc correctly reports a verification failure.
+
+However, simply treating volatile variables as non-deterministic may for some
+use cases be too inaccurate. Consider the following test, for function
+`get_message()` and with harness `test_get_message()`:
+
+```C
+#include <assert.h>
+#include <stdint.h>
+
+extern volatile uint32_t clock;
+
+typedef struct message {
+  uint32_t timestamp;
+  void *data;
+} message_t;
+
+void *read_data();
+
+message_t get_message()
+{
+  message_t msg;
+
+  msg.timestamp = clock;
+  msg.data = read_data();
+
+  return msg;
+}
+
+void test_get_message()
+{
+  message_t msg1 = get_message();
+  message_t msg2 = get_message();
+
+  assert(msg1.timestamp <= msg2.timestamp);
+}
+```
+
+The harness verifies that `get_message()` assigns non-decreasing timestamps to
+the returned messages. However, simply treating `clock` as non-deterministic
+would not allow to prove this property. Thus, we can supply a model for reads
+from `clock`:
+
+```C
+// model for reads of the variable clock
+uint32_t clock_read_model()
+{
+  static uint32_t clock_value = 0;
+
+  uint32_t increment;
+  __CPROVER_assume(increment <= 100);
+
+  clock_value += increment;
+
+  return clock_value;
+}
+```
+
+The model is stateful in that it keeps the current clock value between
+invocations in the variable `clock_value`. On each invocation, it increments
+the clock by a non-determinstic value in the range 0 to 100. We can tell
+goto-instrument to use the model `clock_read_model()` for reads from the
+variable `clock` as follows:
+
+```
+goto-cc -o get_message_test.gb get_message_test.c
+goto-instrument --nondet-volatile-model clock:clock_read_model get_message_test.gb get_message_test-mod.gb
+cbmc --function get_message_test get_message_test-mod.gb
+```
+
+Now the final invocation of cbmc reports verification success.
+
@@ -1,4 +1,5 @@
 \ingroup module_hidden 
+
 \page other-tools Other Tools
 
 \author Martin Brain, Peter Schrammel

diff --git a/doc/architectural/restrict-function-pointer.md b/doc/architectural/restrict-function-pointer.md
@@ -0,0 +1,182 @@
+
+\page restricting-function-pointers Restricting function pointers
+
+In this document, we describe the `goto-instrument` feature to replace calls
+through function pointers by case distinctions over calls to given sets of
+functions.
+
+### Motivation
+
+The CPROVER framework includes a goto program transformation pass
+`remove_function_pointers()` to resolve calls to function pointers to direct
+function calls. The pass is needed by `cbmc`, as symbolic execution itself can't
+handle calls to function pointers. In practice, the transformation pass works as
+follows:
+
+Given that there are functions with these signatures available in the program:
+
+```
+int f(int x);
+int g(int x);
+int h(int x);
+```
+
+And we have a call site like this:
+
+```
+typedef int(*fptr_t)(int x);
+void call(fptr_t fptr) {
+  int r = fptr(10);
+  assert(r > 0);
+}
+```
+
+Function pointer removal will turn this into code similar to this:
+
+```
+void call(fptr_t fptr) {
+  int r;
+  if(fptr == &f) {
+    r = f(10);
+  } else if(fptr == &g) {
+    r = g(10);
+  } else if(fptr == &h) {
+    r = h(10);
+  } else {
+    // sanity check
+    assert(false);
+    assume(false);
+  }
+  return r;
+}
+```
+
+This works well enough for simple cases. However, this is a very simple
+replacement only based on the signature of the function (and whether or not they
+have their address taken somewhere in the program), so if there are many
+functions matching a particular signature, or if some of these functions are
+expensive in symex (e.g. functions with lots of loops or recursion), then this
+can be a bit cumbersome - especially if we, as a user, already know that a
+particular function pointer will only resolve to a single function or a small
+set of functions. The `goto-instrument` option `--restrict-function-pointer`
+allows to manually specify this set of functions.
+
+### Example
+
+Take the motivating example above. Let us assume that we know for a fact that
+`call` will always receive pointers to either `f` or `g` during actual
+executions of the program, and symbolic execution for `h` is too expensive to
+simply ignore the cost of its branch. For this, we will label the places in each
+function where function pointers are being called, to this pattern:
+
+```
+<function-name>.function_pointer_call.<N>
+```
+where `N` is referring to which function call it is - so the first call to a
+function pointer in a function will have `N=1`, the 5th `N=5` etc, or
+```
+<function-name>.<label>
+```
+when the function call is labelled.
+
+We can call `goto-instrument --restrict-function-pointer
+call.function_pointer_call.1/f,g in.gb out.gb`. This can be read as
+
+> For the first call to a function pointer in the function `call`, assume that
+> it can only be a call to `f` or `g`
+
+The resulting output (written to goto binary `out.gb`) looks similar to the
+original example, except now there will not be a call to `h`:
+
+```
+void call(fptr_t fptr) {
+  int r;
+  if(fptr == &f) {
+    r = f(10);
+  } else if(fptr == &g) {
+    r = g(10);
+  } else {
+    // sanity check
+    assert(false);
+    assume(false);
+  }
+  return r;
+}
+```
+
+Another example: Imagine we have a simple virtual filesystem API and implementation
+like this:
+
+```
+typedef struct filesystem_t filesystem_t;
+struct filesystem_t {
+  int (*open)(filesystem_t *filesystem, const char* file_name);
+};
+
+int fs_open(filesystem_t *filesystem, const char* file_name) {
+  filesystem->open(filesystem, file_name);
+}
+
+int nullfs_open(filesystem_t *filesystem, const char* file_name) {
+  return -1;
+}
+
+filesystem_t nullfs_val = {.open = nullfs_open};
+filesystem *const nullfs = &nullfs_val;
+
+filesystem_t *get_fs_impl() {
+  // some fancy logic to determine
+  // which filesystem we're getting -
+  // in-memory, backed by a database, OS file system
+  // - but in our case, we know that
+  // it always ends up being nullfs
+  // for the cases we care about
+  return nullfs;
+}
+int main(void) {
+  filesystem_t *fs = get_fs_impl();
+  assert(fs_open(fs, "hello.txt") != -1);
+}
+```
+
+In this case, the assumption is that *we* know that in our `main`, `fs` can be
+nothing other than `nullfs`; But perhaps due to the logic being too complicated
+symex ends up being unable to figure this out, so in the call to `fs_open()` we
+end up branching on all functions matching the signature of
+`filesystem_t::open`, which could be quite a few functions within the program.
+Worst of all, if its address is ever taken in the program, as far as the "dumb"
+function pointer removal is concerned it could be `fs_open()` itself due to it
+having a matching signature, leading to symex being forced to follow a
+potentially infinite recursion until its unwind limit.
+
+In this case we can again restrict the function pointer to the value which we
+know it will have:
+
+```
+--restrict-function-pointer fs_open.function_pointer_call.1/nullfs_open
+```
+
+### Loading from file
+
+If you have many places where you want to restrict function pointers, it'd be a
+nuisance to have to specify them all on the command line. In these cases, you
+can specify a file to load the restrictions from instead, via the
+`--function-pointer-restrictions-file` option, which you can give the name of a
+JSON file with this format:
+
+```
+{
+  "function_call_site_name": ["function1", "function2", ...],
+   ...
+}
+```
+
+**Note:** If you pass in multiple files, or a mix of files and command line
+restrictions, the final restrictions will be a set union of all specified
+restrictions.
+
+**Note:** as of now, if something goes wrong during type checking (i.e. making
+sure that all function pointer replacements refer to functions in the symbol
+table that have the correct type), the error message will refer to the command
+line option `--restrict-function-pointer` regardless of whether the restriction
+in question came from the command line or a file.