Skip to content

Commit d67fa60

Browse files
committed
[path explore 7/7] Path exploration documentation
1 parent 53df567 commit d67fa60

File tree

3 files changed

+122
-0
lines changed

3 files changed

+122
-0
lines changed

src/cbmc/bmc.h

+5
Original file line numberDiff line numberDiff line change
@@ -39,6 +39,11 @@ Author: Daniel Kroening, [email protected]
3939

4040
class cbmc_solverst;
4141

42+
/// \brief Bounded model checking or path exploration for goto-programs
43+
///
44+
/// Higher-level architectural information on symbolic execution is
45+
/// documented in the \ref symex-overview
46+
/// "Symbolic execution module page".
4247
class bmct:public safety_checkert
4348
{
4449
public:

src/goto-symex/goto_symex.h

+4
Original file line numberDiff line numberDiff line change
@@ -38,6 +38,10 @@ class side_effect_exprt;
3838
class typecast_exprt;
3939

4040
/// \brief The main class for the forward symbolic simulator
41+
///
42+
/// Higher-level architectural information on symbolic execution is
43+
/// documented in the \ref symex-overview
44+
/// "Symbolic execution module page".
4145
class goto_symext
4246
{
4347
public:

src/goto-symex/module.md

+113
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,10 @@
33

44
\author Kareem Khazem
55

6+
\section symbolic-execution Symbolic Execution
7+
8+
In the \ref goto-symex directory.
9+
610
**Key classes:**
711
* goto_symex_statet
812
* goto_symext
@@ -19,6 +23,115 @@ digraph G {
1923
}
2024
\enddot
2125

26+
\subsection symex-overview Overview
27+
28+
The \ref bmct class gets a reference to an \ref symex_target_equationt
29+
"equation" (initially, an empty list of \ref symex_target_equationt::SSA_stept
30+
"single-static assignment steps") and a goto-program from the frontend.
31+
The \ref bmct creates a goto_symext to symbolically execute the
32+
goto-program, thereby filling the equation, which can then be passed
33+
along to the SAT solver.
34+
35+
The symbolic execution state is maintained by goto_symex_statet. This is
36+
a memory-heavy data structure, so goto_symext creates it on-demand and
37+
lets it go out-of-scope as soon as possible. However, the process of
38+
symbolic execution fills out an additional \ref symbol_tablet
39+
"symbol table" of dynamically-created objects; this symbol table is
40+
needed when solving the equation. This symbol table must thus be
41+
exported out of the state before it is torn down; this is done through
42+
the parameter "`new_symbol_table`" passed as a reference to the various
43+
functions in goto_symext.
44+
45+
The main symbolic execution loop code is goto_symext::symex_step(). This
46+
function case-switches over the type of the instruction that we're
47+
currently executing, and calls various other functions appropriate to
48+
the instruction type, i.e. goto_symext::symex_function_call() if the
49+
current instruction is a function call, goto_symext::symex_goto() if the
50+
current instruction is a goto, etc.
51+
52+
\subsection symex-path Path exploration
53+
54+
By default, CBMC symbolically executes the entire program, building up
55+
an equation representing all instructions, which it then passes to the
56+
solver. Notably, it _merges_ paths that diverge due to a goto
57+
instruction, forming a constraint that represents having taken either of
58+
the two paths; the code for doing this is goto_symext::merge_goto().
59+
60+
CBMC can operate in a different mode when the `--paths` flag is passed
61+
on the command line. This disables path merging; instead, CBMC executes
62+
a single path at a time, calling the solver with the equation
63+
representing that path, then continues to execute other paths.
64+
The 'other paths' that would have been taken when the program branches
65+
are saved onto a workqueue so that CBMC can continue to execute the
66+
current path, and then later retrieve saved paths from the workqueue.
67+
Implementation-wise, \ref bmct passes a worklist to goto_symext in
68+
bmct::do_language_agnostic_bmc(). If path exploration is enabled,
69+
goto_symext will fill up the worklist whenever it encounters a branch,
70+
instead of merging the paths on the branch. After the initial symbolic
71+
execution (i.e. the first call to bmct::run() in
72+
bmct::do_language_agnostic_bmc()), \ref bmct continues popping the
73+
worklist and executing untaken paths until the worklist empties. Note
74+
that this means that the default model-checking behaviour is a special
75+
case of path exploration: when model-checking, the initial symbolic
76+
execution run does not add any paths to the workqueue but rather merges
77+
all the paths together, so the additional path-exploration loop is
78+
skipped over.
79+
80+
\subsection ssa-renaming SSA renaming levels
81+
82+
In goto-programs, variable names get a prefix to indicate their scope
83+
(like `main::1::%foo` or whatever). At symbolic execution level, variables
84+
also get a _suffix_ because we’re doing single-static assignment. There
85+
are three “levels” of renaming. At Level 2, variables are renamed every
86+
time they are encountered in a function. At Level 1, variables are
87+
renamed every time the functions that contain those variables are
88+
called. At Level 0, variables are renamed every time a new thread
89+
containing those variables are spawned. We can inspect the renamed
90+
variable names with the –show-vcc flag, which prints a string with the
91+
following format: `%%s!%%d@%%d#%%d`. The string field is the variable name,
92+
and the numbers after the !, @, and # are the L0, L1, and L2 suffixes
93+
respectively. The following examples illustrate Level 1 and 2 renaming:
94+
95+
$ cat l1.c
96+
int main() {
97+
int x=7;
98+
x=8;
99+
assert(0);
100+
}
101+
$ cbmc --show-vcc l1.c
102+
...
103+
{-12} x!0@1#2 == 7
104+
{-13} x!0@1#3 == 8
105+
106+
That is, the L0 names for both xs are 0; the L1 names for both xs are 1;
107+
but each occurrence of x within main() gets a fresh L2 suffix (2 and 3,
108+
respectively).
109+
110+
$ cat l2.c
111+
void foo(){
112+
int x=7;
113+
x=8;
114+
x=9;
115+
}
116+
int main(){
117+
foo();
118+
foo();
119+
assert(0);
120+
}
121+
$ cbmc --show-vcc l2.c
122+
...
123+
{-12} x!0@1#2 == 7
124+
{-13} x!0@1#3 == 8
125+
{-14} x!0@1#4 == 9
126+
{-15} x!0@2#2 == 7
127+
{-16} x!0@2#3 == 8
128+
{-17} x!0@2#4 == 9
129+
130+
That is, every time we enter function foo, the L1 counter of x is
131+
incremented (from 1 to 2) and the L0 counter is reset (back to 2, after
132+
having been incremented up to 4). The L0 counter then increases every
133+
time we access x as we walk through the function.
134+
22135
---
23136
\section counter-example-production Counter Example Production
24137

0 commit comments

Comments
 (0)