Skip to content

Commit 7233f92

Browse files
author
Owen Jones
committed
Rearrange everything into separate pages
1 parent 1cb3cdd commit 7233f92

13 files changed

+442
-438
lines changed

doc/architectural/CBMC-developer-guide.md

Lines changed: 0 additions & 432 deletions
This file was deleted.
Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,28 @@
1+
\ingroup module_hidden
2+
\page background-concepts Background Concepts
3+
4+
\author Martin Brain, Peter Schrammel
5+
6+
## AST: types, globals, variables, functions, code blocks, language primitives, assignments, expressions, variables ##
7+
8+
To be documented.
9+
10+
## CFG ##
11+
12+
To be documented.
13+
14+
## Bounded model checking (from the CBMC manual) ##
15+
16+
To be documented.
17+
18+
### SSA ###
19+
20+
To be documented.
21+
22+
## SAT and SMT ##
23+
24+
To be documented.
25+
26+
## Static analysis ##
27+
28+
To be documented.

doc/architectural/bmct-class.md

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
\ingroup module_hidden
2+
\page bmct-class Bmct class
3+
4+
\author
5+
6+
## equation ##
7+
8+
To be documented.
Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,59 @@
1+
\ingroup module_hidden
2+
\page compilation-and-development Compilation and Development
3+
4+
\author Martin Brain, Peter Schrammel
5+
6+
# Compilation and Development
7+
8+
## Makefiles ##
9+
10+
First off, read the \ref cbmc-user-manual "CBMC User Manual". It describes
11+
how to get, build and use CBMC. This document covers the
12+
internals of the system and how to get started on development.
13+
14+
## CMake files ##
15+
16+
To be documented.
17+
18+
## Personal configuration: config.inc, macro DEBUG ##
19+
20+
To be documented.
21+
22+
## Generating doxygen documentation ##
23+
24+
Run `doxygen` in `/src`. HTML output will be created in `/doc/html`. The
25+
index page is `/doc/html/index.html`.
26+
27+
Apart from the (user-orientated) CBMC user manual and this document, most
28+
of the rest of the documentation is inline in the code as `doxygen` and
29+
some comments. A man page for CBMC, goto-cc and goto-instrument is
30+
contained in the `doc/` directory and gives some options for these
31+
tools. All of these could be improved and patches are very welcome. In
32+
some cases the algorithms used are described in the relevant papers.
33+
34+
## Coding standards ##
35+
36+
See <a
37+
href="https://github.com/diffblue/cbmc/blob/master/CODING_STANDARD.md">
38+
`CODING_STANDARD.md`</a> file in the root of the CBMC repository.
39+
40+
CPROVER is written in a fairly minimalist subset of C++; templates and
41+
meta-programming are avoided except where necessary.
42+
External library dependencies are avoided (only STL and a SAT solver
43+
are required). Boost is not used. The `util` directory contains many
44+
utilities that are not (yet) in the standard library.
45+
46+
Patches should be formatted so that code is indented with two space
47+
characters, not tab and wrapped to 80 columns. Headers for doxygen
48+
should be given (and preferably filled!) and the author will be the
49+
person who first created the file. Add doxygen comments to
50+
undocumented functions as you touch them. Coding standards
51+
and doxygen comments are enforced by CI before a patch can be
52+
merged by running `clang-format` and `cpplint`.
53+
54+
Identifiers should be lower case with underscores to separate words.
55+
Types (classes, structures and typedefs) names must end with a `t`.
56+
Types that model types (i.e. C types in the program that is being
57+
interpreted) are named with `_typet`. For example `ui_message_handlert`
58+
rather than `UI_message_handlert` or `UIMessageHandler` and
59+
`union_typet`.
Lines changed: 96 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,96 @@
1+
\ingroup module_hidden
2+
\page cprover-architecture CPROVER Architecture
3+
4+
\author Martin Brain, Peter Schrammel
5+
6+
This section provides a graphical overview of how CBMC fits together.
7+
CBMC takes C code or a goto-binary as input and tries to emit traces
8+
of executions that lead to crashes or undefined behaviour. The diagram
9+
below shows the intermediate steps in this process.
10+
11+
\dot
12+
digraph G {
13+
14+
rankdir="TB";
15+
node [shape=box, fontcolor=blue];
16+
17+
subgraph top {
18+
rank=same;
19+
1 -> 2 -> 3 -> 4;
20+
}
21+
22+
subgraph bottom {
23+
rank=same;
24+
5 -> 6 -> 7 -> 8 -> 9;
25+
}
26+
27+
/* shift bottom subgraph over */
28+
9 -> 1 [color=white];
29+
30+
4 -> 5;
31+
32+
1 [label="command line\nparsing" URL="\ref cbmc_parse_optionst"];
33+
2 [label="preprocessing,\nparsing" URL="\ref preprocessing"];
34+
3 [label="language\ntype-checking" URL="\ref type-checking"];
35+
4 [label="goto\nconversion" URL="\ref goto-conversion"];
36+
5 [label="instrumentation" URL="\ref instrumentation"];
37+
6 [label="symbolic\nexecution" URL="\ref symbolic-execution"];
38+
7 [label="SAT/SMT\nencoding" URL="\ref sat-smt-encoding"];
39+
8 [label="decision\nprocedure" URL="\ref decision-procedure"];
40+
9 [label="counter example\nproduction" URL="\ref counter-example-production"];
41+
}
42+
\enddot
43+
44+
The \ref cbmc-user-manual "CBMC User Manual" describes CBMC from a user
45+
perspective. Each node in the diagram above links to the appropriate
46+
class or module documentation, describing that particular stage in the
47+
CBMC pipeline.
48+
CPROVER is structured in a similar fashion to a compiler. It has
49+
language specific front-ends which perform limited syntactic analysis
50+
and then convert to an intermediate format. The intermediate format can
51+
be output to files (this is what `goto-cc` does) and are (informally)
52+
referred to as “goto binaries” or “goto programs”. The back-end are
53+
tools process this format, either directly from the front-end or from
54+
it’s saved output. These include a wide range of analysis and
55+
transformation tools (see \ref section-other-tools).
56+
57+
# Concepts #
58+
## {C, java bytecode} → Parse tree → Symbol table → GOTO programs → GOTO program transformations → BMC → counterexample (goto_tracet) → printing ##
59+
60+
To be documented.
61+
62+
## Instrumentation (for test gen & CBMC & ...): Goto functions -> goto functions ##
63+
64+
To be documented.
65+
66+
## Goto functions -> BMC -> Counterexample (trace) ##
67+
68+
To be documented.
69+
70+
## Trace -> interpreter -> memory map ##
71+
72+
To be documented.
73+
74+
## Goto functions -> abstract interpretation ##
75+
76+
To be documented.
77+
78+
## Executables (flow of transformations): ##
79+
80+
To be documented.
81+
82+
### goto-cc ###
83+
84+
To be documented.
85+
86+
### goto-instrument ###
87+
88+
To be documented.
89+
90+
### cbmc ###
91+
92+
To be documented.
93+
94+
### goto-analyzer ###
95+
96+
To be documented.
Lines changed: 105 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,105 @@
1+
\ingroup module_hidden
2+
\page data-structures-core-structures-and-ast Data structures: core structures and AST
3+
4+
\author Martin Brain, Peter Schrammel, Owen Jones
5+
6+
## Strings: dstringt, the string_container and the ID_* ##
7+
Within cbmc, strings are represented using `irep_idt`. By default this is
8+
typedefed to \ref dstringt, which stores a string as an index into a large
9+
static table of strings. This makes it easy to compare if two `irep_idt`s
10+
are equal (just compare the index) and it makes it efficient to store
11+
many copies of the same string. The static list of strings is initially populated
12+
from `irep_ids.def`, so for example the fourth entry in `irep_ids.def` is
13+
“IREP_ID_ONE(type)”, so the string “type” has index 3. You can refer to
14+
this `irep_idt` as `ID_type`. The other kind of line you see is
15+
“IREP_ID_TWO(C_source_location, #source_location)”, which means the
16+
`irep_idt` for the string “#source_location” can be referred to as
17+
`ID_C_source_location`. The “C” is for comment, which will be explained
18+
in the next section. Any strings that need to be stored as `irep_id`s which
19+
aren't in `irep_ids.def` are added to the end of the table when they are
20+
first encountered, and the same index is used for all instances.
21+
22+
See documentation at \ref dstringt.
23+
24+
## irept: a 4-triple (data, named-sub, comments, sub) ##
25+
See documentation at \ref irept.
26+
27+
As that documentation says, `irept`s are generic tree nodes. You should
28+
think of them as having a single string (data, actually an `irep_idt`) and
29+
lots of child nodes, some of which are numbered (sub) and some of which are
30+
labelled, and the label can either start with a “#” (comments-sub) or without
31+
one (named-sub). The meaning of the “#” is that this child should not be
32+
considered important, for example it shouldn’t be counted when comparing two
33+
`irept`s for equality.
34+
35+
## typet ##
36+
37+
To be documented.
38+
39+
### symbol_typet ###
40+
41+
To be documented.
42+
43+
## exprt ##
44+
\ref exprt is the class to represent an expression. It inherits from \ref irept,
45+
and the only things it adds to it are that every \ref exprt has a named sub
46+
containing its type and everything in the sub of an \ref exprt is again an
47+
\ref exprt, not just an \ref irept. You can think of \ref exprt as a node in the
48+
abstract syntax tree for an expression. There are many classes that
49+
inherit from \ref exprt and which represent more specific things. For
50+
example, there is \ref minus_exprt, which has a sub of size 2 (for the two
51+
argument of minus).
52+
53+
Recall that every \ref irept has one piece of data of its own, i.e. its
54+
`id()`, and all other information is in its `named_sub`, `comments` or
55+
`sub`. For `exprt`s, the `id()` is used to specify why kind of \ref exprt this is,
56+
so a \ref minus_exprt has `ID_minus` as its `id()`. This means that a
57+
\ref minus_exprt can be passed wherever an \ref exprt is expected, and if you
58+
want to check if the expression you are looking at is a minus
59+
expression then you have to check its `id()` (or use
60+
`can_cast_expr<minus_exprt>`).
61+
62+
## codet ##
63+
\ref exprt represents expressions and \ref codet represents statements. \ref codet
64+
inherits from \ref exprt, so all `codet`s are `exprt`s, with `id()` `ID_code`.
65+
Many different kinds of statements inherit from \ref codet, and they are
66+
distinguished by their `statement()`. For example, a while loop would be
67+
represented by a \ref code_whilet, which has `statement()` `ID_while`.
68+
\ref code_whilet has two operands in its sub, and helper functions to make
69+
it easier to use: `cond()` returns the first subexpression, which
70+
is the condition for the while loop, and `body()` returns the second
71+
subexpression, which is the body of the while loop - this has to be
72+
a \ref codet, because the body of a while loop is a statement.
73+
74+
## symbolt, symbol_table, and namespacet ##
75+
76+
To be documented.
77+
78+
### Symbol lifetimes, symbol modes, name, base-name, pretty-name; semantic of lifetimes for symex? ###
79+
80+
To be documented.
81+
82+
### Storing symbols and hiding symbols (namespacet) ###
83+
84+
To be documented.
85+
86+
### ns.follow ##
87+
88+
To be documented.
89+
90+
91+
## Examples: how to represent the AST of statements, focus on java ##
92+
93+
To be documented.
94+
95+
### struct Array { int length, int *data }; ###
96+
97+
To be documented.
98+
99+
### x = y + 123 ###
100+
101+
To be documented.
102+
103+
### if (x > 10) { y = 2 } else { v[3] = 4 } ###
104+
105+
To be documented.
Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,52 @@
1+
\ingroup module_hidden
2+
\page data-structures-from-ast-to-goto-program Data structures: from AST to GOTO program
3+
4+
\author Martin Brain, Peter Schrammel
5+
6+
## goto_programt ##
7+
8+
To be documented.
9+
10+
### The CFG of a function ###
11+
12+
To be documented.
13+
14+
### instructiont ###
15+
16+
See documentation at \ref instructiont.
17+
18+
#### Types, motivation of each type (dead?) #####
19+
20+
To be documented.
21+
22+
#### Accepted code (codet) values ####
23+
24+
To be documented.
25+
26+
#### Accepted guard (exprt) values ####
27+
28+
To be documented.
29+
30+
## goto_functionst ##
31+
32+
To be documented.
33+
34+
### A map from function names to function bodies (CFGs) ###
35+
36+
To be documented.
37+
38+
## goto_modelt ##
39+
40+
To be documented.
41+
42+
### A compilation unit ###
43+
44+
To be documented.
45+
46+
## Example: ##
47+
48+
To be documented.
49+
50+
### Unsigned mult (unsigned a, unsigned b) { int acc, i; for (i = 0; i < b; i++) acc += a; return acc; } ###
51+
52+
To be documented.
Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,40 @@
1+
\ingroup module_hidden
2+
\page front-end-languages-generating-codet-from-multiple-languages Front-end languages: generating codet from multiple languages
3+
4+
\author Martin Brain, Peter Schrammel
5+
6+
## Front-end languages: generating codet from multiple languages ##
7+
8+
To be documented.
9+
10+
### language_uit, language_filest, languaget classes: ###
11+
12+
To be documented.
13+
14+
### Purpose ###
15+
16+
To be documented.
17+
18+
### Parse ###
19+
20+
To be documented.
21+
22+
### Typecheck ###
23+
24+
To be documented.
25+
26+
### Final ###
27+
28+
To be documented.
29+
30+
## Java bytecode: ##
31+
32+
To be documented.
33+
34+
### Explain how a java program / class is represented in a .class ###
35+
36+
To be documented.
37+
38+
### Explain the 2 step conversion from bytecode to codet; give an example with a class? ###
39+
40+
To be documented.

0 commit comments

Comments
 (0)