Skip to content
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Commit 472f0ac

Browse files
committedMar 22, 2018
Populating module documentation md files
by distributiong M. Brain's brief module descriptions from the old wiki
1 parent 2154f54 commit 472f0ac

File tree

24 files changed

+423
-544
lines changed

24 files changed

+423
-544
lines changed
 

‎doc/architectural/cprover-architecture-overview.md

Lines changed: 51 additions & 497 deletions
Large diffs are not rendered by default.

‎src/analyses/README.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,10 @@
11
\ingroup module_hidden
22
\defgroup analyses analyses
3+
34
# Folder analyses
45

6+
This contains the abstract interpretation framework `ai.h` and several
7+
static analyses that instantiate it.
8+
59
FIXME: put here a good introduction describing what is contained
610
in this folder.

‎src/ansi-c/README.md

Lines changed: 41 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -2,30 +2,26 @@
22
\defgroup ansi-c ansi-c
33
# Folder ansi-c
44

5-
\author Kareem Khazem
5+
\author Kareem Khazem, Martin Brain
66

7-
CodeWarrior C Compilers Reference 3.2:
8-
9-
http://cache.freescale.com/files/soft_dev_tools/doc/ref_manual/CCOMPILERRM.pdf
7+
\section overview Overview
108

11-
http://cache.freescale.com/files/soft_dev_tools/doc/ref_manual/ASMX86RM.pdf
9+
Contains the front-end for ANSI C, plus a variety of common extensions.
10+
This parses the file, performs some basic sanity checks (this is one
11+
area in which the UI could be improved; patches most welcome) and then
12+
produces a goto-program (see below). The parser is a traditional Flex /
13+
Bison system.
1214

13-
ARM 4.1 Compiler Reference:
15+
`internal_addition.c` contains the implementation of various ‘magic’
16+
functions that are that allow control of the analysis from the source
17+
code level. These include assertions, assumptions, atomic blocks, memory
18+
fences and rounding modes.
1419

15-
http://infocenter.arm.com/help/topic/com.arm.doc.dui0491c/DUI0491C_arm_compiler_reference.pdf
16-
17-
18-
Parsing performance considerations:
19-
20-
* Measured on trunk/regression/ansi-c/windows_h_VS_2012/main.i
21-
22-
* 13%: Copying into i_preprocessed
23-
24-
* 5%: ansi_c_parser.read()
25-
26-
* 53%: yyansi_clex()
27-
28-
* 29%: parser (without typechecking)
20+
The `library/` subdirectory contains versions of some of the C standard
21+
header files that make use of the CPROVER built-in functions. This
22+
allows CPROVER programs to be ‘aware’ of the functionality and model it
23+
correctly. Examples include `stdio.c`, `string.c`, `setjmp.c` and
24+
various threading interfaces.
2925

3026
\section preprocessing Preprocessing & Parsing
3127

@@ -48,8 +44,6 @@ digraph G {
4844
\enddot
4945

5046

51-
52-
---
5347
\section type-checking Type-checking
5448

5549
In the \ref ansi-c and \ref java_bytecode directories.
@@ -136,3 +130,28 @@ called symbols. Thus, for example:
136130
parameter and return types of the function. The value of the symbol is
137131
the function's body (a \ref codet), and the symbol is stored in the
138132
symbol table with `foo` as the key.
133+
134+
135+
\section performance Parsing performance considerations
136+
137+
* Measured on trunk/regression/ansi-c/windows_h_VS_2012/main.i
138+
139+
* 13%: Copying into i_preprocessed
140+
141+
* 5%: ansi_c_parser.read()
142+
143+
* 53%: yyansi_clex()
144+
145+
* 29%: parser (without typechecking)
146+
147+
\section references Compiler References
148+
149+
CodeWarrior C Compilers Reference 3.2:
150+
151+
http://cache.freescale.com/files/soft_dev_tools/doc/ref_manual/CCOMPILERRM.pdf
152+
153+
http://cache.freescale.com/files/soft_dev_tools/doc/ref_manual/ASMX86RM.pdf
154+
155+
ARM 4.1 Compiler Reference:
156+
157+
http://infocenter.arm.com/help/topic/com.arm.doc.dui0491c/DUI0491C_arm_compiler_reference.pdf

‎src/big-int/README.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,10 @@
11
\ingroup module_hidden
22
\defgroup big-int big-int
3+
34
# Folder big-int
45

6+
\author Martin Brain
7+
58
CPROVER is distributed with its own multi-precision arithmetic library;
69
mainly for historical and portability reasons. The library is externally
710
developed and thus `big-int` contains the source as it is distributed.

‎src/cbmc/README.md

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,4 +2,9 @@
22
\defgroup cbmc cbmc
33
# Folder CBMC
44

5-
The CBMC handles the code related to interacting with CBMC.
5+
This contains the first full application. CBMC is a bounded model
6+
checker that uses the front ends (`ansi-c`, `cpp`, goto-program or
7+
others) to create a goto-program, `goto-symex` to unwind the loops the
8+
given number of times and to produce and equation system and finally
9+
`solvers` to find a counter-example (technically, `goto-symex` is then
10+
used to construct the counter-example trace).

‎src/cpp/README.md

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,17 @@
11
\ingroup module_hidden
22
\defgroup cpp cpp
3+
34
# Folder cpp
45

5-
The C++ Language front-end is for processing C++.
6+
\author Martin Brain
7+
8+
This directory contains the C++ front-end. It supports the subset of C++
9+
commonly found in embedded and system applications. Consequentially it
10+
doesn’t have full support for templates and many of the more advanced
11+
and obscure C++ features. The subset of the language that can be handled
12+
is being extended over time so bug reports of programs that cannot be
13+
parsed are useful.
14+
15+
The functionality is very similar to the ANSI C front end; parsing the
16+
code and converting to goto-programs. It makes use of code from
17+
`langapi` and `ansi-c`.

‎src/goto-analyzer/README.md

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,8 @@
11
\ingroup module_hidden
22
\defgroup goto-analyzer goto-analyzer
3+
34
# Folder goto-analyzer
45

5-
`goto-analyzer/` is a module stores information related to interacting with
6-
goto-analyzer. These files are medium risk to change and change frequently.
6+
`goto-analyzer/` is a tool performing static analyses on goto
7+
programs. It provides the front end for many of the static analyses
8+
in the \ref analyses directory.

‎src/goto-cc/README.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,10 @@
11
\ingroup module_hidden
22
\defgroup goto-cc goto-cc
3+
34
# Folder goto-cc
45

6+
\author Martin Brain
7+
58
`goto-cc` is a compiler replacement that just performs the first step of
69
the process; converting C or C++ programs to goto-binaries. It is
710
intended to be dropped in to an existing build procedure in place of the
@@ -11,3 +14,4 @@ the `goto-cc/` binary. If it is called `goto-cc` then it emulates GCC
1114
flags, `goto-armcc` emulates the ARM compiler, `goto-cl` emulates VCC
1215
and `goto-cw` emulates the Code Warrior compiler. The output of this
1316
tool can then be used with `cbmc` or `goto-instrument`.
17+

‎src/goto-diff/README.md

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,9 @@
11
\ingroup module_hidden
22
\defgroup goto-diff goto-diff
3+
34
# Folder goto-diff
45

6+
`goto-diff/` is a tool that offers functionality similar to the `diff`
7+
tool, but for GOTO programs.
8+
59

6-
`goto-diff/` is a module has files which change frequently and are medium
7-
risk.

‎src/goto-instrument/README.md

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,10 @@
11
\ingroup module_hidden
22
\defgroup goto-instrument goto-instrument
3+
34
# Folder goto-instrument
45

6+
\author Martin Brain
7+
58
The `goto-instrument/` directory contains a number of tools, one per
69
file, that are built into the `goto-instrument` program. All of them
710
take in a goto-program (produced by `goto-cc`) and either modify it or

‎src/goto-programs/README.md

Lines changed: 100 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,108 @@
11
\ingroup module_hidden
22
\defgroup goto-programs goto-programs
3+
34
# Folder goto-programs
45

6+
\author Kareem Khazem, Martin Brain
7+
8+
\section overview Overview
9+
Goto programs are the intermediate representation of the CPROVER tool
10+
chain. They are language independent and similar to many of the compiler
11+
intermediate languages. Section \ref goto-programs "goto-programs" describes the
12+
`goto_programt` and `goto_functionst` data structures in detail. However
13+
it useful to understand some of the basic concepts. Each function is a
14+
list of instructions, each of which has a type (one of 18 kinds of
15+
instruction), a code expression, a guard expression and potentially some
16+
targets for the next instruction. They are not natively in static
17+
single-assign (SSA) form. Transitions are nondeterministic (although in
18+
practise the guards on the transitions normally cover form a disjoint
19+
cover of all possibilities). Local variables have non-deterministic
20+
values if they are not initialised. Variables and data within the
21+
program is commonly one of three types (parameterised by width):
22+
`unsignedbv_typet`, `signedbv_typet` and `floatbv_typet`, see
23+
`util/std_types.h` for more information. Goto programs can be serialised
24+
in a binary (wrapped in ELF headers) format or in XML (see the various
25+
`_serialization` files).
26+
27+
The `cbmc` option `–show-goto-programs` is often a good starting point
28+
as it outputs goto-programs in a human readable form. However there are
29+
a few things to be aware of. Functions have an internal name (for
30+
example `c::f00`) and a ‘pretty name’ (for example `f00`) and which is
31+
used depends on whether it is internal or being presented to the user.
32+
The `main` method is the ‘logical’ main which is not necessarily the
33+
main method from the code. In the output `NONDET` is use to represent a
34+
nondeterministic assignment to a variable. Likewise `IF` as a beautified
35+
`GOTO` instruction where the guard expression is used as the condition.
36+
`RETURN` instructions may be dropped if they precede an `END_FUNCTION`
37+
instruction. The comment lines are generated from the `locationt` field
38+
of the `instructiont` structure.
39+
40+
`goto-programs/` is one of the few places in the CPROVER codebase that
41+
templates are used. The intention is to allow the general architecture
42+
of program and functions to be used for other formalisms. At the moment
43+
most of the templates have a single instantiation; for example
44+
`goto_functionst` and `goto_function_templatet` and `goto_programt` and
45+
`goto_program_templatet`.
46+
47+
\section data_structures Data Structures
48+
49+
FIXME: This text is partially outdated.
50+
51+
The common starting point for working with goto-programs is the
52+
`read_goto_binary` function which populates an object of
53+
`goto_functionst` type. This is defined in `goto_functions.h` and is an
54+
instantiation of the template `goto_functions_templatet` which is
55+
contained in `goto_functions_template.h`. They are wrappers around a map
56+
from strings to `goto_programt`’s and iteration macros are provided.
57+
Note that `goto_function_templatet` (no `s`) is defined in the same
58+
header as `goto_functions_templatet` and is gives the C type for the
59+
function and Boolean which indicates whether the body is available
60+
(before linking this might not always be true). Also note the slightly
61+
counter-intuitive naming; `goto_functionst` instances are the top level
62+
structure representing the program and contain `goto_programt` instances
63+
which represent the individual functions. At the time of writing
64+
`goto_functionst` is the only instantiation of the template
65+
`goto_functions_templatet` but other could be produced if a different
66+
data-structures / kinds of models were needed for functions.
67+
68+
`goto_programt` is also an instantiation of a template. In a similar
69+
fashion it is `goto_program_templatet` and allows the types of the guard
70+
and expression used in instructions to be parameterised. Again, this is
71+
currently the only use of the template. As such there are only really
72+
helper functions in `goto_program.h` and thus `goto_program_template.h`
73+
is probably the key file that describes the representation of (C)
74+
functions in the goto-program format. It is reasonably stable and
75+
reasonably documented and thus is a good place to start looking at the
76+
code.
77+
78+
An instance of `goto_program_templatet` is effectively a list of
79+
instructions (and inner template called `instructiont`). It is important
80+
to use the copy and insertion functions that are provided as iterators
81+
are used to link instructions to their predecessors and targets and
82+
careless manipulation of the list could break these. Likewise there are
83+
helper macros for iterating over the instructions in an instance of
84+
`goto_program_templatet` and the use of these is good style and strongly
85+
encouraged.
86+
87+
Individual instructions are instances of type `instructiont`. They
88+
represent one step in the function. Each has a type, an instance of
89+
`goto_program_instruction_typet` which denotes what kind of instruction
90+
it is. They can be computational (such as `ASSIGN` or `FUNCTION_CALL`),
91+
logical (such as `ASSUME` and `ASSERT`) or informational (such as
92+
`LOCATION` and `DEAD`). At the time of writing there are 18 possible
93+
values for `goto_program_instruction_typet` / kinds of instruction.
94+
Instructions also have a guard field (the condition under which it is
95+
executed) and a code field (what the instruction does). These may be
96+
empty depending on the kind of instruction. In the default
97+
instantiations these are of type `exprt` and `codet` respectively and
98+
thus covered by the previous discussion of `irept` and its descendents.
99+
The next instructions (remembering that transitions are guarded by
100+
non-deterministic) are given by the list `targets` (with the
101+
corresponding list of labels `labels`) and the corresponding set of
102+
previous instructions is get by `incoming_edges`. Finally `instructiont`
103+
have informational `function` and `location` fields that indicate where
104+
they are in the code.
5105

6-
\author Kareem Khazem
7106

8107
\section goto-conversion Goto Conversion
9108

‎src/goto-symex/README.md

Lines changed: 18 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,25 @@
11
\ingroup module_hidden
22
\defgroup goto-symex goto-symex
3-
# Folder goto-symex
43

4+
# Folder goto-symex
55

6-
\author Kareem Khazem
6+
\author Kareem Khazem, Martin Brain
7+
8+
This directory contains a symbolic evaluation system for goto-programs.
9+
This takes a goto-program and translates it to an equation system by
10+
traversing the program, branching and merging and unwinding loops as
11+
needed. Each reverse goto has a separate counter (the actual counting is
12+
handled by `cbmc`, see the `–unwind` and `–unwind-set` options). When a
13+
counter limit is reach, an assertion can be added to explicitly show
14+
when analysis is incomplete. The symbolic execution includes constant
15+
folding so loops that have a constant number of iterations will be
16+
handled completely (assuming the unwinding limit is sufficient).
17+
18+
The output of the symbolic execution is a system of equations; an object
19+
containing a list of `symex_target_elements`, each of which are
20+
equalities between `expr` expressions. See `symex_target_equation.h`.
21+
The output is in static, single assignment (SSA) form, which is *not*
22+
the case for goto-programs.
723

824
\section symbolic-execution Symbolic Execution
925

‎src/java_bytecode/README.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
\ingroup module_hidden
22
\defgroup java_bytecode java_bytecode
3+
34
# Folder java_bytecode
45

56

6-
This module provide a front end for Java.
7+
This module provides a bytecode-based front end for Java.

‎src/jsil/README.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
\ingroup module_hidden
22
\defgroup jsil jsil
3-
# Folder jsil
43

4+
# Folder jsil
55

6-
`jsil/` is a module that focuses on type checking.
6+
`jsil/` contains a JavaScript front end.

‎src/json/README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,4 +2,4 @@
22
\defgroup json json
33
# Folder json
44

5-
`json/` is a utility that processes json.
5+
`json/` contains a JSON parser.

‎src/langapi/README.md

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,12 @@
11
\ingroup module_hidden
22
\defgroup langapi langapi
3+
34
# Folder langapi
45

6+
\author Martin Brain
57

6-
`langapi/` is a language front end.
8+
`langapi/` contains the basic interfaces and support classes for programming
9+
language front ends. Developers only really need look at this if they
10+
are adding support for a new language. It’s main users are the
11+
language front-ends such as `ansi-c/` and
12+
`cpp/`.

‎src/linking/README.md

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,11 @@
11
\ingroup module_hidden
22
\defgroup linking linking
3+
34
# Folder linking
45

5-
linking docs: todo
6+
\author Martin Brain
7+
8+
This allows multiple ‘object
9+
files’ (goto-programs) to be linked into one ‘executable’ (another
10+
goto-program), thus allowing existing build systems to be used to build
11+
complete goto-program binaries.

‎src/memory-models/README.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
\ingroup module_hidden
22
\defgroup memory-models memory-models
3-
# Folder memory-models
43

4+
# Folder memory-models
55

6-
`memory-models` is a tool that works with memory.
6+
`memory-models` contains tools related to weak memory models.

‎src/miniz/README.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
\ingroup module_hidden
22
\defgroup miniz miniz
3-
Folder miniz
43

4+
# Folder miniz
55

6-
`miniz/` is a utility for minimizing things.
6+
`miniz/` contains a minimal ZIP compression library.

‎src/nonstd/README.md

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
\ingroup module_hidden
22
\defgroup nonstd nonstd
3-
# Folder nonstd
43

4+
# Folder nonstd
55

6-
`nonstd` is a utility.
6+
`nonstd` contains implementations of C++ utilities that are not yet
7+
part of the standard library, e.g. for `optional`.

‎src/pointer-analysis/README.md

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,12 @@
11
\ingroup module_hidden
22
\defgroup pointer-analysis pointer-analysis
3+
34
# Folder pointer-analysis
45

5-
`pointer analysis/` is for static analysis.
6+
\author Martin Brain
7+
8+
To perform symbolic execution on programs with dereferencing of
9+
arbitrary pointers, some alias analysis is needed. `pointer-analysis`
10+
contains the three levels of analysis; flow and context insensitive,
11+
context sensitive and flow and context sensitive. The code needed is
12+
subtle and sophisticated and thus there may be bugs.

‎src/solvers/README.md

Lines changed: 52 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -2,8 +2,9 @@
22
\defgroup solvers solvers
33
# Folder solvers
44

5+
\authors Romain Brenguier, Kareem Khazem, Martin Brain
56

6-
\authors Romain Brenguier, Kareem Khazem
7+
\section overview Overview
78

89
The basic role of solvers is to answer whether the set of equations given
910
is satisfiable.
@@ -16,7 +17,48 @@ The secondary role of solvers is to provide a satisfying assignment of
1617
the variables of the equations, this can for instance be used to construct
1718
a trace.
1819

19-
The most general solver in terms of supported equations is \ref solvers.
20+
The `solvers/` directory contains interfaces to a number of
21+
different decision procedures, roughly one per directory.
22+
23+
* prop/: The basic and common functionality. The key file is
24+
`prop_conv.h` which defines `prop_convt`. This is the base class that
25+
is used to interface to the decision procedures. The key functions are
26+
`convert` which takes an `exprt` and converts it to the appropriate,
27+
solver specific, data structures and `dec_solve` (inherited from
28+
`decision_proceduret`) which invokes the actual decision procedures.
29+
Individual decision procedures (named `*_dect`) objects can be created
30+
but `prop_convt` is the preferred interface for code that uses them.
31+
32+
* flattening/: A library that converts operations to bit-vectors,
33+
including calling the conversions in `floatbv` as necessary. Is
34+
implemented as a simple conversion (with caching) and then a
35+
post-processing function that adds extra constraints. This is not used
36+
by the SMT or CVC back-ends.
37+
38+
* dplib/: Provides the `dplib_dect` object which used the decision
39+
procedure library from “Decision Procedures : An Algorithmic Point of
40+
View”.
41+
42+
* cvc/: Provides the `cvc_dect` type which interfaces to the old (pre
43+
SMTLib) input format for the CVC family of solvers. This format is
44+
still supported by depreciated in favour of SMTLib 2.
45+
46+
* smt1/: Provides the `smt1_dect` type which converts the formulae to
47+
SMTLib version 1 and then invokes one of Boolector, CVC3, OpenSMT,
48+
Yices, MathSAT or Z3. Again, note that this format is depreciated.
49+
50+
* smt2/: Provides the `smt2_dect` type which functions in a similar
51+
way to `smt1_dect`, calling Boolector, CVC3, MathSAT, Yices or Z3.
52+
Note that the interaction with the solver is batched and uses
53+
temporary files rather than using the interactive command supported by
54+
SMTLib 2. With the `–fpa` option, this output mode will not flatten
55+
the floating point arithmetic and instead output the proposed SMTLib
56+
floating point standard.
57+
58+
* qbf/: Back-ends for a variety of QBF solvers. Appears to be no
59+
longer used or maintained.
60+
61+
* sat/: Back-ends for a variety of SAT solvers and DIMACS output.
2062

2163
\section sat-smt-encoding SAT/SMT Encoding
2264

@@ -302,3 +344,11 @@ This is done by generate_instantiations(messaget::mstreamt &stream, const namesp
302344
\copydetails check_axioms(const string_axiomst &axioms, string_constraint_generatort &generator, const std::function<exprt(const exprt &)> &get, messaget::mstreamt &stream, const namespacet &ns, std::size_t max_string_length, bool use_counter_example, ui_message_handlert::uit ui, const union_find_replacet &symbol_resolve)
303345
\link check_axioms(const string_axiomst &axioms, string_constraint_generatort &generator, const std::function<exprt(const exprt &)> &get, messaget::mstreamt &stream, const namespacet &ns, std::size_t max_string_length, bool use_counter_example, ui_message_handlert::uit ui, const union_find_replacet &symbol_resolve)
304346
(See function documentation...) \endlink
347+
348+
\section floatbv Floatbv Directory
349+
350+
This library contains the code that is used to convert floating point
351+
variables (`floatbv`) to bit vectors (`bv`). This is referred to as
352+
‘bit-blasting’ and is called in the `solver` code during conversion to
353+
SAT or SMT. It also contains the abstraction code described in the
354+
FMCAD09 paper.

‎src/util/README.md

Lines changed: 78 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,86 @@
11
\ingroup module_hidden
22
\defgroup util util
3+
34
# Folder util
45

6+
\author Martin Brain
7+
8+
\section data_structures Data Structures
9+
10+
This section discusses some of the key data-structures used in the
11+
CPROVER codebase.
12+
13+
\subsection irept Irept Data Structure
14+
15+
There are a large number of kind of tree structured or tree-like data in
16+
CPROVER. `irept` provides a single, unified representation for all of
17+
these, allowing structure sharing and reference counting of data. As
18+
such `irept` is the basic unit of data in CPROVER. Each `irept`
19+
contains[^2] a basic unit of data (of type `dt`) which contains four
20+
things:
21+
22+
* `data`: A string[^3], which is returned when the `id()` function is
23+
used.
24+
25+
* `named_sub`: A map from `irep_namet` (a string) to an `irept`. This
26+
is used for named children, i.e. subexpressions, parameters, etc.
27+
28+
* `comments`: Another map from `irep_namet` to `irept` which is used
29+
for annotations and other ‘non-semantic’ information
30+
31+
* `sub`: A vector of `irept` which is used to store ordered but
32+
unnamed children.
33+
34+
The `irept::pretty` function outputs the contents of an `irept` directly
35+
and can be used to understand an debug problems with `irept`s.
36+
37+
On their own `irept`s do not “mean” anything; they are effectively
38+
generic tree nodes. Their interpretation depends on the contents of
39+
result of the `id` function (the `data`) field. `util/irep_ids.txt`
40+
contains the complete list of `id` values. During the build process it
41+
is used to generate `util/irep_ids.h` which gives constants for each id
42+
(named `ID_`). These can then be used to identify what kind of data
43+
`irept` stores and thus what can be done with it.
44+
45+
To simplify this process, there are a variety of classes that inherit
46+
from `irept`, roughly corresponding to the ids listed (i.e. `ID_or`
47+
(the string `"or”`) corresponds to the class `or_exprt`). These give
48+
semantically relevant accessor functions for the data; effectively
49+
different APIs for the same underlying data structure. None of these
50+
classes add fields (only methods) and so static casting can be used. The
51+
inheritance graph of the subclasses of `irept` is a useful starting
52+
point for working out how to manipulate data.
53+
54+
There are three main groups of classes (or APIs); those derived from
55+
`typet`, `codet` and `exprt` respectively. Although all of these inherit
56+
from `irept`, these are the most abstract level that code should handle
57+
data. If code is manipulating plain `irept`s then something is wrong
58+
with the architecture of the code.
59+
60+
Many of the key descendent of `exprt` are declared in `std_expr.h`. All
61+
expressions have a named subfield / annotation which gives the type of
62+
the expression (slightly simplified from C/C++ as `unsignedbv_typet`,
63+
`signedbv_typet`, `floatbv_typet`, etc.). All type conversions are
64+
explicit with an expression with `id() == ID_typecast` and an ‘interface
65+
class’ named `typecast_exprt`. One key descendent of `exprt` is
66+
`symbol_exprt` which creates `irept` instances with the id of “symbol”.
67+
These are used to represent variables; the name of which can be found
68+
using the `get_identifier` accessor function.
69+
70+
`codet` inherits from `exprt` and is defined in `std_code.h`. They
71+
represent executable code; statements in C rather than expressions. In
72+
the front-end there are versions of these that hold whole code blocks,
73+
but in goto-programs these have been flattened so that each `irept`
74+
represents one sequence point (almost one line of code / one
75+
semi-colon). The most common descendents of `codet` are `code_assignt`
76+
so a common pattern is to cast the `codet` to an assignment and then
77+
recurse on the expression on either side.
78+
79+
[^2]: Or references, if reference counted data sharing is enabled. It is
80+
enabled by default; see the `SHARING` macro.
581

6-
`util/` is filled with useful tools.
82+
[^3]: Unless `USE_STD_STRING` is set, this is actually
83+
a `dstring` and thus an integer which is a reference into a string table
784

885
\section command-line-parsing Command Line Parsing
986

‎src/xmllang/README.md

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,16 @@
11
\ingroup module_hidden
22
\defgroup xmllang xmllang
3+
34
# Folder xmllang
45

6+
\author Martin Brain
7+
58
`xmllang` is a utility for converting to XML.
69

10+
CPROVER has optional XML output for results and there is an XML format
11+
for goto-programs. It is used to interface to various IDEs. The
12+
`xmllang/` directory contains the parser and helper functions for
13+
handling this format.
14+
715
Based on grammar/tokenizer from http://www.w3.org/XML/9707/xml-in-c.tar.gz
816
also see http://www.w3.org/XML/9707/XML-in-C

0 commit comments

Comments
 (0)
Please sign in to comment.