3
3
4
4
# Folder util
5
5
6
- \author Martin Brain
6
+ \author Martin Brain, Owen Jones
7
7
8
8
\section data_structures Data Structures
9
9
@@ -12,17 +12,17 @@ CPROVER codebase.
12
12
13
13
\subsection irept Irept Data Structure
14
14
15
- There are a large number of kind of tree structured or tree-like data in
15
+ There are a large number of kinds of tree structured or tree-like data in
16
16
CPROVER. ` irept ` provides a single, unified representation for all of
17
17
these, allowing structure sharing and reference counting of data. As
18
18
such ` irept ` is the basic unit of data in CPROVER. Each ` irept `
19
- contains[ ^ 2 ] a basic unit of data (of type ` dt ` ) which contains four
19
+ contains[ ^ 1 ] a basic unit of data (of type ` dt ` ) which contains four
20
20
things:
21
21
22
- * ` data ` : A string[ ^ 3 ] , which is returned when the ` id() ` function is
22
+ * ` data ` : A string[ ^ 2 ] , which is returned when the ` id() ` function is
23
23
used.
24
24
25
- * ` named_sub ` : A map from ` irep_namet ` (a string) to an ` irept ` . This
25
+ * ` named_sub ` : A map from ` irep_namet ` (a string) to ` irept ` . This
26
26
is used for named children, i.e. subexpressions, parameters, etc.
27
27
28
28
* ` comments ` : Another map from ` irep_namet ` to ` irept ` which is used
@@ -32,7 +32,7 @@ things:
32
32
unnamed children.
33
33
34
34
The ` irept::pretty ` function outputs the contents of an ` irept ` directly
35
- and can be used to understand an debug problems with ` irept ` s.
35
+ and can be used to understand and debug problems with ` irept ` s.
36
36
37
37
On their own ` irept ` s do not “mean” anything; they are effectively
38
38
generic tree nodes. Their interpretation depends on the contents of
@@ -67,21 +67,34 @@ class’ named `typecast_exprt`. One key descendent of `exprt` is
67
67
These are used to represent variables; the name of which can be found
68
68
using the ` get_identifier ` accessor function.
69
69
70
- ` codet ` inherits from ` exprt ` and is defined in ` std_code.h ` . They
71
- represent executable code; statements in C rather than expressions. In
70
+ ` codet ` inherits from ` exprt ` and is defined in ` std_code.h ` . It
71
+ represents executable code; statements in C rather than expressions. In
72
72
the front-end there are versions of these that hold whole code blocks,
73
73
but in goto-programs these have been flattened so that each ` irept `
74
74
represents one sequence point (almost one line of code / one
75
75
semi-colon). The most common descendents of ` codet ` are ` code_assignt `
76
76
so a common pattern is to cast the ` codet ` to an assignment and then
77
77
recurse on the expression on either side.
78
78
79
- [ ^ 2 ] : Or references, if reference counted data sharing is enabled. It is
79
+ [ ^ 1 ] : Or references, if reference counted data sharing is enabled. It is
80
80
enabled by default; see the ` SHARING ` macro.
81
81
82
- [ ^ 3 ] : Unless ` USE_STD_STRING ` is set, this is actually
82
+ [ ^ 2 ] : Unless ` USE_STD_STRING ` is set, this is actually
83
83
a ` dstring ` and thus an integer which is a reference into a string table
84
84
85
+ \subsection irep_idt Irep_idt and Dstringt
86
+
87
+ Inside ` irept ` s, strings are stored as ` irep_idt ` s, or ` irep_namet ` s for
88
+ keys to ` named_sub ` or ` comments ` . If ` USE_STD_STRING ` is set then both
89
+ ` irep_idt ` and ` irep_namet ` are ` typedef ` ed to ` std::string ` , but by default
90
+ it is not set and they are ` typedef ` ed to ` dstringt ` . ` dstringt ` has one
91
+ field, an unsigned integer which is an index into a static table of strings.
92
+ This makes it expensive to create a new string (because you have to look
93
+ through the whole table to see if it is already there, and add it if it
94
+ isn't) but very cheap to compare strings (just compare the two integers). It
95
+ also means that when you have lots of copies of the same string you only have
96
+ to store the whole string once, which saves space.
97
+
85
98
\dot
86
99
digraph G {
87
100
node [ shape=box] ;
0 commit comments