Skip to content

Commit 849f256

Browse files
Merge pull request diffblue#2770 from smowton/smowton/docs/symbol-table
Document symbol table and namespace
2 parents 5e8308f + b227eee commit 849f256

File tree

1 file changed

+97
-13
lines changed

1 file changed

+97
-13
lines changed

src/util/README.md

Lines changed: 97 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -148,19 +148,103 @@ of a while loop is a statement.
148148

149149
\subsection symbolt_section symbolt, symbol_tablet, and namespacet
150150

151-
To be documented.
152-
153-
\subsubsection symbol_lifetimes_section Symbol lifetimes, symbol modes, name, base-name, pretty-name; semantic of lifetimes for symex?
154-
155-
To be documented.
156-
157-
\subsubsection storing_symbols_section Storing symbols and hiding symbols (namespacet)
158-
159-
To be documented.
160-
161-
\subsubsection ns_follow_section ns.follow
162-
163-
To be documented.
151+
A symbol table is a mapping from symbol names to \ref symbolt objects, which
152+
store a symbol's name, attributes, type and perhaps value. They are used to
153+
describe local and global variables, type definitions and function prototypes
154+
and definitions.
155+
156+
All symbols store a type (an instance of \ref typet). For function or method
157+
symbols these are \ref code_typet instances.
158+
159+
Global variable symbols may have a value (an \ref exprt), in which case it is
160+
used to initialise the global.
161+
162+
Method or function symbols may also have a value, in which case it is a
163+
\ref codet and gives the function definition. A method or function symbol
164+
without a value is a prototype (for example, it might be an `extern` declaration
165+
in C). A function symbol that has been converted to a GOTO function *may* be
166+
replaced with a special "compiled" value, but this varies from driver program to
167+
program -- at the time of writing, only \ref goto-cc does this.
168+
169+
Local variables' symbol values are always ignored;
170+
any initialiser must be explicitly assigned after they are instantiated by a
171+
declaration (\ref code_declt).
172+
173+
Symbol expressions (\ref symbol_exprt) and types (\ref symbol_typet) refer to
174+
symbols stored in a symbol table. Symbol expressions can be thought of as
175+
referring to the table for more detail about a symbol (for example, is it a
176+
local or a global variable, or perhaps a function?), and have a type which must
177+
match the type given in the symbol table. Symbol types can be thought of as
178+
shorthands or aliases for a type given in more detail in the symbol table, for
179+
example permitting a shorthand for a large structure type, as well as permitting
180+
the construction of expressions referring to that type before its full
181+
definition is known.
182+
183+
Note the implementation of \ref symbol_tablet is split into a base interface
184+
(\ref symbol_table_baset) and an implementation (\ref symbol_tablet). There is
185+
one alternate implementation (\ref journalling_symbol_tablet) which additionally
186+
maintains a log or journal of symbol creation, modification and deletions.
187+
188+
Namespaces (\ref namespacet) provide a read-only view on one or more symbol
189+
tables, and provide helper functions that aid accessing the table. A namespace
190+
may layer one or more symbol tables, in which case any lookup operation checks
191+
the 'top' symbol table before moving down the layers towards the 'bottom' symbol
192+
table, looking up the target symbol name in each successive table until one is
193+
found. Note class \ref multi_namespacet can layer arbitrary numbers of symbol
194+
tables, while for historical reasons \ref namespacet can layer up to two.
195+
196+
The namespace wrapper class also provides the \ref namespacet::follow
197+
operation, which dereferences a `symbol_typet` to retrieve the type it refers
198+
to, including following a symbol_typet which refers to another symbol which
199+
eventually refers to a 'real' type.
200+
201+
\subsubsection symbolt_lifetime Symbol lifetimes
202+
203+
Symbols with \ref symbolt::is_static_lifetime set are initialised before a
204+
program's entry-point is called and live until it ends. Such long-lived
205+
variables are used to implement globals, but also C's procedure-local static
206+
variables, which have restricted visiblity but the lifetime of a global.
207+
They may be marked dead using a \ref code_deadt instruction, but this does not
208+
render the variable inaccessible, it only indicates that the variable's current
209+
value will not be read without an intervening write.
210+
211+
Non-type, non-global symbols (those with \ref symbolt::is_static_lifetime and
212+
\ref symbolt::is_type not set) are local variables, and their lifespan
213+
description varies depending on context.
214+
215+
In symbol table function definitions (the values of function-typed symbols),
216+
local variables are created using a \ref code_declt instruction, and live until
217+
their enclosing \ref code_blockt ends (similar to the C idiom
218+
`{ int x; ... } // x lifespan ends`). By contrast in GOTO programs locals are
219+
declared using a DECL instruction and live until a DEAD instruction referring
220+
to the same symbol. Explicit DEAD instructions are always used rather than
221+
killing implicitly by exiting a function.
222+
223+
Multiple instances of the same local may be live at the same time by virtue of
224+
recursive function calls; a dead instruction or scope end always targets the
225+
most recently allocated instance.
226+
227+
Scoping rules are particular to source languages and are not enforced by
228+
CPROVER. For example, in standard C it is not possible to refer to a local
229+
variable across functions without using a pointer, but in some possible source
230+
languages this is permitted.
231+
232+
\subsubsection symbolt_details Symbol details
233+
234+
Symbols have:
235+
* A mode, which indicates the source language frontend responsible for creating
236+
them. This is mainly used in pretty-printing the symbol table, to indicate
237+
the appropriate language frontend to use rendering the symbol's value and/or
238+
type. For example, mode == ID_C == "C" indicates that \ref ansi_ct, the C
239+
front-end, should be used to pretty-print, which in turn delegates to
240+
\ref expr2ct.
241+
* A base-name and pretty-name, which are a short and user-friendly version of
242+
the symbol's definitive name respectively.
243+
* Several flags (see \ref symbolt for full details), including
244+
\ref symbolt::is_static_lifetime (is this a global variable symbol?),
245+
\ref symbolt::is_type (is this a type definition),
246+
\ref symbolt::is_thread_local (of a variable, are there implicitly instances
247+
of this variable per-thread?).
164248

165249
\subsection cmdlinet
166250

0 commit comments

Comments
 (0)