|
| 1 | +# Central Data Structures |
| 2 | + |
| 3 | +The following is some light technical documentation of the major data structures |
| 4 | +represented in the input transformation to Intermediate Representation (IR) inside |
| 5 | +CBMC and the assorted CProver tools. |
| 6 | + |
| 7 | +## goto_modelt |
| 8 | + |
| 9 | +The `goto_modelt` is the main data structure that CBMC (and the other tools) use to |
| 10 | +represent GOTO-IR (the `GOTO` Intermediate Representation). |
| 11 | + |
| 12 | +A `goto_modelt` is effectively a product type, consisting of: |
| 13 | + |
| 14 | +* A list of GOTO functions (pseudocode: `type goto_functionst = list<goto_functiont>`) |
| 15 | +* A symbol table containing symbol references for the symbols contained in the |
| 16 | + GOTO functions (pseudocode: `type symbol_tablet = map<identifier, symbolt>`). |
| 17 | + |
| 18 | +The abstract interface of `goto_modelt` is outlined in the file |
| 19 | +[`src/goto-programs/abstract_goto_model.h`](../../src/goto-programs/abstract_goto_model.h). |
| 20 | + |
| 21 | +## goto_functiont |
| 22 | + |
| 23 | +A `goto_functiont` is also a product type. It's designed to represent a function |
| 24 | +at the IR level, and effectively it's the following ADT (in pseudocode): |
| 25 | + |
| 26 | +```js |
| 27 | +type goto_functiont { |
| 28 | + goto_programt body |
| 29 | + list<identifiers> parameters |
| 30 | +} |
| 31 | +``` |
| 32 | + |
| 33 | +Of the two types listed above, the `parameters` one should be self-documenting, |
| 34 | +(the values of these are later looked up in the symbol table), so we're going to |
| 35 | +focus next on the type `goto_programt` |
| 36 | + |
| 37 | +## goto_programt |
| 38 | + |
| 39 | +A `goto_programt` is a container of GOTO-IR instructions. In pseudocode, it would |
| 40 | +look like `type goto_programt = list<goto_instructiont>`. |
| 41 | + |
| 42 | +An instruction (`goto_instructiont`) is another product type, describing a GOTO level |
| 43 | +instruction with the following 3 component subtypes, again in pseudocode: |
| 44 | + |
| 45 | +```js |
| 46 | +type goto_instructiont { |
| 47 | + instruction_type instr_type |
| 48 | + instruction statement |
| 49 | + guard boolean_expr |
| 50 | +} |
| 51 | +``` |
| 52 | + |
| 53 | +The above subtypes are just illustrative, so we will provide some extra explanation for |
| 54 | +these: |
| 55 | + |
| 56 | +* The `instruction_type` above corresponds to the `goto_program_instruction_typet` type |
| 57 | + listed in [`goto_program.h`](../../src/goto-programs/goto_program.h) |
| 58 | + * For illustration purposes, some instruction types are `assign`, `function_call`, `return`, |
| 59 | + etc. |
| 60 | +* The `instruction` is a statement represented by a `goto_instruction_codet`. |
| 61 | + * A `goto_instruction_codet` is a `codet` (a data structure broadly representing a statement |
| 62 | + inside CBMC) that contains the actual GOTO-IR instruction. |
| 63 | + * You can distinguish different statements by inspecting the `irep` element `ID_statement`. |
| 64 | +* The `guard` is an `exprt` (A CBMC data structure broadly representing an expression) |
| 65 | + that is expected to have type `bool`. |
| 66 | + * This is optional - not every instruction is expected to have a `guard` associated with it. |
| 67 | + |
| 68 | +## source_locationt |
| 69 | + |
| 70 | +Another important data structure that needs to be discussed at this point is |
| 71 | +`source_locationt`. |
| 72 | + |
| 73 | +This is an `irept`. `irep`s are the central data structure that model most entities inside |
| 74 | +CBMC and the assorted tools - effectively a node/map like data structure that forms a hierachical |
| 75 | +tree that ends up modeling graphs like ASTs, CFGs, etc. (This will be further discussed in |
| 76 | +a dedicated page). |
| 77 | + |
| 78 | +`source_locationt` are attached into various `exprt`s (the data structure representing |
| 79 | +various expressions, usually the result of some early processing, e.g. the result of the |
| 80 | +frontend parsing a file). |
| 81 | + |
| 82 | +This means that `codet`s, representing GOTO-IR instructions *also* have a `source_locationt` |
| 83 | +attached to them, by virtue of inheriting from `exprt`. |
| 84 | + |
| 85 | +For the most part, `source_locationt` is something that is only being used when we print |
| 86 | +various nodes (for property listing, counterexample/trace printing, etc). |
| 87 | + |
| 88 | +It might be possible that a specific source location might point to a CBMC instrumentation |
| 89 | +primitive (which might be reported as a `built-in-addition`) or there might even be no-source |
| 90 | +location (because it might be part of harnesses generated as an example, that have no presence |
| 91 | +in the user code). |
0 commit comments