Skip to content

DOC-56: An overview of grapht and dependence_grapht in corresponding README.md. #4095

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 6, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
60 changes: 59 additions & 1 deletion src/analyses/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,65 @@ To be documented.

\subsection analyses-dependence-graph Data- and control-dependence analysis (dependence_grapht)

To be documented.
### Dependence graph

Implemented in `src/analyses/dependence_graph.h(cpp)`. It is a graph and an
abstract interpreter at the same time. The abstract interpretation nature
allows a dependence graph to [build itself](#Construction)
(the graph) from a given GOTO program.

A dependence graph extends the class `grapht` with `dep_nodet` as the type of
nodes (see `src/util/graph.h` for more details about
[graph representation]("../util/README.md")).
The `dep_nodet` extends `graph_nodet<dep_edget>` with an iterator to a GOTO
program instruction. It means that each graph node corresponds to a particular

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is "It" here?

Copy link
Contributor Author

@marek-trtik marek-trtik Feb 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The information of the previous sentence - the fact that each node stores an iterator to a GOTO instruction.

program instruction. A labelled edge `(u, v)` of a dependence graph expresses
a dependency of the program instruction corresponding to node `u` on the program
instruction corresponding to node `v`. The label of the edge (data of the
type `dep_edget` attached to the edge) denotes the kind of dependency. It can be
`control-dependency`, `data-dependency`, or both.

#### Control dependency

An instruction `j` corresponding to node `v` is control-dependent on instruction
`i` corresponding to node `u` if and only if `j` post-dominates at least one
but not all successors instructions of `i`.

An instruction `j` post-dominates an instruction `i` if and only if each
execution path going through `i` eventually reaches `j`.

Post-dominators analysis is implemented in `src/analyses/cfg_dominators.h(cpp)`.

#### Data dependency

The instruction `j` corresponding to node `v` is data-dependent on the
instruction `i` corresponding to node `u` if and only if `j` may read data from

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the

the memory location defined (i.e. written) by `i`.

The reaching definitions analysis is used together with the read-write ranges
analysis to check whether one instruction may read data writen by another
instruction. For more details see `src/analyses/reaching_definitions.h(cpp)` and
`src/analyses/goto_rw.h(cpp)`.

#### Construction

The dependence graph extends the standard abstract interpreter class `ait`
with post-dominators analysis and reaching definitions analysis. The domain of
the abstract interpreter is defined in the class `dep_graph_domaint`.

For each instruction `i` an instance of `dep_graph_domaint` associated with `i`
is created. The instance stores a set `control_deps` of program
instructions the instruction `i` depends on via control-dependency, and a set
`data_deps` of program instructions the instruction `i` depends on via
data-dependency. These sets are updated (increased) during the computation,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this last sentence is too complicated, but I can't figure out how to reword it. Does anyone have a good suggestion for this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to make clear we are talking about an abstract domain, and we have domain elements. Each domain element is made up of sets of control- and data dependencies (control_deps and data_deps, respectively).

until a fix-point is reached.

The construction of a dependence graph is started by calling its `initialize`
method and then, once a fix-point is reached by the abstract interpreter, the
method `finalize` converts data in the interpreter's domain (i.e. from
`dep_graph_domaint` instances) into edges of the graph. Nodes of the graph are
created during the run of the abstract interpreter; they are linked to the
corresponding program instructions.

\subsection analyses-dirtyt Address-taken lvalue analysis (dirtyt)

Expand Down
45 changes: 45 additions & 0 deletions src/util/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -383,3 +383,48 @@ _not_ be extended - but it may be helpful to be aware of where this happens.
These are in the process of being removed - no new output should go
via `std::cout` or `std::cerr`, but should instead use the
\ref messaget and \ref message_handlert infrastructure.


\subsection Graph
### Graph

Implemented in `src/util/graph.h` as `grapht` class. The `grapht` class
represents a directed graph. However, an undirected graph can be emulated by
inserting for each edge (u, v) also (v, u). A multi-graph is not supported
though, because parallel edges are not allowed between vertices.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this can be simplified to "The grapht class represents a directed graph"; I don't think there's normally an expectation that any graph representation can also represent a multi graph directly, and a multi graph can be embedded in a directed graph in a similar way to how undirected graphs can be represented, so I don't necessarily think it's worth calling this out as a deficiency.


#### Data representation

A graph is defined by a template class `grapht<N>`, where `N` is the type of the
nodes of the graph. The class `grapht` stores the nodes in a vector. A user of
the graph is supposed to access the nodes via their indices in the vector. The
node type `N` must meet these requirements:
- It must be default constructible; the common way how to insert a node to the
graph is to call method `add_node` which pushes a new default-constructed
instance of `N` to the end of the vector and returns its index.
Then `operator[]` can be used to obtain a reference to the pushed instance and
set its content as desired.
- It must define a type `edget`, representing type of data attached to any edge

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why we do it this way; I can't see any obvious reason why the edge type necessarily follows from the node type. It seems the type parameter here doesn't specify the node type, but actually both the node type and the edge type? Maybe it would be worth refactoring the graph to work like this grapht<graph_configt> { using node_t = typename graph_configt::node_t; using edge_t = graph_configt::edge_t; // ... }? Just a thought

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed we just define edges and node types at the same time. But of course this can be improved upon, and improvements are always welcome :-)

of the graph; in case edges should not have any data attached, then one can
use a predefined class `empty_edget`. The type must be default-constructible.
- It must define a type `edgets` as an associative container compatible with the
type `std::map<std::size_t, edget>`.
- It must define publicly accessible members of the type `edgest`, named `in`
and `out`. The members represent edges in the graph. Namely, if `u` is an
index of some node in the graph and `v` is a key in the map `in` (resp. `out`)
of that node, then `(v, u)` (resp. `(u, v)`) represent an edge in the graph,
with attached data `in[v]` (resp. `out[v]`).
- It must define methods `add_in`, `erase_in`, and `add_out`, `erase_out` for
insertion and removal of values to and from the maps `in` and `out`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why we bother having this stuff on the node type at all if all nodes are required to implement them in the same way anyway

- It must define a method `pretty` converting the node to a "pretty" string.

One can use a predefined template class `graph_nodet<E>` as the template

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is E here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

parameter `N` of the graph. The template parameter `E` allows to define the type
`edget`.

#### Graph algorithms

The graph implementation comes with several graph algorithms. We describe each
of them in following paragraphs.

TODO!