Aggressive slicer #1587

polgreen · 2017-11-14T10:16:44Z

This pull request is dependent on:
#1586
#1509

By default, the aggressive slicer removes the bodies of all functions that are not on the shortest path in the call graph between the start function and the property. If no property is specified, it preserves the shortest path to each property. It is parameterisable:

--call-depth N preserves all functions within N function calls of the shortest path
--preserve-function forces the aggressive slicer to preserve the named function
--preserve-function containing forces the aggressive slicer to preserve all functions whose names contain the given string
--preserve-all-direct-paths forces the aggressive slicer to preserve all functions on direct paths to the property. This option must be used with a specified property.

The aggressive slicer is designed to be used in conjunction with:
#1585 - to over-approximate any function bodies we have removed
#1566 - to block functions we do not wish to be included in paths

The results are not sound. The slicer may produce spurious traces, because the removed function bodies are over-approximated, and it may miss traces, either because the removed function bodies write to global variables or because "preserve-all-direct-paths" is not used. However, it is designed for use with code bases that are otherwise far too big to run cbmc on.

The parameterisation is intended to be used as a means of providing engineer feedback.

Without the "digraph call_graph{" this is not parsable dot. This change makes the output dot consistent with the previous output_dot function in call_grapht
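To illustrate why the header matters, here is a minimal sketch of dot output wrapped in a digraph declaration. It does not use grapht; `call_mapt` and `call_graph_to_dot` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a call graph: caller name -> callee names.
using call_mapt = std::map<std::string, std::vector<std::string>>;

// Renders the call graph as dot. The "digraph call_graph{" header and
// the closing brace are what make the edge list parsable by dot tools.
std::string call_graph_to_dot(const call_mapt &calls)
{
  std::ostringstream out;
  out << "digraph call_graph{\n";
  for(const auto &entry : calls)
  {
    for(const std::string &callee : entry.second)
      out << "  \"" << entry.first << "\" -> \"" << callee << "\";\n";
  }
  out << "}\n";
  return out.str();
}
```

Without the first and last lines of output, the result is a bare list of edges, which dot rejects.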

…it from grapht<T>

I have rewritten call_grapht to inherit from grapht, because I think there should be fewer graph classes in CBMC. Instead of constructing the entire call graph, the reachable call graph class constructs only the reachable call graph. There is some duplication between the call_graph class and the show_call_sequences function. However, the show_call_sequences function outputs the graph on the fly instead of constructing it, which doesn't allow for further use of the graph.

Michael's comments on CR-766007 highlighted things that should be changed in grapht.

Returns the list of functions on the shortest path from a src function to a destination function on the function call graph.
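As a rough illustration of the idea, a shortest path over a call graph can be found with a breadth-first search. This is a sketch under assumptions, not the code in this pull request: `call_mapt` and `shortest_function_path` are hypothetical names, and the graph is a plain map rather than call_grapht.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical stand-in for a call graph: caller name -> callee names.
using call_mapt = std::map<std::string, std::vector<std::string>>;

// Breadth-first search from src. Returns the functions on one shortest
// path from src to dest (both inclusive), or an empty vector if dest
// is not reachable from src.
std::vector<std::string> shortest_function_path(
  const call_mapt &calls,
  const std::string &src,
  const std::string &dest)
{
  std::map<std::string, std::string> predecessor;
  std::queue<std::string> worklist;
  predecessor[src] = src;
  worklist.push(src);

  while(!worklist.empty())
  {
    const std::string current = worklist.front();
    worklist.pop();
    if(current == dest)
      break;

    const auto it = calls.find(current);
    if(it == calls.end())
      continue;
    for(const std::string &callee : it->second)
    {
      // emplace only succeeds the first time a function is seen
      if(predecessor.emplace(callee, current).second)
        worklist.push(callee);
    }
  }

  if(predecessor.find(dest) == predecessor.end())
    return {};

  // Walk the predecessor links back from dest to src, then reverse.
  std::vector<std::string> path;
  for(std::string f = dest; f != src; f = predecessor.at(f))
    path.push_back(f);
  path.push_back(src);
  std::reverse(path.begin(), path.end());
  return path;
}
```

If several shortest paths exist, this sketch returns whichever one the search finds first.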

Automatically removes the function bodies of any function not on the shortest path in the function call graph or reachable within N function calls of the shortest path
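The removal step can be pictured as follows. This is only a sketch: `function_mapt` and `remove_unpreserved_bodies` are hypothetical, and a function body is modelled as a list of strings rather than a goto program.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the program: function name -> body, where
// a body is just a list of instruction strings.
using function_mapt = std::map<std::string, std::vector<std::string>>;

// Empties the body of every function that is not in `preserved` and
// returns how many non-empty bodies were removed. The functions
// themselves stay in the map, so calls to them still resolve.
std::size_t remove_unpreserved_bodies(
  function_mapt &functions,
  const std::set<std::string> &preserved)
{
  std::size_t removed = 0;
  for(auto &entry : functions)
  {
    if(preserved.count(entry.first) != 0 || entry.second.empty())
      continue;
    entry.second.clear();
    ++removed;
  }
  return removed;
}
```

Here `preserved` would hold the functions on the shortest path plus those within N calls of it.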

…obal inits

Adds call_grapht::reachable_within_n_steps. The function is passed an unordered set of function names and a number of steps, N, and adds to the set any function reachable within N function calls of the original set of functions.
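A bounded search of this kind might look like the sketch below. It is not the call_grapht member itself: `call_mapt` and `reachable_within_n_steps_sketch` are hypothetical names, and std::set is used in place of an unordered set for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call graph: caller name -> callee names.
using call_mapt = std::map<std::string, std::vector<std::string>>;

// Adds to `functions` every function reachable within `steps` function
// calls of a function that is already in the set.
void reachable_within_n_steps_sketch(
  const call_mapt &calls,
  std::set<std::string> &functions,
  std::size_t steps)
{
  // The frontier holds the functions added in the previous round.
  std::set<std::string> frontier = functions;
  for(std::size_t i = 0; i < steps && !frontier.empty(); ++i)
  {
    std::set<std::string> next;
    for(const std::string &f : frontier)
    {
      const auto it = calls.find(f);
      if(it == calls.end())
        continue;
      for(const std::string &callee : it->second)
      {
        // Only newly discovered functions are expanded next round.
        if(functions.insert(callee).second)
          next.insert(callee);
      }
    }
    frontier = std::move(next);
  }
}
```

Each round extends the set by one call level, so the loop runs at most N times.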

returns the source_locationt for a property irep_idt

Extends the aggressive slicer with an option --preserve-all-direct-paths. This preserves all functions on direct, loop-free paths from the start function to the function containing the property, and removes the function bodies of all other functions. This significantly reduces the size of the binary whilst ensuring that no counterexamples (possible with the given unwinding limit) are missed, although the counterexamples produced may still be spurious.
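One way to picture "all functions on direct, loop-free paths" is a depth-first enumeration that never revisits a function already on the current path. The sketch below is an assumption about the idea, not the implementation in this pull request. All names are hypothetical, and the enumeration can take exponential time on large graphs.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a call graph: caller name -> callee names.
using call_mapt = std::map<std::string, std::vector<std::string>>;

// Returns true if some loop-free path from `current` reaches `target`
// without passing through a function in `on_path`. Every function on
// such a path is added to `preserved`.
bool collect_direct_paths(
  const call_mapt &calls,
  const std::string &current,
  const std::string &target,
  std::set<std::string> &on_path,
  std::set<std::string> &preserved)
{
  if(current == target)
  {
    preserved.insert(current);
    return true;
  }
  on_path.insert(current);
  bool reaches = false;
  const auto it = calls.find(current);
  if(it != calls.end())
  {
    for(const std::string &callee : it->second)
    {
      if(on_path.count(callee) != 0)
        continue; // revisiting a function would make the path loop
      if(collect_direct_paths(calls, callee, target, on_path, preserved))
        reaches = true;
    }
  }
  on_path.erase(current);
  if(reaches)
    preserved.insert(current);
  return reaches;
}

// Convenience wrapper: the set of functions on any loop-free path
// from `start` to `target`.
std::set<std::string> functions_on_direct_paths(
  const call_mapt &calls,
  const std::string &start,
  const std::string &target)
{
  std::set<std::string> on_path, preserved;
  collect_direct_paths(calls, start, target, on_path, preserved);
  return preserved;
}
```

Functions that only appear on cycles or dead ends, such as helpers that never lead to the property, are left out of the result and would have their bodies removed.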

polgreen · 2017-11-14T11:44:58Z

The clang format patch for this pull request would change a lot of code that I haven't touched, e.g., the GOTO_INSTRUMENT_OPTIONS. Should I apply this patch or not? Is there a way to selectively run clang-format on only sections of the file that are changed?

smowton · 2017-11-14T13:52:45Z

AFAIK no, but there is such a facility for our own linter -- see scripts/travis_lint.sh, which uses run_diff.sh to achieve this by running the linter per file and then filtering by git-diff. That script could be adapted to filter clang-format warnings instead of linter warnings? It wouldn't filter the patch produced by git-clang-format, but rather the line-by-line warnings produced by clang-format.

martin-cs · 2017-11-15T18:07:28Z

@polgreen clang format should only look at the code in the diff. If not then something is wrong. Please don't fix it by just applying the diff it says!

martin-cs · 2017-11-17T14:19:12Z

A few general observations:

This sounds like an interesting bit of functionality but one that requires a combination of options / passes to use as intended. Please could we have an addition to the manual that describes it.
No chance of a regression test or two to go with it, is there?
As a general rule, new bits of code are easier to merge if we can be sure that they are independent of existing functionality. I suspect that to be the case here; can you help with the argument of why?

martin-cs

Should be OK.

martin-cs · 2017-11-17T14:21:31Z

src/analyses/call_graph.cpp

-}
-
-call_grapht::call_grapht(const goto_functionst &goto_functions)
-{


Why remove this constructor?

Ah, this is left over from call graph changes. Will revert

martin-cs · 2017-11-17T14:23:43Z

src/analyses/call_graph.cpp

+void call_grapht::output_xml(std::ostream &out) const
+{
+  for(node_indext n = 0; n < nodes.size(); n++)
+    output_xml_node(out, n);
 }


Would asking for an output_json be unreasonable?

Not unreasonable, it should also be added to grapht, which doesn't have one either.

That'd be lovely.

martin-cs · 2017-11-17T14:29:20Z

src/analyses/call_graph.cpp

  }
+
+  return result;
 }


How much work would it be to move these algorithms out into util/graph.h? It seems like this might not be the only place we want to do a shortest path or bounded depth search.

There is already a branch by Smowton that combines my call graph pull request and his, so I think I should probably incorporate this change into that.

OK. If the two of you get together on this I'd be happy to review.

martin-cs · 2017-11-17T14:33:47Z

src/analyses/reachable_call_graph.cpp

+    }
+  }
+  INVARIANT(!worklist.empty(), "destination function not found");
+


This feels a little more like a precondition or possibly a viable user error and thus not an invariant.
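To make the distinction concrete: a missing destination function can come from user input, so it can be reported as an error the caller handles, rather than checked as an internal invariant. The sketch below uses a standard exception as a stand-in and does not use CBMC's own macros; `check_destination_exists` is a hypothetical helper.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical check: reports an unknown destination function as a
// user error that the caller can catch and print, instead of aborting
// on an internal consistency check.
void check_destination_exists(
  const std::set<std::string> &known_functions,
  const std::string &destination)
{
  if(known_functions.count(destination) == 0)
  {
    throw std::invalid_argument(
      "destination function not found: " + destination);
  }
}
```

An invariant, by contrast, would be reserved for conditions that can only fail because of a bug in the tool itself.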

martin-cs · 2017-11-17T14:34:07Z

src/analyses/reachable_call_graph.h

+#include "call_graph.h"
+
+class reachable_call_grapht: public call_grapht
+{


Seems a reasonable use of inheritance.

martin-cs · 2017-11-17T14:35:32Z

src/goto-instrument/goto_instrument_parse_options.cpp

      else
-        call_graph.output(std::cout);
+        call_graph.output_dot(std::cout);



In some ways it would be nice to keep output as text, output_dot as dot (and requiring the dot option), output_xml as xml (and requiring the xml option) and so on.

martin-cs · 2017-11-17T14:37:12Z

src/goto-instrument/goto_instrument_parse_options.h

-
+  "(aggressive-slice)" \
+  "(call-depth):" \
+  "(harness-generator):" \


I may have missed it, but what does this option do?

Ah, that shouldn't be in this pull request.

martin-cs · 2017-11-17T14:38:25Z

src/goto-instrument/remove_function.h

+/// and those that contain a given irep_idt snippet
+/// If no properties are set by the user, we preserve all functions on
+/// the shortest paths to each property.
+class aggressive_slicert


Maybe this should have its own header / implementation file.

martin-cs · 2017-11-17T14:40:05Z

src/goto-programs/slice_global_inits.cpp

-  }
-  while(!worklist.empty());
+  std::unordered_set<irep_idt, irep_id_hash> functions_reached=
+      call_graph.reachable_functions(entry_point);



Are there other uses of the callgraph like this? It seems finding reachable functions might be a common thing to do.

As far as I can find, this is the only bit of code that computes this using a graph.

show_call_sequences outputs the function calls, but does it on the fly without constructing the graph.

find_used_functions obviously needs to compute the reachable functions, but again does it on the fly without constructing a graph.

OK. I guess I was suggesting that it should become an API and maybe in time these / others could be encouraged to use it.

polgreen · 2017-11-20T17:21:37Z

@martin-cs

Yes, I will write something for the manual. I have emailed you about regression tests; I'm not sure how to write tests for goto-instrument. I have some tests ready, but I don't know how to turn them into something that fits in the test framework.

RE independence: do you mean that it doesn't duplicate functionality of existing features? I implemented this because none of the existing features would scale to the code I was looking at; we tried various combinations of the reachability slicer, the full-slicer, the global-init slicer, and nothing produced a binary small enough for cbmc to analyse.

Or do you mean that it will work independently of changes to existing functionality of CBMC? I have implemented this so that, when called, goto-instrument does a slice of global-inits before performing the aggressive-slice and a reachability-slice afterwards. I could remove that feature, but it wouldn't really make sense to use the aggressive slicer without running the reachability slicer afterwards, as it creates a lot of unreachable code.

polgreen · 2017-11-21T14:24:42Z

There seem to be two versions of the manual:
http://www.cprover.org/cbmc/doc/manual.pdf - the old one, but this does actually mention slicing.
http://www.cprover.org/cprover-manual/ - I guess that this is the up to date version, but it doesn't have any documentation of the existing slicing options

I assume I should add to the second one by editing https://github.com/diffblue/cbmc/blob/develop/doc/cprover-manual.md?

martin-cs · 2017-11-21T15:48:26Z

http://www.cprover.org/cprover-manual/ is the correct one, which I believe to be generated from
https://github.com/diffblue/cbmc/blob/develop/doc/cprover-manual.md by processes unknown to me.

Regression tests : discussed offline.

Independence : what I mean is "does this patch set alter the functionality of existing code or only add to it". If it is clearly only new functionality (which, one would hope, is different and distinct to existing things) then it is easier to review as we don't have to worry about breaking things.

polgreen added 11 commits November 14, 2017 10:06

grapht::output_dot needs digraph declaration

eea9fa0

Without the "digraph call_graph{" this is not parsable dot. This change makes the output dot consistent with the previous output_dot function in call_grapht

CR fixes for grapht

5682636

Michael's comments on CR-766007 highlighted things that should be changed in grapht.

return shortest path from function call graph

ab074ea

Returns the list of functions on the shortest path from a src function to a destination function on the function call graph.

Documentation of call graph and reachable call graph

d351506

white space call graph

13ad1b0

Aggressive slicer: automatically stub functions not on shortest path

8fb9793

Automatically removes the function bodies of any function not on the shortest path in the function call graph or reachable within N function calls of the shortest path

Aggressive slice should also do a reachability slice and a slice of g…

50aed74

…obal inits

Find source location from a property irep_idt

1eeacb0

returns the source_locationt for a property irep_idt

polgreen requested review from kroening, martin-cs, peterschrammel, smowton, tautschnig and thk123 as code owners November 14, 2017 10:16

martin-cs requested changes Nov 17, 2017


polgreen added 3 commits November 21, 2017 11:45

linter fix: line length

6ce0cd8

Aggressive slicer regression tests

fbc0478

re-introduce constructor

35ee9e4

polgreen requested a review from chrisr-diffblue as a code owner November 21, 2017 14:25

tautschnig assigned polgreen Nov 30, 2017

This was referenced Jun 20, 2018

Disconnect unreachable nodes in a graph #2369

Merged

Depth limited search #2381

Merged

Aggressive slicer v2 #2385

Merged

polgreen closed this Jul 24, 2018
