Reachability slicer: mark reachable code more precisely #2195

smowton · 2018-05-17T16:45:36Z

The reachability slicer currently uses a very simple graph walk, and in particular walks out of a function to all possible callers, regardless of whether we know the actual caller. This commit fixes that shortcoming by adding the callsite successor at the callsite, and tracking the fact that the callee's successor has already been taken care of in the graph search stack, thus allowing it to ignore the END_FUNCTION ->
allsite edges which are less precise.

Functions whose caller is genuinely unknown, such as the root function containing a reachability target
e.g. assert instruction) are treated as before, considering all possible callees. The backwards search is improved similarly to the forwards.

This needs some tests of course, but is ready to review in the meantime.

smowton · 2018-05-17T16:47:13Z

Tagging @thk123 for interest, as I believe deeptest has been using the reachability slicer to some degree?

martin-cs · 2018-05-17T18:41:09Z

Interesting... I ran into a similar effect when working on the history awareness for the abstract interpreter. Useful and worth doing.

tautschnig · 2018-05-17T21:50:56Z

src/goto-instrument/reachability_slicer.cpp

-  std::vector<cfgt::node_indext> reachable = cfg.get_reachable(src, false);
-  for(const auto index : reachable)
-    cfg[index].reaches_assertion = true;
+  std::vector<std::pair<cfgt::node_indext, bool>> stack;


Rather than having to introduce this comment I'd suggest you define a struct with members with meaningful names. That should make the code a lot easier to understand.

tautschnig · 2018-05-17T21:51:56Z

src/goto-instrument/reachability_slicer.cpp

-  std::vector<cfgt::node_indext> src = get_sources(is_threaded, criterion);
+  // Stack entries are pairs of node indices and a boolean flag that indicates
+  // whether the function's callsite is known, in which case it has already
+  // been placed on the stack and return sites are a no-op.


Copy&paste emphasizes my above point...

tautschnig · 2018-05-17T21:54:30Z

src/goto-instrument/reachability_slicer.cpp

+
+  while(!stack.empty())
+  {
+    auto index = stack.back().first;


const or you might as well spare us this one and right away do auto &node = cfg[stack.back().first];

tautschnig · 2018-05-17T22:00:51Z

src/goto-instrument/reachability_slicer.cpp

+      {
+        stack.emplace_back(edge.first, true);
+        stack.emplace_back(
+          cfg.entry_map[std::prev(node.PC)], callsite_is_known);


The use of std::prev is neither safe (as the Travis failure shows) nor sound: an instruction may have multiple incoming edges on the CFG. I believe you will need to go via the incoming edges on the CFG rather than just taking the predecessor in the list of goto program instructions.

It may, but that imprecision is exactly what I want to eliminate. If we entered this function by way of a particular callsite (indicated by callsite_is_known) then we should not further explore all possible callers, only the one that's actually reachable.

The problem was simply taking std::prev of the head of a function -- I've fixed this using std::next(pred_it) instead.

marek-trtik

The use of prev at line 85 might indeed be an issue. Otherwise, looks good to me.

Question: In what situations a Boolean flag (associated with a CFG node index) differs from result of checking whether node.PC->function is in the set of functions of get_sources(is_threaded, criterion);?

marek-trtik · 2018-05-18T10:25:11Z

src/goto-instrument/reachability_slicer.cpp

+    {
+      // Queue the instruction's natural successor (function head, or next
+      // instruction if the function is bodyless)
+      INVARIANT(node.out.size() == 1, "Call sites should have one successor");


Is this true also for GOTO obtained from a C program (e.g. a call via a pointer)?

cfgt requires that remove_function_pointer has run (it ignores function calls that don't call an ID_symbol)

smowton · 2018-05-18T10:47:56Z

@tautschnig changes applied, please re-review

smowton · 2018-05-18T10:51:52Z

@marek-trtik you could have !caller_is_known if you have something like:

unreachable() { h(); }
f2() { g(); }
f1() { g(); }
g() { h(); i(); }
h() {}
i() { assert(false); }

Here, if we start at the assert and walk backwards, caller_is_known is false stepping out of i, true
stepping into h() (so we do not walk out of h into unreachable), but then false again stepping out of g (we do step out into both f1 and f2)

marek-trtik · 2018-05-18T11:18:04Z

@smowton Thank you!

tautschnig · 2018-05-18T11:07:02Z

src/goto-instrument/reachability_slicer.cpp

+      {
+        // This is an end-of-function -> successor-of-callsite edge.
+        // Thus node.PC must have a predecessor instruction which is the calling
+        // instruction. Queue both the caller and the end of the callee.


Please add an invariant checking that indeed this predecessor is a function call. Essentially the middle sentence should not be part of a comment, but instead be spelled out as an invariant.

tautschnig · 2018-05-18T11:13:24Z

src/goto-instrument/reachability_slicer.cpp

+      {
+        // Skip this predecessor, unless this is a bodyless function, or we
+        // don't know who our callee was:
+        if(!caller_is_known || std::next(pred_node.PC) == node.PC)


This is a comparison of iterators possible across different lists of instructions. Use std::next(pred_node.PC)->function == node.PC->function instead.

Actually I need to test both (recursion), but I now check ->function first for this reason.

True, but do you actually need to care about recursion? The function will have been marked anyway.

Hmmm, true. I'm inclined to keep the "exact" check anyhow since it's clearer what is intended however.

tautschnig · 2018-05-18T11:14:48Z

src/goto-instrument/reachability_slicer.cpp

+  while(!stack.empty())
+  {
+    auto &node = cfg[stack.back().node_index];
+    auto caller_is_known = stack.back().caller_is_known;


tautschnig · 2018-05-18T11:15:16Z

src/goto-instrument/reachability_slicer.cpp

+  while(!stack.empty())
+  {
+    auto &node = cfg[stack.back().node_index];
+    auto caller_is_known = stack.back().caller_is_known;


tautschnig · 2018-05-18T11:15:48Z

src/goto-instrument/reachability_slicer.cpp

+      // Queue the instruction's natural successor (function head, or next
+      // instruction if the function is bodyless)
+      INVARIANT(node.out.size() == 1, "Call sites should have one successor");
+      auto successor_index = node.out.begin()->first;


tautschnig · 2018-05-18T11:16:56Z

src/goto-instrument/reachability_slicer.cpp

+      // have already taken care of the return site.
+      const auto &callee_head_node = cfg[successor_index];
+      auto callee_it = callee_head_node.PC;
+      if(callee_it != std::next(node.PC))


Might be across different lists, use callee_it->function != std::next(node.PC)->function instead.

tautschnig · 2018-05-18T11:17:19Z

src/goto-instrument/reachability_slicer.cpp

+        while(!callee_it->is_end_function())
+          ++callee_it;
+
+        if(cfg[cfg.entry_map[callee_it]].out.size() != 0)


Use .empty()

tautschnig · 2018-05-18T11:20:23Z

src/goto-instrument/reachability_slicer.cpp

+    {
+      for(const auto &edge : node.out)
+        stack.emplace_back(edge.first, caller_is_known);
+    }


I'd merge those two cases:

else if(!node.PC->is_end_function() || !caller_is_known) { for(const auto &edge : node.out) stack.emplace_back(edge.first, caller_is_known); }

I've merged them, though not quite like that.

smowton · 2018-05-18T12:42:00Z

@tautschnig changes applied.

tautschnig · 2018-05-19T09:58:36Z

Approved, but a regression test would be very nice to have.

The reachability slicer currently uses a very simple graph walk, and in particular walks out of a function to all possible callers, regardless of whether we know the actual caller. This commit fixes that shortcoming by adding the callsite successor *at the callsite*, and tracking the fact that the callee's successor has already been taken care of in the graph search stack, thus allowing it to ignore the END_FUNCTION -> callsite edges which are less precise. Functions whose caller is genuinely unknown, such as the root function containing a reachability target (e.g. assert instruction) are treated as before, considering all possible callees. The backwards search is improved similarly to the forwards.

smowton · 2018-05-21T09:32:52Z

@tautschnig added, and indeed found a bug by doing it.

tautschnig · 2018-05-21T10:01:03Z

added, and indeed found a bug by doing it.

Thanks! Now I'm curious what that bug was, because I clearly missed it while doing the reviews!

smowton · 2018-05-21T11:25:31Z

if(!cfg[cfg.entry_map[callee_it]].out.empty()) was incorrectly written as if(cfg[!cfg.entry_map[callee_it]].out.empty()), which happens to typecheck but is nonsense.

31ef182 Merge pull request diffblue#2208 from diffblue/cleanout-goto-programs 0e84d3e Merge pull request diffblue#2195 from smowton/smowton/feature/smarter-reachability-slicer c801126 Merge pull request diffblue#2209 from diffblue/travis-no-osx-g++ 3487bf7 Reachability slicer: mark reachable code more precisely 85e8451 Merge pull request diffblue#2066 from tautschnig/bv-endianness 5e5eafb Merge pull request diffblue#2191 from smowton/smowton/feature/java-fatal-assertions 8a68ab8 remove g++ build on OS X; this is identical to the clang++ build 48e5b25 Amend faulty long-string test 3f718ba Java: make null-pointer and similar checks fatal 821eb1d remove basic_blocks and goto_program_irep d87c2db Merge pull request diffblue#2203 from diffblue/typed-lookup d77dea3 strongly type the lookup() methods in namespacet 2382f27 Merge pull request diffblue#2193 from martin-cs/feature/correct-instruction-documentation c314272 Merge pull request diffblue#2181 from diffblue/format-expr-constants 514a0a5 Merge pull request diffblue#2167 from diffblue/parse_unwindset_options 0102452 move escape(s) to string_utils f1b6174 use unwindsett in goto-instrument dcc907a introduce unwindsett for unwinding limits 10aeae8 format_expr now does index, c_bool constants, string constants 92b92d6 Correct the documentation of ASSERT : it does not alter control flow. a11add8 Expand the documentation of the goto-program instructions. a06503b Merge pull request diffblue#2197 from tautschnig/fix-help 05e4bc3 Remove stray whitespace previously demanded by clang-format 3261f4d Fix help output of --generate-function-body-options 7c67b23 Merge pull request diffblue#2110 from tautschnig/type-mismatch-exception 18fb262 Merge pull request diffblue#2025 from tautschnig/stack-depth-fix 9191b6a Merge pull request diffblue#2199 from tautschnig/simplifier-typing f99e631 Merge pull request diffblue#2198 from tautschnig/fix-test-desc 1a79a11 stack depth instrumentation: __CPROVER_initialize may be empty a7690ba Merge pull request diffblue#2185 from smowton/smowton/fix/tolerate-ts18661-types fc02e8f Restore returns before writing the simplified binary 49333eb Make restore_returns handle simplified programs 46f7987 Fix perl regular expressioons in regression test descriptions 9fe432b Merge pull request diffblue#1899 from danpoe/sharing-map-catch-unit-tests 9cc6aae Merge pull request diffblue#2081 from hannes-steffenhagen-diffblue/floating_point_simplificiation da19abe Tolerate TS 18661 (_Float32 et al) types a055454 Try all rounding modes when domain is unknown 5f7bc15 Add float support to constant propagator domain 3dc2244 Move adjust_float_expressions to goto-programs 06d8bea Merge pull request diffblue#2187 from diffblue/move-convert-nondet 6d8c3fa Merge pull request diffblue#2189 from thk123/bugfix/json-parser-restart 2ad157f Merge pull request diffblue#2186 from smowton/smowton/fix/call-graph-uninit-field cd54ad7 Corrected state persistence in the json parser 4447be0 Fix uninitialised collect_callsites field in call_grapht ead0aa3 Merge pull request diffblue#2188 from smowton/smowton/fix/init-string-types-without-refine-strings 57988fc Fix String type initialisation when --refine-strings is not active 6a76aff Move convert_nondet to java_bytecode ac6eb21 Merge pull request diffblue#1858 from danpoe/dependence-graph-fix 10b8b09 Merge pull request diffblue#2011 from thomasspriggs/final_classes a154593 Merge pull request diffblue#2087 from danpoe/feature/small-map 7002909 More dependence graph tests 66263ea Make dependence graph merge() return true when abstract state changes 3aa6dca Control dependency computation fix a408423 Simplified state merging in the dependence graph 0315816 Merge pull request diffblue#2180 from thomasspriggs/json_id2string 8941567 Merge pull request diffblue#2124 from smowton/smowton/feature/fallback-function-provider cd04e70 JBMC: use simple method stubbing pass a6b2cda Add Java simple method stubbing pass ec1127c Lazy goto model: hook for driver program to replace or stub functions b6a4382 Remove `id2string` from inside calls to the `json_stringt` constructor. b673ebb Add storage of final modifier status of java classes in `java_class_typet`. a2ad909 Small map 808dade Provide suitable diagnostics for equality-without-matching-types 89cf6fe Throw appropriate exceptions when converting constraints 2ae66c2 Produce a proper error message when catching a std::runtime_error at top level e7b3ade Workaround for Visual Studio loss of CV qualifier bug 1f661f5 Move sharing map and sharing node unit tests to util 47463a3 Use std::size_t instead of unsigned in the sharing map 3e22217 Sharing map documentation e54f740 Fix sharing map compiler warnings bcc489c Move sharing map unit tests to catch 34114b8 Use a specialised endianness map for flattening git-subtree-dir: cbmc git-subtree-split: 31ef182

smowton requested review from chrisr-diffblue, martin-cs and peterschrammel as code owners May 17, 2018 16:45

smowton requested review from thk123, marek-trtik and NathanJPhillips May 17, 2018 16:46

tautschnig requested changes May 17, 2018

View reviewed changes

marek-trtik approved these changes May 18, 2018

View reviewed changes

smowton force-pushed the smowton/feature/smarter-reachability-slicer branch from 9622910 to 7512c70 Compare May 18, 2018 10:47

tautschnig reviewed May 18, 2018

View reviewed changes

smowton force-pushed the smowton/feature/smarter-reachability-slicer branch from 7512c70 to ebe0d79 Compare May 18, 2018 12:41

smowton assigned tautschnig May 19, 2018

tautschnig approved these changes May 19, 2018

View reviewed changes

tautschnig assigned smowton and unassigned tautschnig May 19, 2018

smowton force-pushed the smowton/feature/smarter-reachability-slicer branch from ebe0d79 to 3487bf7 Compare May 21, 2018 09:32

peterschrammel approved these changes May 21, 2018

View reviewed changes

smowton merged commit 0e84d3e into diffblue:develop May 21, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reachability slicer: mark reachable code more precisely #2195

Reachability slicer: mark reachable code more precisely #2195

smowton commented May 17, 2018 •

edited

Loading

smowton commented May 17, 2018

martin-cs commented May 17, 2018

tautschnig May 17, 2018

tautschnig May 17, 2018

tautschnig May 17, 2018

tautschnig May 17, 2018

smowton May 18, 2018

marek-trtik left a comment

marek-trtik May 18, 2018

smowton May 18, 2018

smowton commented May 18, 2018

smowton commented May 18, 2018 •

edited

Loading

marek-trtik commented May 18, 2018

tautschnig May 18, 2018

tautschnig May 18, 2018

smowton May 18, 2018

tautschnig May 18, 2018

smowton May 18, 2018

tautschnig May 18, 2018

tautschnig May 18, 2018

tautschnig May 18, 2018

tautschnig May 18, 2018

tautschnig May 18, 2018

tautschnig May 18, 2018

smowton May 18, 2018

smowton commented May 18, 2018

tautschnig commented May 19, 2018

smowton commented May 21, 2018

tautschnig commented May 21, 2018

smowton commented May 21, 2018

Reachability slicer: mark reachable code more precisely #2195

Reachability slicer: mark reachable code more precisely #2195

Conversation

smowton commented May 17, 2018 • edited Loading

smowton commented May 17, 2018

martin-cs commented May 17, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

marek-trtik left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

smowton commented May 18, 2018

smowton commented May 18, 2018 • edited Loading

marek-trtik commented May 18, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

smowton commented May 18, 2018

tautschnig commented May 19, 2018

smowton commented May 21, 2018

tautschnig commented May 21, 2018

smowton commented May 21, 2018

smowton commented May 17, 2018 •

edited

Loading

smowton commented May 18, 2018 •

edited

Loading