-
Notifications
You must be signed in to change notification settings - Fork 273
Aggressive slicer #1587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Aggressive slicer #1587
Conversation
Without the "digraph call_graph{" this is not parsable dot. This change makes the output dot consistent with the previous output_dot function in call_grapht
…it from grapht<T> I have rewritten call_grapht to inherit from grapht, because I think there should be fewer graph classes in CBMC. Instead of constructing the entire call graph, the reachable call graph class constructs only the reachable call graph. There is some duplication between the call_graph class and the show_call_sequences function, however the show_call_sequences function outputs the graph on the fly, instead of constructing it, which doesn't allow for further use of the graph
Michael's comments on CR-766007 highlighted things that should be changed in grapht.
Returns the list of functions on the shortest path from a src function to a destination function on the function call graph.
Automatically removes the function bodies of any function not on the shortest path in the function call graph or reachable within N function calls of the shortest path
Adds call_grapht::reachable_within_n_steps. Function is passed an unordered set of function names and a number of steps, N,, and adds any function reachable within N function calls from the original set of functions, to the set of function names
returns the source_locationt for a property irep_idt
Extends the aggressive slicer with an option --preserve-all-direct-paths This preserves all functions on direct, loop free, paths from the start function to the function containing the property, and removes the function bodies of all other functions. This significantly reduces the size of the binary whilst ensuring that no counter-examples (possible with the given unwinding limit) are missed (although the counterexamples produced may still be spurious)
The clang format patch for this pull request would change a lot of code that I haven't touched, e.g., the GOTO_INSTRUMENT_OPTIONS. Should I apply this patch or not? Is there a way to selectively run clang-format on only sections of the file that are changed? |
AFAIK no, but there is such a facility for our own linter -- see |
@polgreen clang format should only look at the code in the diff. If not then something is wrong. Please don't fix it by just applying the diff it says! |
A few general observations:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be OK.
src/analyses/call_graph.cpp
Outdated
} | ||
|
||
call_grapht::call_grapht(const goto_functionst &goto_functions) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove this constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this is left over from call graph changes. Will revert
void call_grapht::output_xml(std::ostream &out) const | ||
{ | ||
for(node_indext n = 0; n < nodes.size(); n++) | ||
output_xml_node(out, n); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would asking for an output_json be unreasonable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not unreasonable, it should also be added to grapht, which doesn't have one either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That'd be lovely.
} | ||
|
||
return result; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much work would it be to move these algorithms out into util/graph.h. It seems like this might not be the only place we want to do a shortest path or bounded depth search.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already a branch by Smowton that combines my call graph pull request and his, so I think I should probably incorporate this change into that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. If the two of you get together on this I'd be happy to review.
} | ||
} | ||
INVARIANT(!worklist.empty(), "destination function not found"); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a little more like a precondition or possibly a viable user error and thus not an invariant.
#include "call_graph.h" | ||
|
||
class reachable_call_grapht: public call_grapht | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems a reasonable use of inheritance.
else | ||
call_graph.output(std::cout); | ||
call_graph.output_dot(std::cout); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some ways it would be nice to keep output as text, output_dot as dot (and requiring the dot option), output_xml as xml (and requiring the xml option) and so on.
|
||
"(aggressive-slice)" \ | ||
"(call-depth):" \ | ||
"(harness-generator):" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may have missed it, but what does this option do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that shouldn't be in this pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
/// and those that contain a given irep_idt snippet | ||
/// If no properties are set by the user, we preserve all functions on | ||
/// the shortest paths to each property. | ||
class aggressive_slicert |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should have it's own header / implementation file.
} | ||
while(!worklist.empty()); | ||
std::unordered_set<irep_idt, irep_id_hash> functions_reached= | ||
call_graph.reachable_functions(entry_point); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there other uses of the callgraph like this? It seems finding reachable functions might be a common thing to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can find, this is the only bit of code that uses this using a graph.
show_call_sequences outputs the function calls, but does it on the fly without constructing the graph.
find_used_functions obviously needs to compute the reachable functions, but again does it on the fly without constructing a graph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I guess I was suggesting that it should become an API and maybe in time these / others could be encouraged to use it.
yes will write something for the manual. Have emailed you about regression tests, not sure how to write tests for goto-instrument. I have some tests ready, but I don't know how to turn them into something that fits in the test framework. RE independence: do you mean that it doesn't duplicate functionality of existing features? I implemented this because none of the existing features would scale to the code I was looking at; we tried various combinations of the reachability slicer, the full-slicer, the global-init slicer, and nothing produced a binary small enough for cbmc to analyse. Or do you mean that it will work, independently of changes to existing functionality of CBMC? I have implemented this so that when called, goto-instrument does a slice of global-inits before performing the aggressive-slice and a reachability-slice afterwards. I could remove that feature, but it wouldn't relaly make sense to use the aggressive slicer without running the reachability slicer afterwards, as it creates a lot of unreachable code. |
There seem to be two versions of the manual: I assume I should add to the second one by editing https://github.com/diffblue/cbmc/blob/develop/doc/cprover-manual.md? |
http://www.cprover.org/cprover-manual/ is the correct one, which I believe to be generated from Regression tests : discussed offline. Independence : what I mean is "does this patch set alter the functionality of existing code or only add to it". If it is clearly only new functionality (which, one would hope, is different and distinct to existing things) then it is easier to review as we don't have to worry about breaking things. |
This pull request is dependent on:
#1586
#1509
The aggressive slicer by default removes function bodies of any functions not on the shortest path on the call graph between the start function and the property. If no property is specified, it will preseve the shortest path for each property. It is parameterisable:
--call-depth " preserves all functions within function calls of the shorets path " --preserve-function forces the aggressive slicer to preserve function
--preserve-function containing force the aggressive slicer to preserve all functions with names containing
--preserve-all-direct-paths force the aggressive slicer to preserve all functions on direct paths to the property. This option must be used with a specified property.
The aggressive slicer is designed to be used in conjunction with:
#1585 - to over-approximate any function bodies we have removed
#1566 - to block functions we do not wish to be included in paths
The results are not sound, it may produce spurious traces, due to over-approximating the removed function bodies, and it may miss traces due to the removed function bodies writing to global variables, or if "preserve-all-direct-paths" is not used. However, it is designed for use with code bases that are far too big to otherwise use cbmc on.
The parameterisation is intended to be used as a means of providing engineer feedback.