-
Notifications
You must be signed in to change notification settings - Fork 274
Add support for assigns clauses on loops #6249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for assigns clauses on loops #6249
Conversation
7894298
to
6cc507e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle seems fine but please look at the comments.
94b03f0
to
1b34d7c
Compare
Codecov Report
@@ Coverage Diff @@
## develop #6249 +/- ##
===========================================
+ Coverage 76.01% 76.02% +0.01%
===========================================
Files 1546 1546
Lines 165274 165352 +78
===========================================
+ Hits 125634 125711 +77
- Misses 39640 39641 +1
Continue to review full report at Codecov.
|
80ab946
to
6c6f1c9
Compare
@martin-cs I have addressed some of the concerns from your first review. Please take another look if you have time and suggest any improvements / more interesting tests to try out. Thanks!! |
if(mod.id() == ID_unknown) | ||
{ | ||
throw analysis_exceptiont("Alias analysis returned UNKNOWN!"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, so what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, to guarantee soundness, in this context one would have to add every visible object to the modifies
set in this case (unknown == \top).
I raise an exception because I was not sure how to even do that. This is caught by the caller and processed in whatever way appropriate in the calling context. In our case, loop contract instrumentation calls this function and shows a message to the user to add an assigns clause on the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's not strictly speaking necessary for soundness. You could chose the empty set or some random junk. The abstraction of the loop would be unsound, but the step-case check would fail because we wouldn't be able to prove that you only wrote what we said.
Having said that though, surely this would be easier for the user to debug then having them try to debug the fail of the proof that the loop didn't write to the empty set of targets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's not strictly speaking necessary for soundness. You could chose the empty set or some random junk.
I think that would be unsound. Because an empty set would be an underapproximation and would lead to proving incorrect properties (that should actually fail). Consider for example:
a[5] = 0;
for(int i = 0; i < sz; ++i) a[i] = 1; // sz > 5
assert(a[5] == 0);
Suppose the alias analysis couldn't compute the write set of the loop, i.e, it returned unknown
on the LHS a[i]
for the statement a[i] = 1
in the loop body.
If we consider the write set of this loop to be the empty set, we would be able to prove the postcondition, but we actually shouldn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But, don't we then instrument the body with that (empty) write set and fail the step-case of the proof?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see the confusion now.
We only instrument the body if an assigns clause was explicitly specified by the user. We don't resinstrument the body if the write set was automatically inferred from it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. So that write set inference is trusted code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes exactly, all internal CBMC analyses (in this case, the alias analysis) are trusted.
regression/contracts/invar_havoc_dynamic_multi-dim_array_all_const_idx/test.desc
Show resolved
Hide resolved
|
||
add_pragma_disable_assigns_check(instructions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+100. I have no idea how this commit relates to the remainder of the PR. Does it perhaps touch code that was only introduced in this PR? If so, fold it into that commit. Where do I find tests? Why is this change being made at all?
0dbab2a
to
0ee3501
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only minor comments about documentation. Thanks for implementing all the comments!
regression/contracts/invar_havoc_dynamic_multi-dim_array_partial_const_idx/test.desc
Show resolved
Hide resolved
The write set inclusion check on havoc_object(ptr) only checked the inclusion of (ptr+0) in the write set, but performed havocing of the entire object (i.e., ptr+0 until ptr+OBJECT_SIZE(ptr)). The proposed change fixes this behavior and checks the inclusion of the entire object in the write set.
@chrisr-diffblue @martin-cs @peterschrammel could you take a look at this PR? It's missing code owner approval. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK; I am really torn on this. I don't want to hold up @SaswatPadhi or the good work done, so I have put approve BUT I think there are a number of things here that need some thought and discussion. I have put these as comments. I think the actual change is OK, it is mainly peripheral issues and bug fixes.
^SIGNAL=0$ | ||
-- | ||
-- | ||
This test fails even without the fix proposed in the commit, so it should be improved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the observable difference in this test of the fix? If none then ... can we have a test case that does have an observable difference when fixed?
Presumably it will need a short-cut operator with one operation that alters a value? What about a fancy for-else
loop like this:
int i = 0;
int i_valid_element = 1;
while (i < limit || i_valid_element = 0) {
if (a[i] == target) {
break;
}
++i;
}
if (!i_valid_element) {
realloc(a, ... );
a[i] = target;
}
return i;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the observable difference in this test of the fix?
The difference is that the old value of the loop variant was recorded in the middle of loop head instructions. Simply doing std::next(loop_head)
isn't sufficient to get to the body, where it should be recorded.
The --show-goto-functions
on the test would show the difference, but there is no observation failure in the CBMC output. I could try to come up with a test where it would lead to a failure, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you have the time / energy it would be nice. Adding test cases that don't have an observable difference is always a little iffy. I appreciate that the test case you add would cover the source lines though.
It is pretty synthetic but...
uint64_t cookie = 0;
int connection_working = 0;
while (!connection_working && (cookie != 0 || cookie = get_new_cookie())) {
connection_working = connect(cookie);
}
get_new_cookie()
has a post condition that cookie
is valid (i.e. non-zero) and connect()
has a pre condition that cookie
is valid as well. It is reasonable for the user to think that cookie is guaranteed to be valid and invariant in the loop body.
// This is necessary because complex C loop_head conditions could be | ||
// converted to multiple GOTO instructions (e.g. temporaries are introduced). | ||
const auto head_loc = loop_head->source_location(); | ||
while(loop_head->source_location() == head_loc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A. Frankly, this makes me nervous. I don't think we have any (documented, checked, folklore, etc.) invariants about source_location
and making it semantically meaningful. There are a few places that do, things like the coverage code (I think @peterschrammel is the expert here) but ... I am concerned about this.
B. I can't think of a better way of distinguishing:
while (a && b) {
...
}
from
while (a) {
if (! b) break;
}
I guess they produce the same goto_programt
?
C. Do we need some common way of marking where the loop head / body is? Should this code go into the natural_loops
detection code? Do we need a marker instruction? When working on other analysis tools the question of "what is 'in a loop'" has come up and it is a lot more complex than it first appears.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sharing some background discussion in here: I have the very same concern, and hence asked for some test to be added (the one you were wondering about) just so that at least we can investigate afterwards. We will likely have to rewrite some of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having one consistently used interface for identifying loops, the exit conditions, the exit locations, the "body", etc. would be great. Note that the syntactic and semantic definitions of loops are different.
I am happy for this to be flagged as a future issue as I think the rest of the code base is not exactly consistent on this.
havoc_if_validt havoc_gen(modifies, ns); | ||
havoc_gen.append_full_havoc_code(loop_head->source_location(), havoc_code); | ||
modifiest modifies; | ||
if(assigns.is_nil()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 what he said.
The loop variant value is supposed to be recorded just after the loop head (i.e., at the beginning of the loop body). Previously this was done by computing std::next(loop_head). However, complex C loop guards could be compiled to multiple GOTO instructions and it might be necessarey to advance the loop_head multiple times. The proposed change advanced the loop_head instruction pointer until the source_location differs. This approach might still fail if the loop guard was split across multiple lines in the source file.
Whether the first (unguarded) loop iteration should be executed "before" the base case assertion or not is still under discussion. Loop contracts on do/while loops are rejected until this is resolved.
In this commit we implement the parsing actions, type checking and conversion routines for assigns clauses on loops. The syntax rules along with regressions tests were introduced in an earlier PR: diffblue#6196.
This commit introduces a new internal pragma, called `contracts:disable:assigns-check`, and annotates statements that have already been instrumented with inclusion checks to avoid instrumenting them again.
0ee3501
to
2a9e3e2
Compare
Resolves #6021.
In this PR, we add support for assigns clauses (
__CPROVER_assigns
) on loops.