Skip to content

Add support for assigns clauses on loops #6249

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

aalok-thakkar
Copy link

@aalok-thakkar aalok-thakkar commented Jul 22, 2021

Resolves #6021.

In this PR, we add support for assigns clauses (__CPROVER_assigns) on loops.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • n/a The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • n/a My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@aalok-thakkar aalok-thakkar marked this pull request as draft July 22, 2021 15:22
@aalok-thakkar aalok-thakkar force-pushed the loop-contract-assigns-clauses branch 2 times, most recently from 7894298 to 6cc507e Compare July 22, 2021 15:28
@feliperodri feliperodri added aws Bugs or features of importance to AWS CBMC users Code Contracts Function and loop contracts new feature labels Jul 23, 2021
Copy link
Collaborator

@martin-cs martin-cs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle seems fine but please look at the comments.

@SaswatPadhi SaswatPadhi force-pushed the loop-contract-assigns-clauses branch 2 times, most recently from 94b03f0 to 1b34d7c Compare November 9, 2021 00:24
@codecov
Copy link

codecov bot commented Nov 9, 2021

Codecov Report

Merging #6249 (2a9e3e2) into develop (6c3d1da) will increase coverage by 0.01%.
The diff coverage is 93.28%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #6249      +/-   ##
===========================================
+ Coverage    76.01%   76.02%   +0.01%     
===========================================
  Files         1546     1546              
  Lines       165274   165352      +78     
===========================================
+ Hits        125634   125711      +77     
- Misses       39640    39641       +1     
Impacted Files Coverage Δ
src/ansi-c/c_typecheck_base.h 100.00% <ø> (ø)
src/goto-instrument/contracts/contracts.h 100.00% <ø> (ø)
src/goto-instrument/contracts/utils.h 100.00% <ø> (ø)
src/ansi-c/c_typecheck_code.cpp 78.66% <80.00%> (+0.07%) ⬆️
src/ansi-c/parser.y 79.28% <83.33%> (+0.01%) ⬆️
src/goto-instrument/contracts/contracts.cpp 96.74% <96.42%> (+0.61%) ⬆️
src/ansi-c/c_typecheck_base.cpp 78.23% <100.00%> (-0.02%) ⬇️
src/goto-instrument/contracts/assigns.cpp 96.51% <100.00%> (+0.04%) ⬆️
src/goto-instrument/contracts/utils.cpp 92.85% <100.00%> (+1.02%) ⬆️
src/goto-instrument/loop_utils.cpp 91.42% <100.00%> (+0.51%) ⬆️
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 516f109...2a9e3e2. Read the comment docs.

@SaswatPadhi SaswatPadhi force-pushed the loop-contract-assigns-clauses branch from 80ab946 to 6c6f1c9 Compare November 9, 2021 04:43
@SaswatPadhi SaswatPadhi marked this pull request as ready for review November 9, 2021 04:43
@SaswatPadhi SaswatPadhi changed the title Added support for assigns clauses on loop contracts Add support for assigns clauses on loop contracts Nov 9, 2021
@SaswatPadhi SaswatPadhi changed the title Add support for assigns clauses on loop contracts Add support for assigns clauses on loops Nov 9, 2021
@SaswatPadhi
Copy link
Contributor

@martin-cs I have addressed some of the concerns from your first review. Please take another look if you have time and suggest any improvements / more interesting tests to try out. Thanks!!

Comment on lines +54 to +57
if(mod.id() == ID_unknown)
{
throw analysis_exceptiont("Alias analysis returned UNKNOWN!");
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so what?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, to guarantee soundness, in this context one would have to add every visible object to the modifies set in this case (unknown == \top).

I raise an exception because I was not sure how to even do that. This is caught by the caller and processed in whatever way appropriate in the calling context. In our case, loop contract instrumentation calls this function and shows a message to the user to add an assigns clause on the loop.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's not strictly speaking necessary for soundness. You could chose the empty set or some random junk. The abstraction of the loop would be unsound, but the step-case check would fail because we wouldn't be able to prove that you only wrote what we said.

Having said that though, surely this would be easier for the user to debug then having them try to debug the fail of the proof that the loop didn't write to the empty set of targets.

Copy link
Contributor

@SaswatPadhi SaswatPadhi Nov 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's not strictly speaking necessary for soundness. You could chose the empty set or some random junk.

I think that would be unsound. Because an empty set would be an underapproximation and would lead to proving incorrect properties (that should actually fail). Consider for example:

a[5] = 0;
for(int i = 0; i < sz; ++i) a[i] = 1; // sz > 5
assert(a[5] == 0);

Suppose the alias analysis couldn't compute the write set of the loop, i.e, it returned unknown on the LHS a[i] for the statement a[i] = 1 in the loop body.

If we consider the write set of this loop to be the empty set, we would be able to prove the postcondition, but we actually shouldn't.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, don't we then instrument the body with that (empty) write set and fail the step-case of the proof?

Copy link
Contributor

@SaswatPadhi SaswatPadhi Nov 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see the confusion now.
We only instrument the body if an assigns clause was explicitly specified by the user. We don't resinstrument the body if the write set was automatically inferred from it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. So that write set inference is trusted code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly, all internal CBMC analyses (in this case, the alias analysis) are trusted.

Comment on lines 120 to 123

add_pragma_disable_assigns_check(instructions);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+100. I have no idea how this commit relates to the remainder of the PR. Does it perhaps touch code that was only introduced in this PR? If so, fold it into that commit. Where do I find tests? Why is this change being made at all?

Copy link
Collaborator

@feliperodri feliperodri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only minor comments about documentation. Thanks for implementing all the comments!

The write set inclusion check on havoc_object(ptr) only checked the
inclusion of (ptr+0) in the write set, but performed havocing of the
entire object (i.e., ptr+0 until ptr+OBJECT_SIZE(ptr)).
The proposed change fixes this behavior and checks the inclusion of the
entire object in the write set.
@feliperodri
Copy link
Collaborator

@chrisr-diffblue @martin-cs @peterschrammel could you take a look at this PR? It's missing code owner approval.

Copy link
Member

@peterschrammel peterschrammel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Collaborator

@martin-cs martin-cs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK; I am really torn on this. I don't want to hold up @SaswatPadhi or the good work done, so I have put approve BUT I think there are a number of things here that need some thought and discussion. I have put these as comments. I think the actual change is OK, it is mainly peripheral issues and bug fixes.

^SIGNAL=0$
--
--
This test fails even without the fix proposed in the commit, so it should be improved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the observable difference in this test of the fix? If none then ... can we have a test case that does have an observable difference when fixed?

Presumably it will need a short-cut operator with one operation that alters a value? What about a fancy for-else loop like this:

int i = 0;
int i_valid_element = 1;

while (i < limit || i_valid_element = 0) {
  if (a[i] == target) {
    break;
  }
  ++i;
}

if (!i_valid_element) {
  realloc(a, ... );
  a[i] = target;
}

return i;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the observable difference in this test of the fix?

The difference is that the old value of the loop variant was recorded in the middle of loop head instructions. Simply doing std::next(loop_head) isn't sufficient to get to the body, where it should be recorded.

The --show-goto-functions on the test would show the difference, but there is no observation failure in the CBMC output. I could try to come up with a test where it would lead to a failure, though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have the time / energy it would be nice. Adding test cases that don't have an observable difference is always a little iffy. I appreciate that the test case you add would cover the source lines though.

It is pretty synthetic but...

uint64_t cookie = 0;
int connection_working = 0;
while (!connection_working && (cookie != 0 || cookie = get_new_cookie())) {
  connection_working = connect(cookie);
}

get_new_cookie() has a post condition that cookie is valid (i.e. non-zero) and connect() has a pre condition that cookie is valid as well. It is reasonable for the user to think that cookie is guaranteed to be valid and invariant in the loop body.

// This is necessary because complex C loop_head conditions could be
// converted to multiple GOTO instructions (e.g. temporaries are introduced).
const auto head_loc = loop_head->source_location();
while(loop_head->source_location() == head_loc)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A. Frankly, this makes me nervous. I don't think we have any (documented, checked, folklore, etc.) invariants about source_location and making it semantically meaningful. There are a few places that do, things like the coverage code (I think @peterschrammel is the expert here) but ... I am concerned about this.

B. I can't think of a better way of distinguishing:

while (a && b) {
...
}

from

while (a) {
 if (! b) break;
}

I guess they produce the same goto_programt?

C. Do we need some common way of marking where the loop head / body is? Should this code go into the natural_loops detection code? Do we need a marker instruction? When working on other analysis tools the question of "what is 'in a loop'" has come up and it is a lot more complex than it first appears.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sharing some background discussion in here: I have the very same concern, and hence asked for some test to be added (the one you were wondering about) just so that at least we can investigate afterwards. We will likely have to rewrite some of this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having one consistently used interface for identifying loops, the exit conditions, the exit locations, the "body", etc. would be great. Note that the syntactic and semantic definitions of loops are different.

I am happy for this to be flagged as a future issue as I think the rest of the code base is not exactly consistent on this.

havoc_if_validt havoc_gen(modifies, ns);
havoc_gen.append_full_havoc_code(loop_head->source_location(), havoc_code);
modifiest modifies;
if(assigns.is_nil())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 what he said.

The loop variant value is supposed to be recorded just after the loop
head (i.e., at the beginning of the loop body). Previously this was
done by computing std::next(loop_head). However, complex C loop guards
could be compiled to multiple GOTO instructions and it might be
necessarey to advance the loop_head multiple times.

The proposed change advanced the loop_head instruction pointer until the
source_location differs. This approach might still fail if the loop
guard was split across multiple lines in the source file.
Whether the first (unguarded) loop iteration should be executed "before"
the base case assertion or not is still under discussion.
Loop contracts on do/while loops are rejected until this is resolved.
In this commit we implement the parsing actions, type checking and
conversion routines for assigns clauses on loops. The syntax rules along
with regressions tests were introduced in an earlier PR: diffblue#6196.
This commit introduces a new internal pragma, called
`contracts:disable:assigns-check`, and annotates statements that have
already been instrumented with inclusion checks to avoid instrumenting
them again.
@SaswatPadhi SaswatPadhi force-pushed the loop-contract-assigns-clauses branch from 0ee3501 to 2a9e3e2 Compare November 23, 2021 17:04
@SaswatPadhi SaswatPadhi merged commit e6b3118 into diffblue:develop Nov 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws Bugs or features of importance to AWS CBMC users aws-high Code Contracts Function and loop contracts new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crash in enforcing loop invariant contract due to imprecise alias analysis
8 participants