-
Notifications
You must be signed in to change notification settings - Fork 274
CONTRACTS: Several fixes for loop contracts #6701
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CONTRACTS: Several fixes for loop contracts #6701
Conversation
8cd75d5
to
973319d
Compare
Codecov Report
@@ Coverage Diff @@
## develop #6701 +/- ##
=========================================
Coverage 77.15% 77.15%
=========================================
Files 1582 1582
Lines 182628 182759 +131
=========================================
+ Hits 140900 141013 +113
- Misses 41728 41746 +18
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Declare all instrumentation DEAD after the loop, add doc on the invariant_expr() closure, use distinct variables for generated_code
before and after loop to make it easier to understand what is going on. Keep comments in sync with the implementation.
@@ -187,26 +187,7 @@ void code_contractst::check_apply_loop_contracts( | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Random place for comment:
At line 150-154 is this method to detect the loop end really reliable reliable ? It seems that the result returned by natural_loops
already contains for each loop the list of instructions forming the loop.
goto_programt::targett loop_end = loop_head;
for(const auto &t : loop)
if(
t->is_goto() && t->get_target() == loop_head &&
t->location_number > loop_end->location_number)
loop_end = t;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it captures a "set" of all the instructions (not "list"), but also, we need to find the "loop_end" instruction where we have a back edge (so that we can remove it).
As to whether this loop is reliable for finding the "loop_end", I think it is? Do you see a case where this may not work?
Other comment: the havoc statements do not contain the local static variables |
b3de959
to
4df78a2
Compare
ffa64b2
to
75e9ab8
Compare
// FIXME: this is not true for write set targets that | ||
// might depend on other write set targets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to check if we are in this case with the specified or inferred targets ?
And is widening now applied right after the targets have been inferred or still during havocing ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The widening part is not yet fixed.
I could add a separate commit to fix this PR, but is this still high priority given that we double check all inferred targets anyway by instrumenting the loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now loop contracts still use the havoc_assigns_targetst
class because it contains the widening logic.
Widening right after the inference would allow to use the new class havoc_assigns_clause_targetst
which shares the CAR snapshotting logic with assigns clause checking.
8f543a2
to
901a006
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, no blocking comments.
log.error() << "Cannot enforce contracts on irregular loop at: " | ||
<< loop_head->source_location() | ||
<< ".\nLoop body must be a contiguous set of instructions." | ||
<< messaget::eom; | ||
log.debug() << "Out-of-loop-body instruction:" << messaget::eom; | ||
goto_function.body.output_instruction(ns, function_name, log.debug(), *i); | ||
throw 0; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not covered by regression tests because is design to catch inconsistent GOTO programs? I'd add a comment if that's true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes exactly. I tried to, but couldn't write loops in C that would trigger this.
I will add a comment.
if(!loop_nesting_graph[outer].content.contains( | ||
loop_nesting_graph[inner].end_target)) | ||
{ | ||
log.error() | ||
<< "Overlapping loops at:\n" | ||
<< loop_nesting_graph[outer].head_target->source_location() | ||
<< "\nand\n" | ||
<< loop_nesting_graph[inner].head_target->source_location() | ||
<< "\nLoops must be nested or sequential for contracts to be " | ||
"enforced." | ||
<< messaget::eom; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not covered by regression tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can write loops in C which would trigger this check. It can only be triggered if someone messes with the goto instructions directly and introduces some statements in the middle of a loop.
We should have this check in place though because our instrumentation code assumes this property and would be unsound if two loop bodies somehow overlap.
to_check_contracts_on_children.push_back(idx); | ||
|
||
// If the loop has contracts to be enforced, then | ||
// reject irregular loops that have "holes" within their body, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't really have a notion of 'irregular' loop. Could we rephrase the comment and move this check in a separate method ?
We could add a doc like this for the method :
By definition the loop body is the set of instructions computed by `natural_loops` based on the CFG, and here we check that straight line code or instructions from different loops do not mix and overlap in the serialized representation.
This check is needed because we perform the instrumentation by traversing the sequence of instructions and we want to make sure we are only going to instrument loop instructions or function exit instructions.
So we check that all instructions found in sequential order between loop_head and loop_end belong to the set of instructions computed by the natural_loop analysis, while allowing the presence of:
- `assume(false)` instructions which are introduced by function pointer or nested loop transformations, and have no successor instructions
- `SET_RETURN_VALUE` instructions followed by an uninterrupted sequence of `DEAD` instructions and a `GOTO` instruction that jumps out of the loop, which model return statements in GOTO.
If in addition we could check that instructions are in topological order this would prove that this is equivalent to traversing the CFG to do the instrumentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move the loop dependency graph construction and loop instruction checking to a dedicated method with documentation, and add clarification on the case when a loop does not have a contract and we skip instrumentation (L1167)
+ Disabled loop contract instrumentation along program paths that result in vacuous loops, i.e., when the loop guard doesn't hold initially and the loop never runs. This allows for significantly simpler loop contracts in some cases. + Fixed the instrumentation for decreases clauses. It no longer requires the `source_location` hack to find the beginning of loop "body". + Added some regression tests for loops that have side effects within guards
Our loop contracts instrumentation routine makes some assumptions regarding the structure of loops. In this commit, I have added checks to verify them before we start instrumenting, so as to avoid potentially unsound instrumentation.
901a006
to
af24259
Compare
Our assigns clause instrumentation works over a sequence of instructions ignoring control flow edges. In diffblue#6701 a safety check was added to bail out on attempting to instrument loops that contain "holes" -- instructions that appear within the [head, end] sequence for the loop, but are not part of the computed "natural loop" because of control flow. In this PR, we remove this restriction and allow instrumenting such loops by masking out instructions outside of the computed natural loop.
Our assigns clause instrumentation works over a sequence of instructions ignoring control flow edges. In diffblue#6701 a safety check was added to bail out on attempting to instrument loops that contain "holes" -- instructions that appear within the [head, end] sequence for the loop, but are not part of the computed "natural loop" because of control flow. In this PR, we remove this restriction and allow instrumenting such loops by masking out instructions outside of the computed natural loop.
The main change in this PR is that loop contract instrumentation is now disabled along program paths that result in vacuous loop (i.e., the loop guard doesn't hold initially and the loop never runs).
At a high level, this is equivalent to instrumenting a:
loop as:
instead. The instrumented code would not be triggered if
G
was initiallyfalse
and we never enter the loop.This allows for significantly simpler loop contracts. A simple example is attached as a regression test in this PR.
Things get a little tricky when
G
can have side effects, and the instrumented code now jumps back and forth to avoid replicating the entireG
. I have added several comments to clarify the instrumentation process. We check all side effects in loop guards as well against the loop assigns clauses (see attached regression tests).Finally, this also allows us to move the instrumentation for decreases clauses to before the loop head (i.e., out of the loop "body"), so we no longer require the
source_location
hack to figure out where the loop guard ends and body starts.The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/My commit message includes data points confirming performance improvements (if claimed).