-
Notifications
You must be signed in to change notification settings - Fork 273
CONTRACTS: class dfcc_cfg_infot: loops and identifier scoping for DFCC #7671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CONTRACTS: class dfcc_cfg_infot: loops and identifier scoping for DFCC #7671
Conversation
ba28654
to
4e65b80
Compare
Codecov ReportPatch coverage has no change and project coverage change:
Additional details and impacted files@@ Coverage Diff @@
## develop #7671 +/- ##
===========================================
- Coverage 78.51% 78.51% -0.01%
===========================================
Files 1674 1674
Lines 191946 191940 -6
===========================================
- Hits 150715 150709 -6
Misses 41231 41231
☔ View full report in Codecov by Sentry. |
4d8ba1c
to
75b519a
Compare
75b519a
to
2071d5f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this first round I have just taken a look at the interface, not the implementation. I believe this needs to be changed.
const irep_idt &function_id, | ||
goto_functiont &goto_function, | ||
const exprt &top_level_write_set, | ||
const dfcc_loop_contract_modet loop_contract_mode, | ||
goto_modelt &goto_model, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are taking a function_id
, a goto_function
, and a goto_model
. How do these relate? I'd hope that the goto_function
is the value where function_id
is the key in goto_model
's function map. If so, please omit the goto_function
parameter and compute goto_function
. Else you are just asking for inconsistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the parameter goto_function
should be removed. I will fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are places in contract instrumentation where we instrument goto programs snippets that will eventually land in the function named function_id
but are not there yet. These snippets are temporarily embedded in dummy goto functions because natural loop or may alias functions only accept goto_functions as inputs.
These goto functions are not found in the function table of the goto model. Fresh symbols generated when instrumenting these dummy goto functions have to use function_id
as base name so it needs to be passed around, so does the goto model because it carries the symbol table which is needed when creating fresh symbols.
Making the suggested change will force us to rewrite large parts of the instrumentation code in dfcc_instrumentt
and dfcc_wrapper_functiont
, or to temporarily insert/remove dummy goto functions in the goto model just to be able to pass the goto model and the key instead of the goto model, the key and the goto function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, @remi-delmas-3000 is correct. And removing goto_function
will break the tests. I will undo the fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to fix this. The DFCC code is becoming inscrutable because of undocumented hacks of this kind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you look at dfcc_instrumentt
in #7541 you'll see the different entry points for instrumentation, hopefully that makes things clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed goto_model
and usesymbol_table
instead to make it less confusing.
bfc6f6c
to
39a0a11
Compare
{ | ||
// This is a complex expression. Compute the set of root objects expressions | ||
// it can assign. | ||
auto root_objects = dfcc_root_objects(lhs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about you always do this? For a symbol expression, dfcc_root_objects
will immediately return just that symbol. I think it would reduce the code complexity here, and also make behaviour more consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in the else branch, so it is not a symbol expression ... What do you mean by "more consistent" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought you could do auto root_objects = dfcc_root_objects(lhs);
at the beginning of this function and then the if
branch that currently exists at the top would be eliminated for root_objects
would equally cover symbol expressions (you would have exactly one entry in root_idents
below). I would then expect that all the use of tracked
and local
be consistent, which is not obvious from the code in the if
case above: it doesn't even look at tracked
, and I don't know why this is ok to do. Conversely, this else
branch never does dfcc_is_cprover_symbol
, and again I don't know why this is ok (not) to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've modified this a bit
39a0a11
to
428d542
Compare
428d542
to
2ae2d4a
Compare
std::unordered_set<irep_idt> loop_tracked; | ||
for(const auto &ident : loop_local) | ||
{ | ||
if(dirty(ident)) | ||
{ | ||
loop_tracked.insert(ident); | ||
} | ||
else | ||
{ | ||
// Check if this ident is touched by one of the inner loops | ||
for(const auto inner_loop_id : | ||
loop_nesting_graph.get_predecessors(loop_id)) | ||
{ | ||
if(is_assigned(ident, loop_info_map.at(inner_loop_id).assigns)) | ||
loop_tracked.insert(ident); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This piece of code is now repeated twice (with the necessary local modifications) and should be factored out into a method or function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
factored
exprt::operandst invariant_clauses = get_invariants(loop.latch); | ||
exprt::operandst decreases_clauses = get_decreases(loop.latch); | ||
exprt::operandst assigns_clauses = get_assigns(loop.latch); | ||
std::set<exprt> assigns; | ||
exprt invariant_expr = nil_exprt(); | ||
|
||
// Generate defaults for all clauses if some clause is defined | ||
if( | ||
!invariant_clauses.empty() || !decreases_clauses.empty() || | ||
!assigns_clauses.empty()) | ||
{ | ||
if(invariant_clauses.empty()) | ||
{ | ||
// use a default invariant if none given. | ||
invariant_expr = true_exprt{}; | ||
// assigns clause is missing; we will try to automatic inference | ||
log.warning() << "No invariant provided for loop " << function_id << "." | ||
<< loop.latch->loop_number << " at " | ||
<< loop.head->source_location() | ||
<< ". Using 'true' as a sound default invariant. Please " | ||
"provide an invariant the default is too weak." | ||
<< messaget::eom; | ||
} | ||
else | ||
{ | ||
// conjoin all invariant clauses | ||
invariant_expr = conjunction(invariant_clauses); | ||
} | ||
|
||
// unpack assigns clause targets | ||
if(!assigns_clauses.empty()) | ||
{ | ||
for(auto &operand : assigns_clauses) | ||
{ | ||
assigns.insert(operand); | ||
} | ||
} | ||
else | ||
{ | ||
// infer assigns clause targets if none given | ||
auto inferred = dfcc_infer_loop_assigns( | ||
local_may_alias, loop.instructions, loop.head->source_location(), ns); | ||
log.warning() << "No assigns clause provided for loop " << function_id | ||
<< "." << loop.latch->loop_number << " at " | ||
<< loop.head->source_location() << ". The inferred set {"; | ||
bool first = true; | ||
for(const auto &expr : inferred) | ||
{ | ||
if(!first) | ||
{ | ||
log.status() << ", "; | ||
} | ||
first = false; | ||
log.status() << format(expr); | ||
} | ||
log.status() << "} might be incomplete or imprecise, please provide an " | ||
"assigns clause if the analysis fails." | ||
<< messaget::eom; | ||
assigns.swap(inferred); | ||
} | ||
|
||
if(decreases_clauses.empty()) | ||
{ | ||
log.warning() << "No decrease clause provided for loop " << function_id | ||
<< "." << loop.latch->loop_number << " at " | ||
<< loop.head->source_location() | ||
<< ". Termination will not be checked." << messaget::eom; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this deserves factoring out into a method of its own. I think it would just need to take loop
as argument, and could return assigns/invariant_expr/decreases_clauses via a struct
that has those three members.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
factored
{ | ||
const auto head = check_inner_loops_have_contracts(loop_nesting_graph); | ||
if(head.has_value()) | ||
{ | ||
throw invalid_source_file_exceptiont( | ||
"Found loop without contract nested in a loop with a " | ||
"contract.\nPlease " | ||
"provide a contract or unwind this loop before applying loop " | ||
"contracts.", | ||
head.value()->source_location()); | ||
} | ||
} | ||
|
||
auto topsorted = loop_nesting_graph.topsort(); | ||
for(const auto idx : topsorted) | ||
{ | ||
topsorted_loops.push_back(idx); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all of this could be moved out of the if
body. Else its hard to understand why just tag_loop_instructions
has to live outside that body. (Alternatively, if my earlier suggestion of not tagging top-level instructions is implemented, you could safely move tag_loop_instructions
inside this if
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea we had here is that unless they are actually used, we do not impose the check_inner_loops_have_contracts
constraints.
Not tagging top-level instructions would have far reaching consequences outside of this PR. The tag has to exist even it trivial, as it represents a key in a total mapping from instructions to write sets, and is used by everything else to lookup the write set variables to use for instrumentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kept logic as-is but added comment
2ae2d4a
to
eaf9f6d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all the refactoring! I still have some questions/comments, but they seem rather minor now.
// tag remaining instructions as top level | ||
for(auto target = goto_program.instructions.begin(); | ||
target != goto_program.instructions.end(); | ||
target++) | ||
{ | ||
// Skip instructions that are already tagged (belong to loops). | ||
auto loop_id_opt = dfcc_get_loop_id(target); | ||
if(loop_id_opt.has_value()) | ||
{ | ||
continue; | ||
} | ||
dfcc_set_loop_id(target, top_level_id); | ||
dfcc_set_loop_top_level(target); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I keep wondering: why do we actually need to tag these? Can't we just leave them untagged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the instrumentation passes that follow expect to find a tag on each instruction, and use it as a key to query the write set variable from the dfcc_cfg_info class.
- Tagging all instructions amounts to using
size_t
as key type and a special value to represent the top-level scope. - Not tagging some instructions amounts to using
optionalt<size_t>
as key type, with none the for top level scope.
I think 1) allows for better sanity checks.
With 1), if you discover an instruction without tag then it is an error for sure. With 2), it could be because it is a top level instruction, or it could be a loop instruction that was skipped because of a bug somewhere.
{ | ||
messaget log(message_handler); | ||
|
||
const auto loop = loop_nesting_graph[loop_id]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit pick: I believe this will create a copy - don't you want a reference instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
|
||
// initialise defaults | ||
struct contract_clausest result; | ||
result.invariant_expr = nil_exprt(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good candidate for a (default) constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
bool first = true; | ||
for(const auto &expr : inferred) | ||
{ | ||
if(!first) | ||
{ | ||
log.status() << ", "; | ||
} | ||
first = false; | ||
log.status() << format(expr); | ||
} | ||
log.status() << "} might be incomplete or imprecise, please provide an " | ||
"assigns clause if the analysis fails." | ||
<< messaget::eom; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to use log.conditional_output(log.status(), [&](messaget::mstreamt &mstream) { ... < the above code > }
to make clear that all of this is just about logging. (I usually recommend this for debug output to improve performance; here it's more about readability.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should actually be all printed as a warning.
auto loop_id = loop_id_opt.value(); | ||
if(is_top_level_id(loop_id)) | ||
{ | ||
return top_level_write_set; | ||
} | ||
else | ||
{ | ||
return loop_info_map.at(loop_id).addr_of_write_set_var; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So does the above code still require use of one of those methods?
{ | ||
// This is a complex expression. Compute the set of root objects expressions | ||
// it can assign. | ||
auto root_objects = dfcc_root_objects(lhs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought you could do auto root_objects = dfcc_root_objects(lhs);
at the beginning of this function and then the if
branch that currently exists at the top would be eliminated for root_objects
would equally cover symbol expressions (you would have exactly one entry in root_idents
below). I would then expect that all the use of tracked
and local
be consistent, which is not obvious from the code in the if
case above: it doesn't even look at tracked
, and I don't know why this is ok to do. Conversely, this else
branch never does dfcc_is_cprover_symbol
, and again I don't know why this is ok (not) to do.
eaf9f6d
to
3f2af0f
Compare
3f2af0f
to
569804b
Compare
This class extracts loop information from a goto program and:
This PR is extracted from a larger PR #7541 and can only be tested once all features are merged.