Skip to content

CONTRACTS: new method to detect static locals for contract instrumentation #6679

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

remi-delmas-3000
Copy link
Collaborator

@remi-delmas-3000 remi-delmas-3000 commented Feb 18, 2022

The previous method relied on building a call graph and could not be used on functions that had already been through an inlining pass.

Scoping is needed in the case of loop contracts, because we need to detect and havoc only the local statics declared in the loop scope but we should not collect static locals declared outside of the loop's scope.

We first traverses the instructions of a function (or the just the instructions forming a loop in the function), recursing into function calls.

The discovered source locations (function, line, column) are compounded into abstract intervals. We check that every instruction has a function attribute. When a column is missing we widen the location to the whole line, when a line is missing we widen the interval is widened to the whole function.

In a second pass, we collect symbols from the symbol table for which the source location is included in the intervals discovered in the first pass.

When --replace-call-with-contract is used to replace g by its contract in f, pragmas are used to mark
static locals of g in the generated instructions and make sure they are collected propagated automatically to the write set of f and any function that uses f, either calling it or by replacing the call by the contract of f.

This method also works after inlining, because the original source location of an instruction is preserved after inlining. So if f calls g, and g gets inlined in f, we can still identify g as the contributing function and collect g's locally declared static variables from the symbol table.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@remi-delmas-3000 remi-delmas-3000 added bugfix aws Bugs or features of importance to AWS CBMC users labels Feb 18, 2022
@remi-delmas-3000 remi-delmas-3000 self-assigned this Feb 18, 2022
@codecov
Copy link

codecov bot commented Feb 18, 2022

Codecov Report

Merging #6679 (7f5300e) into develop (222d266) will increase coverage by 0.02%.
The diff coverage is 97.12%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #6679      +/-   ##
===========================================
+ Coverage    76.82%   76.84%   +0.02%     
===========================================
  Files         1582     1582              
  Lines       182767   182893     +126     
===========================================
+ Hits        140403   140538     +135     
+ Misses       42364    42355       -9     
Impacted Files Coverage Δ
...nstrument/contracts/havoc_assigns_clause_targets.h 100.00% <ø> (ø)
src/goto-programs/string_abstraction.h 100.00% <ø> (ø)
unit/util/piped_process.cpp 31.29% <33.33%> (ø)
src/ansi-c/c_typecheck_expr.cpp 76.45% <50.00%> (ø)
...oto-instrument/contracts/instrument_spec_assigns.h 94.52% <90.24%> (-5.48%) ⬇️
src/goto-programs/string_abstraction.cpp 92.46% <90.90%> (ø)
src/goto-instrument/contracts/contracts.cpp 89.20% <100.00%> (-0.02%) ⬇️
...trument/contracts/havoc_assigns_clause_targets.cpp 97.18% <100.00%> (+1.62%) ⬆️
...o-instrument/contracts/instrument_spec_assigns.cpp 97.37% <100.00%> (+0.50%) ⬆️
...solvers/smt2_incremental/smt_bit_vector_theory.cpp 100.00% <100.00%> (ø)
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 79c1637...7f5300e. Read the comment docs.

@@ -22,11 +22,14 @@ Date: January 2022
#include <util/simplify_expr.h>

#include <analyses/call_graph.h>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you still need this?

Comment on lines 113 to 190

size_t line_to_size_t(const irep_idt &line)
{
PRECONDITION(!line.empty());
std::stringstream stream(id2string(line));
size_t res;
stream >> res;
return res;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can (should) use one of the utility functions from util/string2int.{h,cpp} to take care of this for you.

Comment on lines 124 to 128
const goto_programt::const_targett &begin,
const goto_programt::const_targett &end,
covered_linest &covered_lines) const
{
goto_programt::const_targett it(begin);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps just pass the iterators by value?

{
goto_programt::const_targett it(begin);

while(it != end)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a for loop might be nicer instead of doing it++ in the body of this while loop?


const auto &found = functions.function_map.find(fun_id);

if(found != functions.function_map.end())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be an INVARIANT: all functions that are called must exist. They might not have a body, but AFAIK they must exist.

Comment on lines 183 to 184
const auto &loc_fun = sym.location.get_function();
if(!loc_fun.empty())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a second layer of defense: have you also consider evaluating the symbol name to extract a function prefix from it?

Comment on lines 193 to 198
if(itv->second.contains(line))
{
dest.insert(sym.symbol_expr());
}
else if(itv->second.contains(line + 1))
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would deserve some comment to explain why line and line + 1 are relevant.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to get rid of this test by other means

Comment on lines 223 to 224
// traverses instructions the given list of instructions, updating the given
// covered lines map, recursing into function calls.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should that say "in the given list?"

// When the traversal terminates, the covered_lines map will contain one
// entry per visited function, with the associated range of lines covered
// by the traversal. Function names and line numbers are collected from source
// locations attached to the instructions.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to make this a Doxygen comment.

// included in one of the covered_lines map entries or that is -1 adjacent.
// if an adjacent entry is found, the coverage is expanded and collection is
// run again until a fixpoint is reached (this is to collect series of local
// static declarations that happen at the very begining of a function).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to make this a Doxygen comment.

@feliperodri feliperodri added the Code Contracts Function and loop contracts label Feb 18, 2022
@remi-delmas-3000 remi-delmas-3000 force-pushed the contract-detect-local-statics branch 11 times, most recently from 199b724 to 01fb4de Compare February 22, 2022 15:14
…ation

The previous method relied on building a call graph and
could not be used on functions that had already been through
an inlining pass.

This new method uses source_locations and manages
scoping by tracking the min and max source_locations for
functions that are encountered when traversing instructions.

Then, when collecting statics from the symbol table, we only keep
the ones that are found in the traversed functions and whose
source_location is within the seen intervals.

The method makes the sound pessimistic choice to widen the tracked
intervals and collect variables whenever the encountered
source_locations are undefined or only partially defined.

When replace-call-with-contract is used, extra pragmas are used
to track static locals and make sure they are propagated to the write set
of the function in which the call is replaced.
@remi-delmas-3000 remi-delmas-3000 force-pushed the contract-detect-local-statics branch from 01fb4de to 7f5300e Compare February 22, 2022 16:29
@remi-delmas-3000 remi-delmas-3000 merged commit 3410558 into diffblue:develop Feb 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws Bugs or features of importance to AWS CBMC users bugfix Code Contracts Function and loop contracts
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants