-
Notifications
You must be signed in to change notification settings - Fork 274
Prevent multiple instrumentation of a same check and instrumentation of instrumentation assertions #6471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
b3ad1d3
to
1655871
Compare
Codecov Report
@@ Coverage Diff @@
## develop #6471 +/- ##
===========================================
+ Coverage 76.07% 76.27% +0.19%
===========================================
Files 1548 1546 -2
Lines 166359 165627 -732
===========================================
- Hits 126563 126329 -234
+ Misses 39796 39298 -498
Continue to review full report at Codecov.
|
404748e
to
1c232cb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from comments as below: can this change please be turned into multiple commits as follows:
- Introduce a message handler to
goto_check
. [no change in behaviour expected, no tests required] - Changes to the
ansi_c_parsert
API. [no change in behaviour expected, no tests required] - Adding support for
pragma CPROVER enable
[new feature, includes documentation updates and tests] - Changes to flag (re)setting [no change in behaviour expected, no tests required, could perhaps be done before the above commit.]
- Support for "checked:..." [tests expected as this avoids redundant instrumentation]
- Adding "disable:..." [tests expected]
@@ -0,0 +1,27 @@ | |||
CORE broken-smt-backend |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious: any idea why this is failing with the SMT back-end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I honestly have no idea. Seems related to some constructs being rejected by the SMT parser
^\[main\.overflow\.\d+\] line 49 arithmetic overflow on signed type conversion in \(char\)\(signed int\)i \+ 1\): FAILURE | ||
^\[main\.overflow\.\d+\] line 50 arithmetic overflow on signed \+ in j \+ 1: FAILURE | ||
-- | ||
This test uses all possible named-checks to maximize coverage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit pick: newline at end of file.
src/analyses/goto_check.cpp
Outdated
/// if the flag was already seen before. | ||
void disable(bool &flag) | ||
{ | ||
if(flag != false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if(!flag)
would seem more idiomatic.
src/analyses/goto_check.cpp
Outdated
resetter.set_flag(enable_pointer_primitive_check, false, d.first); | ||
else if(d.first == "enable:bounds-check") | ||
resetter.set_flag(enable_bounds_check, true, d.first); | ||
else if(d.first == "enable:pointer-check") | ||
resetter.set_flag(enable_pointer_check, true, d.first); | ||
else if(d.first == "enable:memory_leak-check") | ||
resetter.set_flag(enable_memory_leak_check, true, d.first); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Randomly placing my comment in this block of code: I think we should now look at refactoring this piece of code, perhaps as follows:
- string-split
id2string(d.first)
on:
- parse the first component into an enum { DISABLE, ENABLED, CHECKED }
- lookup the second component in a map that is initialised in the constructor (the map will map strings or
irep_idt
tobool*
orbool&
) - perform a
set_flag
operation on the result of that lookup, directed by the enum value.
As is, this will become a maintenance nightmare.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @tautschnig that this would be better / easier to review as several commits.
Also, not wanting to be "that guy" but can we briefly discuss the motivation for this PR? You give two interesting and relevant use cases. In the first one, if you run the instrumentation multiple times, you get multiple copies of the assertion -- I am kind of OK with this? Is avoiding this a major problem for your workflow(s)? In the second case, should goto-check be looking at code in assertions by default, or really at all? CPROVER assertions are "execution neutral" in that they are just checks, they can't affect the values of variables or alter execution[1], so why are the checks for undefined behaviour looking in them? At least shouldn't the checks on expressions within assertions do things like convert e
to !undefined_cases_of_e => e
?
[1] But what about...
int g = 0;
int f00() { __CPROVER_assert(f00() != 0) ; return ++g; }
1c232cb
to
70aada0
Compare
src/analyses/goto_check.cpp
Outdated
@@ -277,6 +278,23 @@ class goto_checkt | |||
bool enable_assumptions; | |||
bool enable_pointer_primitive_check; | |||
|
|||
/// Maps a named-check name to the corresponding boolean flag. | |||
std::map<irep_idt, bool *> name_to_flag{ | |||
{"no-enum-check", &no_enum_check}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please insert a commit at the beginning of your commit series that entirely removes no_enum_check
from goto_check.cpp? It is always false
. Seems like an unnecessary piece of code introduced in 85b90f3, probably had a meaning at some point while developing that change back then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This flag is an override to the user-controlled enable_enum_check
that is set to true before checking LHS in assignments:
// Reset the no_enum_check with the flag reset for exception
// safety
{
flag_resett resetter(i);
resetter.set_flag(no_enum_check, true, "no_enum_check");
check(assign_lhs);
}
If we remove this flag CBMC will start checking LHS expressions, I think the problem could be that these assertions will trigger when the LHS is uninitialised and can take any integer value.
src/analyses/goto_check.cpp
Outdated
} check_statust; | ||
|
||
/// optional (named-check, status) pair | ||
typedef optionalt<std::pair<irep_idt, check_statust>> named_check_statust; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CODING_STANDARD.md
says that we nowadays prefer using
over typedef
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should I also change all other typedef
in this file into using
?
src/analyses/goto_check.cpp
Outdated
/// | ||
/// \returns a pair (name, status) if the match succeeds | ||
/// and the name is known, nothing otherwise. | ||
named_check_statust match_named_check(irep_idt named_check); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const irep_idt &named_check
; also, should this method be const
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/analyses/goto_check.cpp
Outdated
|
||
/// \brief Adds "checked" pragmas for all currently active named checks | ||
/// on the given source location. | ||
void add_active_named_check_pragmas(source_locationt &source_location); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be marked const
src/analyses/goto_check.cpp
Outdated
|
||
/// \brief Adds "disable" pragmas for all named checks | ||
/// on the given source location. | ||
void add_all_disable_named_check_pragmas(source_locationt &source_location); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/analyses/goto_check.cpp
Outdated
// inhibit already checked checks | ||
for(auto flag : already_checked) | ||
resetter.disable(*flag); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you really need this already_checked
list? Can't you invoke resetter.disable(...)
directly earlier on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check pragmas have priority over enable so they must be applied last.
I have moved the priority logic inside flag_resett
src/analyses/goto_check.cpp
Outdated
void goto_checkt::add_active_named_check_pragmas( | ||
source_locationt &source_location) | ||
{ | ||
for(auto it : name_to_flag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd avoid "it" as a name for this isn't an iterator. Also, I'm never quite sure what the exact type here is, so I tend to use const auto &entry : name_to_flag
to make sure it's a const reference and not a copy.
src/analyses/goto_check.cpp
Outdated
void goto_checkt::add_all_disable_named_check_pragmas( | ||
source_locationt &source_location) | ||
{ | ||
for(auto it : name_to_flag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above: const reference.
src/analyses/goto_check.cpp
Outdated
if(s.find("enable:") == 0) | ||
{ | ||
status = check_statust::ENABLE; | ||
} | ||
else if(s.find("disable:") == 0) | ||
{ | ||
status = check_statust::DISABLE; | ||
} | ||
else if(s.find("checked:") == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use has_prefix(s, "...")
instead of s.find(...) == 0
.
src/analyses/goto_check.cpp
Outdated
return {}; | ||
} | ||
|
||
auto sep = s.find(":"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each of the above branches would actually have a constant known value for this. Perhaps avoid the duplicate find
? Alternatively, do the find
early on and then just compare the substrings up to that known position of the colon.
Hi @martin-cs , the motivation for this PR is that the function contract and loop contract workflows involve several distinct goto-instrument passes that introduce new instrumentation variables, assertions and assumptions in a goto program which then get analysed using CBMC once all transformations are applied. This ultimate analysis pass uses any automatic checks the user decides to activate (pointer-checks, overflow-checks, enum-checks, etc.). Some of this contract instrumentation is execution neutral and some of it requires extra checks:
Our contract instrumentation happens early on and in later passes, with the current goto-instrument and cbmc behaviour, it gets picked up and instrumented if it were user code. This results in an explosion of the number of VCCs (x10, from 8k to 80k on our larger benchmarks) when all checks are activated in the last stage. Most of them being useless and hurting performance badly, some of them causing systematic errors. Most importantly we cannot guarantee that any of the checks we need for the soundness of our instrumentation will eventually be performed (eventhough our template Makefile for contracts proofs has all checks activated, the user can always decide to not activate pointer overflow checks and we have no control over this). So this PR, combined with this other PR #6450 which introduces "enable" pragmas, gives us the level of control we need to ensure required checks will be instrumented exactly once, and that unnecessary checks or redundant checks will not be instrumented, while letting the user freely decide on what checks to instantiate on his own code. One thing to be noticed is that these "checked" pragmas are not exposed to the user so the possibilities for abuse are limited.
This is precisely what this PR is trying to fix. My take is that checks should be instantiated on user-provided assertions that are part of the original program, but not necessarily on all instrumentation code and assertions (though in some cases checks might be required to ensure soundness, as is the case with write set inclusion checking). A question to you : is there any fundamental difference of intent in uses of I was under the impression that
Assuming we're talking about
Can |
f37fc08
to
057a147
Compare
…rumentation. When several invocations to goto-instrument/cmbc are performed with named-check flags, the same assertions were generated several times, and assertions generated by a check A in some pass would be instrumented by a check B in some ulterior pass. "checked" pragmas allow to keep track of which checks have been performed on which instructions and are saved in the goto binaries, hence surviving across invocations of the tools. The presente of a "checked" pragma for a given check on an instruction prevents future invocations of the tools to generate again the assertions of that check. We also tag all assertions generated by goto_check with "disable" pragmas to avoid instrumenting instrumentation assertions in ulterior invocations of the tool.
057a147
to
0aaf191
Compare
0aaf191
to
4125dbe
Compare
4125dbe
to
0aaf191
Compare
@remi-delmas-3000 thank you for the detailed and thoughtful exploration of the context. I feel like I understand your workflow a bit better now. I can see that it is a problem and none of the work-flows changes I proposed will easily fix it. I am not 100% convinced there isn't an over-all better way but I think that would be a bit "global optimum" and I can see this is an improvement, even if to a local optimum.
I think they should do different things. I think |
I suppose I should try to find some time for #5866 to enshrine this. |
I would be very careful with this. Note that assert(0) does not abort when NDEBUG is set, which is often the case in production binaries. The failing assertion may then hide a genuine security vulnerability. |
This PR addresses the following issues:
goto-instrument
multiple times with a same named-check switch creates redundant assertions:--pointer-overflow-check
followed by--unsigned-overflow-check
in differentgoto-instrument
invocations results in thepointer-overflow-check
assertions to be instrumented forunsigned-overflow-check
, and causes systematic errors.Running these passes
results in this systematic error
[main.overflow.2] line 12 arithmetic overflow on unsigned + in (unsigned long int)(signed long int)OBJECT_SIZE(buf) + 18446744073709551615ul: FAILURE
The PR adds the following in
goto_check
:named-check
, either because the command line switch was used or because an "enable" pragma was added by the user in the source program, we add a pragma"checked:named-check"
to the instruction. These pragmas get saved to goto binary files and persist across tool invocations. We check for the presence of"checked:named-check"
pragmas to ensure check assertions are only generated once."disable:named-check"
pragmas for all named checks to all assertions generated bygoto_checkt::add_guarded_assertion
. This will prevent these check assertions to be instrumented by other checks in future invocations ofgoto-instrument
orcbmc
.The "checked" pragmas are reserved for internal use and are not exposed to the user in the front-end.