-
Notifications
You must be signed in to change notification settings - Fork 274
[TG-9390] GOTO-symex: propagate notequal conditions for boolean types #5091
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TG-9390] GOTO-symex: propagate notequal conditions for boolean types #5091
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: 357dfbf).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/126092608
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.
Codecov Report
@@ Coverage Diff @@
## develop #5091 +/- ##
===========================================
+ Coverage 69.69% 69.7% +<.01%
===========================================
Files 1321 1320 -1
Lines 109360 109336 -24
===========================================
- Hits 76219 76208 -11
+ Misses 33141 33128 -13
Continue to review full report at Codecov.
|
357dfbf
to
2cca3cd
Compare
This is a cleaned-up version of #5089, with a few bugs fixed and tests added. I added tests in C and java take make sure it worked with the boolean types in both languages. The last commit was a part of the code from #5089 that I found wasn't needed for any of my tests. It seems to be doing the right kind of thing, so one argument is to keep it in so that if the any of the language frontends start producing GOTO like that then we'll still be able to determine the value of the symbol. The counter-argument is that it isn't tested, so it might not be doing the right thing, and it's misleading to put in code for a case that doesn't currently happen. |
2cca3cd
to
ab866c0
Compare
One test is failing (regression/cbmc/Empty_struct1). The problem is that we declare
We then condition on The only reasonable resolutions I can think of are (a) restricting |
I'd suggest only doing this for real booleans (bool_typet, 1-bit) not cbools (8-bit, conventionally either 1 or 0). That loses some cases but ensures compliance with C, and the static init fields I set out to handle are "proper" booleans |
(presumably this is because of the unrestricted typecasting in the equals case) |
@smowton Okay, that makes sense. |
82dabf3
to
6bf2df6
Compare
src/goto-symex/goto_state.cpp
Outdated
for(const auto &op : condition.operands()) | ||
apply_condition(op, previous_state, ns); | ||
} | ||
else if(condition.id() == ID_not) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ expr_try_dynamic_cast<not_exprt>
🚜
src/goto-symex/goto_state.cpp
Outdated
else if(condition.id() == ID_not) | ||
{ | ||
const exprt &operand = to_not_expr(condition).op(); | ||
if(operand.id() == ID_notequal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚜
src/goto-symex/goto_state.cpp
Outdated
previous_state, | ||
ns); | ||
} | ||
else if(operand.id() == ID_equal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚜
src/goto-symex/symex_goto.cpp
Outdated
} | ||
exprt negated_new_guard = new_guard.id() == ID_not | ||
? to_not_expr(new_guard).op() | ||
: not_exprt(new_guard); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use boolean_negate
src/goto-symex/symex_goto.cpp
Outdated
current_state, | ||
ns); | ||
} | ||
exprt negated_new_guard = new_guard.id() == ID_not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const
src/goto-symex/symex_goto.cpp
Outdated
current_state, | ||
ns); | ||
} | ||
exprt negated_new_guard = new_guard.id() == ID_not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 6bf2df6).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/126485597
6bf2df6
to
96155fb
Compare
It would be good to have some benchmark results to see if get a performance improvement from this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 96155fb).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/126563612
src/goto-symex/goto_state.cpp
Outdated
@@ -10,6 +10,7 @@ Author: Romain Brenguier, [email protected] | |||
#include "goto_symex_is_constant.h" | |||
#include "goto_symex_state.h" | |||
|
|||
#include <util/arith_tools.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used? (After a quick scan I couldn't see an obvious use)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I think I must have added it for to_integer
in an earlier version, but I'm not using that any more. I've removed it.
I benchmarked this branch vs develop on 13 methods from tika-parsers that had been chosen because they don't complete quickly. I ran with a timeout of 10 minutes, and 64.6 | 63.2 The runtime decreased by 2%, 13%, 9% and 71% respectively. The first one is within normal variation. The middle 2 show a small but noticeable improvement. But the last one shows without a doubt that this PR can drastically reduce symex run times. |
@kroening @peterschrammel @tautschnig This needs a code owner review - do any of you have time to look at it? (@smowton wrote most of it, so he probably shouldn't review it as well.) |
@owen-mc-diffblue Indeed the last bench mark is impressive. Could you kindly state which benchmark tool you used. Thanks. |
I used the benchmarking script in test-gen, written by @JohnDumbell. I used the following list of methods, provided by @smowton
The method with the impressive speed-up was |
96155fb
to
c9c7198
Compare
Conditional branches on "x != false", as well as "x" and "!x", can and should lead to constant propagation just as "x == true" does now. This is particularly beneficial for the "clinit_already_run" variables that are maintained to determine if a static initialiser should be run or not -- the current "if(already_run != true) clinit(); already_run = true;" condition now leaves symex certain that it has been run at the bottom of the function regardless of the already_run flag's initial value, whereas before it could remain uncertain and so explore the static initialiser more than is necessary.
All the main branches are now covered
c9c7198
to
5bc512a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 5bc512a).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/127177250
java::ClassWithStaticInit::clinit_already_run#[1-9][0-9] | ||
^warning: ignoring | ||
-- | ||
After the second call to ClassWithStaticInit.getStaticValue(), symex should have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe explain how many generations you expect to be per call without the optimisation. That might it easier to understand why you expect >=10 to be bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point Peter. Actually, the number of generations per call could varies depending on whether apply_condition
thinks it can set a value to be constant on one or both of the branches, and also future changes. I think it's safest to just say that we expect at least one new generation per call if symex doesn't know that clinit_already_run
is true. I have increased the number of static function calls to 20, to make absolutely sure that we'd go over if the feature isn't working properly.
5bc512a
to
d312dad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: d312dad).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/127355957
@antlechner has confirmed that regression tests are sufficient, this in addition to the benchmarks results (see above) means this is now QA approved. |
Previously we couldn't properly track when we had definitely already called them.
d312dad
to
a2cbcc2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: a2cbcc2).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/127405553
Conditional branches on "x != false", as well as "x" and "!x", can and should lead to constant
propagation just as "x == true" does now. This is particularly beneficial for the
"clinit_already_run" variables that are maintained to determine if a static initialiser should
be run or not -- the current "if(already_run != true) clinit(); already_run = true;" condition
now leaves symex certain that it has been run at the bottom of the function regardless of the
already_run flag's initial value, whereas before it could remain uncertain and so explore the
static initialiser more than is necessary.