-
Notifications
You must be signed in to change notification settings - Fork 274
Don't use temporaries in remove_instanceof #4967
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't use temporaries in remove_instanceof #4967
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: d34916f).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/121427927
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.
Codecov Report
@@ Coverage Diff @@
## develop #4967 +/- ##
===========================================
- Coverage 69.31% 69.3% -0.01%
===========================================
Files 1308 1308
Lines 108315 108305 -10
===========================================
- Hits 75075 75065 -10
Misses 33240 33240
Continue to review full report at Codecov.
|
SVCOMP evaluation: one test made faster (goes from timing out to completing in 15s), otherwise little change. Mostly no change expected for this PR, as it will only affect accesses made after a cast but before the instance type becomes clear due to a virtual function call or similar. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good and amazingly well explained.
An improvement on one SV-COMP example is good, but it would be more reassuring to try this out with TG on some real-world benchmarks and see the effect there.
This replaces "if(x == null) result = false; else { tmp = x->@Clsid; result = (tmp == "A" || tmp == "B" || ...) with (x != null && (x->@Clsid == "A" || x->@Clsid == "B" || ...)) Advantages: * The ASSERT/ASSUME pair is more obvious when expressed as a single expression rather than a small if-block, so when asserts are converted to assumes (e.g. for coverage testing) then the redundant assume can be spotted and removed. * By operating directly on a pointer rather than using a temporary (x->@Clsid == "A" rather than id = x->@Clsid; id == "A") we enable symex's value-set filtering to discard no-longer-viable candidates for "x" (in the case where 'x' is a symbol, not a general expression like x->y->@Clsid). This means there are more clean dereference operations (ones where there is no possibility of reading a null, invalid or incorrectly-typed object), and less pessimistic aliasing assumptions possibly causing goto-symex to explore in-fact-unreachable paths. * By not creating a small if-block for every instanceof operation, we spare symex from a fork/join sequence with attendant copying of data structures Disadvantage: * By not using a temporary for the class-identifier, we force symex to carry out the dereference multiple times (i.e. x->@Clsid == "A" || x->@Clsid == "B" will become (x == &o1 ? o1 : o2).@Clsid == "A" || (x == &o1 ? o1 : o2).@Clsid == "B") This could be expensive if either the alias set or the list of allowable types is long.
d34916f
to
581540c
Compare
@romainbrenguier @JohnDumbell would either of you mind trying your benchmarks on this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: 581540c).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/122594097
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.
Ran my benchmark against it. Lot of methods on Tika it helped, one outlier that added about a second but reductions on others far outweigh that. Ran one run with the change, another with the commit reverted, so otherwise it was exactly the same. |
Awesome! Suggestion: tweak your benchmark output so the summary data is easily paste-able? For example, |
That json was always intended to be consumed by something else to make a nice shiny report, but if I don't get around to it I'll just add a summary like that somewhere. |
This replaces
"if(x == null) result = false; else { tmp = x->@clsid; result = (tmp == "A" || tmp == "B" || ...)
with
(x != null && (x->@clsid == "A" || x->@clsid == "B" || ...))
Advantages:
x->@clsid == "A"
rather thanid = x->@clsid; id == "A"
) we enable symex's value-set filtering to discard no-longer-viable candidates for "x" (in the case where 'x' is a symbol, not a general expression likex->y->@clsid
). This means there are more clean dereference operations (ones where there is no possibility of reading a null, invalid or incorrectly-typed object), and less pessimistic aliasing assumptions possibly causing goto-symex to explore in-fact-unreachable paths.Disadvantage:
x->@clsid == "A" || x->@clsid == "B"
will become(x == &o1 ? o1 : o2).@clsid == "A" || (x == &o1 ? o1 : o2).@clsid == "B")
. This could be expensive if either the alias set or the list of allowable types is long.