-
Notifications
You must be signed in to change notification settings - Fork 273
Symex-dereference: simplify after deref [blocks: #2574, #4056] #3725
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a0025af
to
6f940ce
Compare
Do not misuse symex dereferencing code for finding array objects [blocks: #3725]
6f940ce
to
743e988
Compare
fd0f92d
to
95ae2b9
Compare
This requires the byte_update fixes from #2068 to make the SMT solver happy (adding the simplify step changes a nested |
// when all we need is | ||
// s1=s1 with member:=X [and guard b] | ||
// s2=s2 with member:=X [and guard !b] | ||
do_simplify(expr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious to see what the impact on performance this change has.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will benchmark (and I hope @romainbrenguier can do his benchmarking as well) once the dependencies #2068 and #3770 are merged so that we have a precise analysis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could maybe be made faster with a simplify method which would focus on the conversions that are mentioned above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is certainly one option - are there particular patterns that affect Java? (@smowton maybe?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: this appears to be causing performance degradation; we may need to look into a specialised simplifier to achieve this.
src/goto-symex/symex_dereference.cpp
Outdated
// make sure simplify has not re-introduced any dereferencing that | ||
// had previously been cleaned away | ||
INVARIANT( | ||
!has_subexpr(expr, ID_dereference), "simplify re-introduced dereferencing"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems quite heavy to run on each call to dereference
. Do we yet have a way to avoid evaluating invariants (and other assertions) in deployed binaries of CBMC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I have now wrapped it in if(symex_config.run_validation_checks)
.
src/goto-symex/symex_dereference.cpp
Outdated
// applying field sensitivity, which can turn such field-of-symbol expressions | ||
// into atomic SSA expressions, but would have to rewrite all of 'o1' | ||
// otherwise. | ||
// make the structure of the lhs as simple as possible to avoid, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line seems redundant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment cleaned up.
95ae2b9
to
6c387cc
Compare
Thanks for running those benchmarks. Is there any evidence that this helps for Java? |
Summary here: this isn't expected to cause a performance increase by itself (and indeed will definitely cost a little more). It is primarily a pre-requisite for field-sensitive symex, because of the member-of-if -> if-of-member transformation mentioned in the comments added by this PR. I don't see a clear way to restrict simplify's activities except to keep track of expressions which are already maximally simplified and suppress repeated simplification. I've seen a couple of drafts by @tautschnig which attempt this, either by the simplifier tracking ireps it has seen and simplified, or by attaching comments to simplified ireps, or by adding an extra boolean to |
I guess that's fine, but in that case it could have been better for it to be part of the field-sensitive PR. |
FWIW, tautschnig@ef00e0a is the most serious of those. It does seem to work well, but @kroening rightfully remarked that it opens the door to everyone asking for their little pet flag. But we'll get there when the other performance-optimising PRs have been reviewed/merged so that I can start posting those changes as PRs. Yet, as @smowton said, it's one more reason I'm not at all worried about performance cost here. I have no issue merging this as part of #2574 as @romainbrenguier suggested, but I think it was important to debug and evaluate it separately as this has lead to a lot of type-related bug fixes. |
aeda45a
to
9abfc97
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 9abfc97).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/102907891
9abfc97
to
f1ce452
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: f1ce452).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/103024631
Status will be re-evaluated on next push.
Common spurious failures:
-
the cbmc commit has disappeared in the mean time (e.g. in a force-push)
-
the author is not in the list of contributors (e.g. first-time contributors).
-
the compatibility was already broken by an earlier merge.
f1ce452
to
427df5c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: 427df5c).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/103062461
Status will be re-evaluated on next push.
Common spurious failures:
-
the cbmc commit has disappeared in the mean time (e.g. in a force-push)
-
the author is not in the list of contributors (e.g. first-time contributors).
-
the compatibility was already broken by an earlier merge.
I benchmarked 133 Tika methods and found total runtime = 4818s with this PR + #4314, 4860s with just #4314, 5974s with just develop. Most of that reduction comes from 3 tests that were previously timeouts (timeout = 300s) that now complete in 30 seconds; the rest comes from 10-15s reductions in many tests that previously took 10-20s to analyse. I expected to find a major penalty for not including this PR alongside #4314, because #4314 causes |
Successfully reproduced a micro-benchmark exposing the kind of problem I was expecting: Test program: public class Test {
public static void main() {
Generic<Integer> g = new GenericSub<Integer>();
int x = 0;
for(int i = 0; i < 3000; ++i)
x += g.get();
assert x == 0;
}
}
class Generic<T> {
T key; // Only here to make sure the compiler doesn't throw <T> away
int x;
public int get() { return 0; }
public Generic() {
key = null;
x = 5;
}
}
class GenericSub<S> extends Generic<S> {
public int get() { return x; }
} Times for |
Excerpt from
|
This is generally useful because it turns member-of-if into if-of-member, which avoids repeatedly quoting large complex if expressions in a with_exprt, but it is especially good for field sensitivity as it puts member expressions directly on top of symbol expressions, which field sensitivity knows how to deal with.
This was fine with develop with and without simplification after dereferencing, but would break if removing base_type_eq without adding the simplification step.
427df5c
to
0b09ffa
Compare
Thanks a lot @smowton for the detailed analysis! I have added the test that you provided as a second commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 0b09ffa).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/103459224
This is generally useful because it turns member-of-if into if-of-member, which
avoids repeatedly quoting large complex if expressions in a with_exprt, but it
is especially good for field sensitivity as it puts member expressions directly
on top of symbol expressions, which field sensitivity knows how to deal with.
Performance evaluation is required
, but also this change shows that some tests newly fail with this change, which is unexpected. See #3724 for the first one of them.