-
Notifications
You must be signed in to change notification settings - Fork 274
Symex: propagate assumptions and conditions #4386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Symex: propagate assumptions and conditions #4386
Conversation
src/goto-symex/goto_state.cpp
Outdated
if( | ||
previous_state.threads.size() > 1 && | ||
previous_state.write_is_shared(ssa_lhs, ns) == | ||
goto_symex_statet::write_is_shared_resultt::SHARED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a proper review just yet, only poking at what I guessed would be key bits:
- Why can't the goal of this PR be addressed completely by putting the RHS value in the propagation map? (And really just this, nothing else.)
- I would have assumed the one case where we cannot do this is when we're dealing with shared variables. But now the above condition suggests that it's exactly that case that we do put in the propagation map?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- According to our current merge logic, equal L2 generation numbers --> merge is a no-op. We must either introduce the possibility that despite equal L2 generation numbers the constant prop might disagree (moderately costly), or allocate an "administrative" L2 generation, as we do for constant assignment.
- I accidentally flipped the condition while refactoring, fixed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But doesn't this then introduce spurious merge assignments? And is it actually a problem if constant propagation disagrees?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(the value-set assignment... just seems like we might as well; this way @owen-jones-diffblue's value-set-filtering will have less work to do as we have established a straightforward constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra merge assignments: yes, harmless but some noise, indeed. We must do something about the constant propagator, or else the knowledge will outlive the code dominated by the branch or test.
Basically would you rather handle this through the L2 mechanism, or pay cost prop. to the size of the propagator map at each branch to merge them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's worth adding an extra set of irep_idt
that lists the constants inferred that way, and is then consulted at join points?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to do that -- will appeal to @kroening for an opinion before doing it though
864a3e0
to
2fca1b8
Compare
The win32 CI failure is inherited from #4199 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 2fca1b8).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/104441777
@@ -82,6 +82,11 @@ class goto_statet | |||
value_set = std::move(other_state.value_set); | |||
} | |||
|
|||
void apply_condition( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please briefly document this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documented at definition site
@@ -22,6 +22,28 @@ Author: Daniel Kroening, [email protected] | |||
#include <analyses/dirty.h> | |||
#include <util/simplify_expr.h> | |||
|
|||
static void apply_goto_condition( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please briefly document this function
src/goto-symex/symex_goto.cpp
Outdated
if(!symex_config.doing_path_exploration) | ||
{ | ||
// In --paths mode we would require the GOTO condition as a constraint | ||
// as well, effectively treating the branch like an assumption. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this comment. Possibly the condition is inverted by mistake.
2fca1b8
to
31451a4
Compare
@owen-jones-diffblue all comments applied |
src/goto-symex/goto_state.cpp
Outdated
/// assumption or a GOTO condition that we just passed through, we can propagate | ||
/// the constant '5' for future 'x' references. If the condition is (y == &o1) | ||
/// then we can use that to populate the value set. | ||
/// \param condition: condition that must hold on this path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should you mention that this should be L2? There's a short end-of-line comment to that effect in the declaration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
No behavioural change.
This enables us to more conveniently allocate a new generation on a non-current state, such as a state copy that will be used on a pending branch. No behavioural change.
l2_thread_write_encoding both makes a quite complicated test regarding whether a write is to be considered shared state, and also enacts some side-effects for both shared and non-shared state. This splits part of its testing into write_is_shared. No behavioural change.
241bae7
to
83ddfc2
Compare
This is now rebased and ready to go. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change looks sensible, but if the purpose is performance, it would be nice to post some performance results to at least check that there is no slowdown.
} | ||
|
||
// Try it once more, to check we also know after an unreachable catch that | ||
// there is still no inflight exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 A test with nested try-catch would be useful to prove that point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such a case actually does not work yet, because symex can't tell that all exceptions were caught (i.e. the catch dispatch table is exhaustive). The reason is it isn't yet smart enough to notice that upon entering the catch block the incoming pointer cannot be null, and therefore e instanceof Exception certainly succeeds (null instanceof A returns false for any A).
Fixing that is my next task!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could add a KNOWNBUG test for it now and turn it on as part of your next task.
The unreachable function should be successfully marked unreachable by symex, | ||
but the final uncaught exception is not yet, as it requires symex to note that | ||
within a catch block @inflight_exception is definitely *not* null, which it | ||
can't just yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand what that comment is here for. The test is already marked as CORE
, so what difference should we see when the stated limitation is resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we'll resolve all conditions in symex. I'll amend to make that clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit message for "Symex: propagate constants implied by assumptions and conditions" says the constant propagator might be populated with new information. Is it also possible that the value-set gets updated? If so, please correct the commit message.
^warning: ignoring | ||
-- | ||
The unreachable function should be successfully marked unreachable by symex, | ||
but the final uncaught exception is not yet, as it requires symex to note that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what "the final uncaught exception" means. All the exceptions appear to be caught to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It refers to the "did any exception escape at top level?" test that's always generated in __CPROVER_start
, and yes they are all caught, but symex isn't quite smart enough to determine that yet. This PR enables it to determine there's no exception in flight when we return to normal code (preventing exploration of catch blocks after future no-throw function calls), but doesn't yet enable it to determine catch block does not rethrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I'll improve the wording here)
When passing `assume(symbol == constant)` or `if symbol == constant then GOTO`, we can populate the constant propagator and value-set accordingly and use that information until the next merge point without that constraint. We implement this by allocating and defining a fresh L2 generation on this path, which will be merged as "real", assignment-derived generations are. Symbols are subject to propagation under the same conditions as they are on assignment (e.g. requiring that they are not subject to concurrent modification by other threads).
83ddfc2
to
e0d4d78
Compare
@allredj I ran this on some Tika functions and observed that a small batch of 6 functions that took a total of 248s (symex-only) took 231s with this patch. Of course that's a small test, but it gives me some confidence this will be overall beneficial. I'll run a longer benchmark tomorrow to be sure and then merge if the results are also positive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
Extended benchmark results show a slight improvement, so merging this as-is and we can implement @tautschnig's suggestion re: using different data structures to record a path-local constant if and when we want to. |
This propagates simple tests and assumptions (e.g.
ASSUME(x == 5)
, orIF y == null GOTO 1
) similar to how assignments are already propagated.My particular use case is to note that non-exceptional return from a Java function (dominated by
IF @inflight_exception != NULL
) and a catch block (terminated by@inflight_exception = NULL;
) mean that we can be certain that there is no exception still in flight when executing normal code. However as the C tests accompanying this PR indicate, the concept is more general and is likely to have other benefits by eliminating e.g. redundant return-value or error checks.This depends on #4199 (for easier L2 generation number management) and #4379 (to fix byte-update / SMT2 handling exposed by this PR), so please review only the last 5 commits.