-
Notifications
You must be signed in to change notification settings - Fork 273
Symex: resolve pointer comparisons using the value-set #4444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Symex: resolve pointer comparisons using the value-set #4444
Conversation
src/goto-symex/goto_symex.h
Outdated
@@ -287,6 +287,10 @@ class goto_symext | |||
statet &, | |||
bool keep_array); | |||
|
|||
exprt try_evaluate_pointer_comparisons( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any suggestions for a better name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this one!
src/goto-symex/goto_symex.h
Outdated
@@ -333,6 +337,11 @@ class goto_symext | |||
const exprt &new_guard, | |||
const namespacet &ns); | |||
|
|||
optionalt<exprt> build_reference_to_value_set_element( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function might fit better somewhere else - perhaps in value_set_dereferencet
, since it's mainly a wrapper around value_set_dereferencet::build_reference_to
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proooobably. No strong opinion.
src/goto-symex/symex_goto.cpp
Outdated
op = try_evaluate_pointer_comparisons(std::move(op), value_set); | ||
} | ||
} | ||
else if(condition.id() == ID_not) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
anything else I should be recursing over, other than AND, OR and NOT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No I think that's enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, how about the arguments of a boolean-typed if_exprt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(e.g. x == y ? z == w : a == b
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or indeed the first arg of an if_exprt
in any context. Total recursion looking for ==
and !=
nodes might be the simplest thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I 100% think these comments through before beginning to write FYI
src/goto-symex/symex_goto.cpp
Outdated
{ | ||
symbol_exprt *symbol_expr = | ||
expr_try_dynamic_cast<symbol_exprt>(condition.op0()); | ||
exprt &constant = condition.op1(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently it assumes the symbol is on the lhs and the constant is on the rhs. I will fix that up.
src/goto-symex/symex_main.cpp
Outdated
@@ -583,6 +583,38 @@ find_unique_pointer_typed_symbol(const exprt &expr) | |||
return return_value; | |||
} | |||
|
|||
optionalt<exprt> goto_symext::build_reference_to_value_set_element( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code was in goto_symext::try_filter_value_sets
and I've pulled it out into a function to reuse it. Perhaps I'll make that a separate commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good except I believe the null case is broken, and at least needs tests.
possibly_reachable_12 = 7; | ||
|
||
int unconditionally_reachable_13; | ||
if((ptr_to_a == &a && c == 3) || c == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bear in mind these constant comparisons with c
may well be resolved very early as c
is only assigned once and never address-taken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed them to other pointer comparisons. And I now take the address of all the integers at the beginning, if that's a thing that symex cares about.
src/goto-symex/symex_goto.cpp
Outdated
expr_try_dynamic_cast<symbol_exprt>(condition.op0()); | ||
exprt &constant = condition.op1(); | ||
|
||
if(symbol_expr && goto_symex_is_constantt()(constant)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle the reverse case too (constant OP symbol)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and test this)
src/goto-symex/symex_goto.cpp
Outdated
if(symbol_expr && goto_symex_is_constantt()(constant)) | ||
{ | ||
value_setst::valuest value_set_elements; | ||
value_set.get_value_set( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assert that symbol_expr
is already L1 at least
src/goto-symex/symex_goto.cpp
Outdated
to_ssa_expr(*symbol_expr).get_l1_object(), value_set_elements, ns); | ||
|
||
INVARIANT( | ||
constant.id() == ID_address_of || constant.id() == ID_NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surprising! Null is usually represented as an ID_constant
with value = NULL -- suspect you need to add tests for this
src/goto-symex/symex_goto.cpp
Outdated
|
||
if(!unknown_or_invalid_found) | ||
{ | ||
if(!constant_found) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Give some brief comments setting out the logic here, as !constant_found
and value_set_elements.size() == 1
don't immediately chime with "x in { a, b, c }" and "x in { x }" respectively.
618b074
to
1081b00
Compare
I've addressed all of @smowton's review comments on the draft PR and got CI working (touch wood). |
1081b00
to
eb55978
Compare
|
||
// Equality "==" | ||
|
||
// A non-null pointer and the correct value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct -> matching and incorrect -> non-matching everywhere
|
||
// Disequality "!=" | ||
|
||
// A non-null pointer and the correct value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the -> a
test::1::unreachable_12[^\s]+ = 7$ | ||
^warning: ignoring | ||
-- | ||
Pointer comparisons will be resolved in symex by a mixture of constant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest adding conditions that check you check for everything you should, by making sure unreachable_1[3-9])
and unreachable[2-9][0-9]
for example don't occur anywhere (i.e., if someone added a test but not a cross-check this regex would complain)
src/goto-symex/symex_goto.cpp
Outdated
|
||
INVARIANT( | ||
other_operand.id() == ID_address_of || | ||
(typecast_expr && typecast_expr->op().id() == ID_address_of) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're not sure this is simplified you might get redundant stacked typecast ops
src/goto-symex/symex_goto.cpp
Outdated
expr_try_dynamic_cast<symbol_exprt>(symbol_expr); | ||
|
||
if(!symbol_expr_lhs) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for braces with one-liner if
s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to use them in all cases
src/goto-symex/symex_goto.cpp
Outdated
/// \param value_set: The value-set for looking up what the symbol can point to | ||
/// \param language_mode: The language mode | ||
/// \param ns: A namespace | ||
static bool try_evaluate_pointer_comparisons_base_case_with_check( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of calling this twice, suggest something like https://github.com/diffblue/cbmc/blob/develop/src/goto-symex/goto_state.cpp#L67
src/goto-symex/symex_goto.cpp
Outdated
/// \param value_set: The value-set for looking up what the symbol can point to | ||
/// \param language_mode: The language mode | ||
/// \param ns: A namespace | ||
static void try_evaluate_pointer_comparisons_base_case( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about try_evaluate_pointer_comparison
, singular?
src/goto-symex/symex_goto.cpp
Outdated
/// \param value_set: The value-set for looking up what the symbol can point to | ||
/// \param language_mode: The language mode | ||
/// \param ns: A namespace | ||
static bool try_evaluate_pointer_comparisons_base_case_with_check( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's much gained by the two functions (with_check vs. without) since they're both only called in one place
Note also that the last two commits break sharing everywhere, so we'll have to rethink how to do that. |
Suggested change: smowton@1daa2c6 That removes @owen-jones-diffblue note that also applies a couple of my suggestions above since it made it easier to code :) |
Yes that looks better. What I'm not sure about is the fact that the mutator function takes an |
eb55978
to
b20167c
Compare
Thanks @smowton, that's much better. I've cherry-picked it and squashed it in, along with fixes for all the things you mentioned. I did add an extra commit which makes There are three failing tests. The first is
in test.desc and
in vccs.desc. The other two are nondet_elements_longer_lists and nondet_elements_longer_lists_global, which are failing in the same way. I'm looking into that at the moment. |
For nondet_elements_longer_lists and nondet_elements_longer_lists_global, the test is failing to find these two lines: ```
|
@owen-jones-diffblue unsure about the unwinding assertions (yet) but the exception one is entirely desirable -- that 1 VCC remaining was a shortcoming which we have now fixed (symex can tell no exception escaped from the whole program). Just amend the test to check all conditions are symex'd away. |
@romainbrenguier I'm not sure I follow, what would your ideal interface look like? |
@owen-jones-diffblue Presumably your changes just result in more constant propagation, and thus those unwinding assertions are resolved by the simplifier already. #2574 does the same, i.e., removes those lines from the test spec. |
I would think having |
Ah sure, ok. On the one hand the tag is useful if you want to call a function that demands a |
b20167c
to
44c25df
Compare
I've marked this as |
I've cherry-picked ##4454 and added a commit to use it. Before merging this should be cleaned up. |
cbmc CORE unit tests times: unchanged |
Large Webgoat example: 55 seconds to reach a particular checkpoint (down 5 seconds) |
96d2f30
to
13b6290
Compare
Romain's Sakai benchmark: One benchmark reduced around 15 seconds (from 145 seconds), the remainder tested showed negligible difference. |
Therefore I'm pretty confident this has no noticeable cost in context of the other fairly-expensive ops we perform in symex. |
13b6290
to
ff6002b
Compare
const symbol_exprt &symbol_expr, | ||
const exprt &other_operand, | ||
const value_sett &value_set, | ||
const irep_idt language_mode, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const &
src/goto-symex/symex_goto.cpp
Outdated
/// \return If we were able to evaluate the condition as true or false then we | ||
/// return that, otherwise we return an empty optionalt | ||
static optionalt<renamedt<exprt, L2>> try_evaluate_pointer_comparison( | ||
irep_idt operation, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const &
src/goto-symex/symex_goto.cpp
Outdated
static optionalt<renamedt<exprt, L2>> try_evaluate_pointer_comparison( | ||
const renamedt<exprt, L2> &renamed_expr, | ||
const value_sett &value_set, | ||
const irep_idt language_mode, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const &
src/goto-symex/symex_goto.cpp
Outdated
static renamedt<exprt, L2> try_evaluate_pointer_comparisons( | ||
renamedt<exprt, L2> condition, | ||
const value_sett &value_set, | ||
const irep_idt language_mode, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const &
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: ff6002b).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/111618837
value_set_dereferencet::valuet is only used as the return value of value_set_dereferencet::build_reference_to. valuet::value is an expression for the object pointed to, which is what's needed for dereferencing. In other contexts, e.g. filtering value-sets, what is needed is an expression for the pointer. This logic already existed in try_filter_value_set, but putting it directly in build_reference_to makes it easier to use it in other places and makes it clear that it didn't cope with typecasts properly.
Use the value-set to evaluate pointer comparisons as true or false where possible
At time of writing, develop fails on the following lines of the test: test::1::unconditionally_reachable_1[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_1[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_6[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_6[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_7[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_7[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_8[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_8[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_9[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_9[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_10[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_10[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_11[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_11[^\s]+\)$ [FAILED] test::1::unconditionally_reachable_12[^\s]+ = \(goto_symex::\\guard[^\s]+ \? 7 : test::1::unconditionally_reachable_12[^\s]+\)$ [FAILED] test::1::unreachable_1[^\s]+ = 7$ [FAILED] test::1::unreachable_6[^\s]+ = 7$ [FAILED] test::1::unreachable_7[^\s]+ = 7$ [FAILED] test::1::unreachable_8[^\s]+ = 7$ [FAILED] test::1::unreachable_9[^\s]+ = 7$ [FAILED] test::1::unreachable_10[^\s]+ = 7$ [FAILED] test::1::unreachable_11[^\s]+ = 7$ [FAILED] test::1::unreachable_12[^\s]+ = 7$ [FAILED]
ff6002b
to
213a1b2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: 213a1b2).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/111750931
I'm afraid this actually causes spurious counterexamples being reported. I'll try to build a small regression test. |
Actually it's pretty simple:
I believe the problem is that |
Discussion on Slack summary: will revert and re-submit with a fix |
This is to guard against the regression earlier introduced in diffblue#4444 (and reverted in diffblue#4656).
This allows us to evaluate boolean expressions (including GOTO conditions) that test pointer equality or inequality (particularly "ptr == null" or "!= null") where the value-set contains sufficient information. We can already handle simple cases where
ptr
has a single known alias via constant propagation and aliasing, but not cases whereptr
is non-constant, for example because it is drawn from an array.