Skip to content

Field sensitive goto-symex #2574

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Apr 16, 2019
Merged

Conversation

tautschnig
Copy link
Collaborator

@tautschnig tautschnig commented Jul 11, 2018

This is the rebased version of an old patch collection. It's likely broken at present, but should be a starting point for discussing and evaluating this.

@smowton Has put a lot of work into this, and this PR now reflects our combined effort. It is now ready for review.

@martin-cs
Copy link
Collaborator

+1 #doeswant

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments

Forall_operands(it, expr)
operator()(ns, *it, write);

if(expr.id()==ID_symbol &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_ssa_expr

bool write) const
{
#if 1
if(expr.id()!=ID_address_of)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment here why address-of is special?

{
member_exprt &member=to_member_expr(expr);

if(member.struct_op().id()==ID_symbol &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_ssa_expr


member.struct_op()=tmp.get_original_expr();

tmp.type()=member.type();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does tmp retain any aspect of the original struct-op here?

++it, ++number)
{
member_exprt member_rhs(lhs, it->get_name(), it->type());
exprt member_lhs=lhs_fs.operands()[number];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const ref

for(unsigned i=0; i<array_size_u; ++i)
{
index_rhs.index()=from_integer(i, index_type());
exprt index_lhs=lhs_fs.operands()[i];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const ref

field_assignments_rec(ns, state, target, member_lhs, member_rhs);
}
}
else if(followed_type.id()==ID_array)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth retaining the ability to represent a whole-array assignment compactly? String constants are the main thing that springs to mind -- then break it down into smaller pieces once the program starts making individual array cell assignments?

@@ -227,11 +227,117 @@ void goto_symext::symex_assign_symbol(
tmp_ssa_rhs.swap(ssa_rhs);
}

state.field_sensitivity(ns, ssa_rhs, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear whether this does anything or just accrues state -- give the function name a verb


Purpose:

\*******************************************************************/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doxygen (entire PR)

dest=struct_exprt(ssa_expr.type());
dest.reserve_operands(components.size());

for(struct_union_typet::componentst::const_iterator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ranged for (entire PR)

const struct_union_typet::componentst &components=
type.components();

assert(components.empty() ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

INVARIANT (entire PR)

(lhs_fs.has_operands() &&
lhs_fs.operands().size()==components.size()));

unsigned number=0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_t

for(exprt::operandst::const_iterator it=lhs.operands().begin();
it!=lhs.operands().end();
++it, ++fs_it)
field_assignments_rec(ns, state, target, *fs_it, *it);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

braces

mp_integer array_size;
if(to_integer(type.size(), array_size))
assert(false);
unsigned array_size_u=integer2unsigned(array_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_t?

exprt &expr,
bool write) const
{
#if 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we have command line options to turn on/off field-sensitivity for structs/arrays to make it easier to perform experiments with it?

}
#endif

std::cerr << from_expr(ns, "", ssa_rhs) << std::endl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deactivate debug output

^EXIT=0$
^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
\(Starting CEGAR Loop\|VCC(s), 0 remaining after simplification$\)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the CEGAR Loop here? ^^

@@ -234,6 +234,50 @@ void goto_symext::symex_assign_symbol(

ssa_exprt lhs_mod=lhs;

if(ssa_rhs.id()==ID_byte_update_little_endian ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a test that triggers this case?

@tautschnig tautschnig self-assigned this Sep 4, 2018
@tautschnig tautschnig force-pushed the field-sensitivity branch 3 times, most recently from ad814f9 to d5b7a9b Compare November 1, 2018 22:48
@tautschnig tautschnig requested a review from pkesseli as a code owner November 1, 2018 22:48
@tautschnig tautschnig mentioned this pull request Dec 11, 2018
5 tasks
lhs_mod = to_ssa_expr(fs_lhs);
}
#endif

do_simplify(ssa_rhs);

ssa_exprt ssa_lhs=lhs;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ssa_exprt ssa_lhs=lhs;
ssa_exprt ssa_lhs = lhs_mod;

This makes regression/cbmc/Anonymous_Struct2 pass, and a few others.

/// \param ns: a namespace to resolve type symbols/tag types
/// \param state: symbolic execution state
/// \param target: symbolic execution equation store
/// \param lhs_f: expanded symbol
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lhs_fs

@@ -453,6 +458,10 @@ bool goto_symex_statet::l2_thread_read_encoding(
(!ns.lookup(obj_identifier).is_shared() && !(dirty)(obj_identifier)))
return false;

// is it an indivisible object being accessed?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment here on why in that case the rest of this function is inapplicable

tautschnig added a commit that referenced this pull request Dec 19, 2018
Skip phi assignment if one of the merged states has an uninitialised object [blocks: #35, #2574, #3486]
Copy link
Contributor

@romainbrenguier romainbrenguier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only partially reviewed the first commit and I think it should be split into several commits.

tautschnig added a commit that referenced this pull request Jan 9, 2019
Ensure ssa_expr's l1-identifier is consistent with its identifier [blocks: #2574]
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: 7f16d25).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/107604152
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: 4c23625).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/107608202
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 092d8fb).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/107795688

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 5b86aba).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/108113827

@romainbrenguier
Copy link
Contributor

Here are the results on some benchmarks. Field sensitivity seems to scale much better. Total time was
76.41 instead of 98.24. However two benchmarks failed because of invariant violations, I will try to produce a minimal example for these failures.

chart (1)

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 098c0cb).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/108161278

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 9329cc2).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/108198940

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 5dd0b3a).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/108226499

@kroening kroening added the Epic label Apr 15, 2019
smowton and others added 10 commits April 16, 2019 11:55
field_sensitivity::apply needs to avoid expanding struct-typed symbols
into fields when they're being used as lvalues.
Struct member and array index expressions are replaced by new SSA
symbols such that updates to individual fields of structs or arrays do
not affect the SSA index of the remaining object. This enables more
fine-grained constant propagation, which should be particularly
beneficial to Java (where structs/classes prevail).
The Array_Propagation1 test is no longer solved during symex in this
case.
This makes it easier to read the difference between an atomic symbol and
a field of a composite.
Previously this could turn e.g. struct@x#y into struct.field@x#y,
regardless of whether the L2 generation #y for that field existed or
not. This meant that when an L2 name was passed through
field-sensitivity expansion it could end up referring to an unbound
symbol, whereas an L1 name would remain coherent (e.g.  struct@x ->
struct.field@x), later being renamed using the ordinary level2_names
map. That means the generation numbers for structs and their fields may
get out of sync, but that should be harmless so long as "latest
generation" is used consistently.

We can now also remove constant propagator lookup after applying field
sensitivity, because field_sensitivity::apply uses state.rename to apply
L2 renaming, which itself consults the constant propagator.

Also make is_divisible test cheaper, at the expense that synchronisation
with get_fields needs to be maintained manually.
This is because a pointer into the middle of a struct can resolve into
something like &struct.field, in which case we need to rewrite that as
the field-sensitive SSA symbol before any further dereferencing.
…updated

Otherwise e.g. in "memset2" we had the whole object being initialised
with "{0, 0}" (which is stored in const prop and then expanded to
individual fields), then later any operation that needed the
whole-struct representation (such as the memset-of-one-field in that
test, which expanded to "whole_struct = byte_update(whole_struct,
offset, int, 0)") would get the whole-struct definition stored in
const-prop, disregarding any intermediate updates that were able to
directly reference a single field.

This avoids having two different versions of a symbol (the
field-of-composite and the indivisible field versions) in the most
direct way possible.
…mations

The function name was previously incorrect (it did more than it said),
so now I just call both transformations from symex_assign.
Otherwise while everything works correctly for the state, the target
sees the field expansions before the composite definition, which is a
problem if the values are non-constant and so the expansion RHSes refer
to the composite!
…s already L2

For an L1 or lower expression, get_fields already recursed, e.g. to turn
`x` -> `{ { x..a..g, x..a..h }, x..b }` rather than `{ x..a, a..b }`
when its `a` member is itself struct-typed. However this recursion was
omitted in the case where the input expression is already L2 renamed.
@tautschnig tautschnig merged commit e681845 into diffblue:develop Apr 16, 2019
@tautschnig tautschnig deleted the field-sensitivity branch April 16, 2019 14:56
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 8cd443c).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/108538361

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Epic Needs TG approval 🦉 Only merge with explicit approval from test-gen maintainers Symbolic Execution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants