Skip to content

Don't phi-merge uninitialised objects #2168

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Binary file not shown.
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
public class PhiMergeUninitialized {

public int dynamicAllocationUninitialized(Boolean trigger) {

Ephemeral obj;
if (trigger) {
obj = new Ephemeral(42);
} else {
obj = new Aetherial(20);
}

assert obj.val == 20;
return obj.val;
}

private Ephemeral field;

public int fieldUninitialized(Boolean trigger) {
if (trigger) {
field = new Ephemeral(42);
}

assert field.val == 42;
return field.val;
}

private static Ephemeral staticField;

public int staticFieldUninitialized(Boolean trigger) {
if (trigger) {
staticField = new Ephemeral(42);
} else {
staticField = new Aetherial(76);
}

assert staticField.val == 76;
return staticField.val;
}

class Ephemeral {
Ephemeral(int value) {
val = value;
}

int val;
}

class Aetherial extends Ephemeral {
Aetherial(int value) {
super(value);
}
}
}

16 changes: 16 additions & 0 deletions regression/cbmc-java/phi-merge_uninitialized_values/field.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
CORE
PhiMergeUninitialized.class
--function PhiMergeUninitialized.fieldUninitialized --show-vcc
activate-multi-line-match
^EXIT=0$
^SIGNAL=0$
--
^.*:\s+(dynamic_object|new_tmp)[0-9]+(@[0-9]+)?#0\)
^.*\?\s+(dynamic_object|new_tmp)[0-9]+(@[0-9]+)?#0\s+:
--
These regexes are making sure that a variable of generation 0 dosen't appear in a phi merge, so the below
statement:

(guard ? dynamic_object2@3#1 : dynamic_object3@3#0)

Should not appear. First regex deals with when the gen zero is the latter position, second when it's in the former.
16 changes: 16 additions & 0 deletions regression/cbmc-java/phi-merge_uninitialized_values/local.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
CORE
PhiMergeUninitialized.class
--function PhiMergeUninitialized.dynamicAllocationUninitialized --show-vcc
activate-multi-line-match
^EXIT=0$
^SIGNAL=0$
--
^.*:\s+(dynamic_object|new_tmp)[0-9]+(@[0-9]+)?#0\)
^.*\?\s+(dynamic_object|new_tmp)[0-9]+(@[0-9]+)?#0\s+:
--
These regexes are making sure that a variable of generation 0 dosen't appear in a phi merge, so the below
statement:

(guard ? dynamic_object2@3#1 : dynamic_object3@3#0)

Should not appear. First regex deals with when the gen zero is the latter position, second when it's in the former.
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
CORE
PhiMergeUninitialized.class
--function PhiMergeUninitialized.staticFieldUninitialized --show-vcc
activate-multi-line-match
^EXIT=0$
^SIGNAL=0$
--
^.*:\s+(dynamic_object|new_tmp)[0-9]+(@[0-9]+)?#0\)
^.*\?\s+(dynamic_object|new_tmp)[0-9]+(@[0-9]+)?#0\s+:
--
These regexes are making sure that a variable of generation 0 dosen't appear in a phi merge, so the below
statement:

(guard ? dynamic_object2@3#1 : dynamic_object3@3#0)

Should not appear. First regex deals with when the gen zero is the latter position, second when it's in the former.
16 changes: 16 additions & 0 deletions regression/cbmc/phi-merge_uninitialized_values/dynamic.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
CORE
test.c
--function dynamicAllocationUninitialized --show-vcc
activate-multi-line-match
^EXIT=0$
^SIGNAL=0$
--
^.*:\s+dynamic_object[0-9]+(@[0-9]+)?#0\)
^.*\?\s+dynamic_object[0-9]+(@[0-9]+)?#0\s+:
--
These regexes are making sure that a variable of generation 0 dosen't appear in a phi merge, so the below
statement:

(guard ? dynamic_object2@3#1 : dynamic_object3@3#0)

Should not appear. First regex deals with when the gen zero is the latter position, second when it's in the former.
16 changes: 16 additions & 0 deletions regression/cbmc/phi-merge_uninitialized_values/global.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
CORE
test.c
--function globalUninitialized --show-vcc
activate-multi-line-match
^EXIT=0$
^SIGNAL=0$
--
^.*:\s+global(@[0-9]+)?#0\)
^.*\?\s+global(@[0-9]+)?#0\s+:
--
These regexes are making sure that a variable of generation 0 dosen't appear in a phi merge, so the below
statement:

(guard ? global#1 : global#0)

Should not appear. First regex deals with when the gen zero is the latter position, second when it's in the former.
16 changes: 16 additions & 0 deletions regression/cbmc/phi-merge_uninitialized_values/local.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
CORE
test.c
--function localUninitialized --show-vcc
activate-multi-line-match
^EXIT=0$
^SIGNAL=0$
--
^.*:\s+local(@[0-9]+)?#0\)
^.*\?\s+local(@[0-9]+)?#0\s+:
--
These regexes are making sure that a variable of generation 0 dosen't appear in a phi merge, so the below
statement:

(guard ? local#1 : local#0)

Should not appear. First regex deals with when the gen zero is the latter position, second when it's in the former.
16 changes: 16 additions & 0 deletions regression/cbmc/phi-merge_uninitialized_values/static_local.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
CORE
test.c
--function staticLocalUninitialized --show-vcc
activate-multi-line-match
^EXIT=0$
^SIGNAL=0$
--
^.*:\s+staticLocal(@[0-9]+)?#0\)
^.*\?\s+staticLocal(@[0-9]+)?#0\s+:
--
These regexes are making sure that a variable of generation 0 dosen't appear in a phi merge, so the below
statement:

(guard ? staticLocal#1 : dynamic_object#0)

Should not appear. First regex deals with when the gen zero is the latter position, second when it's in the former.
56 changes: 56 additions & 0 deletions regression/cbmc/phi-merge_uninitialized_values/test.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
#include <assert.h>
#include <stdlib.h>

void dynamicAllocationUninitialized(int trigger)
{
int *obj;
if(trigger)
{
obj = (int *)malloc(sizeof(int));
*obj = 42;
}
else
{
obj = (int *)malloc(sizeof(int));
*obj = 76;
}

assert(*obj == 42);
}

int global;
void globalUninitialized(int trigger)
{
if(trigger)
{
global = 44;
}
else
{
global = 20;
}

assert(global == 44);
}

void staticLocalUninitialized(int trigger)
{
static int staticLocal;
if(trigger)
{
staticLocal = 43;
}

assert(staticLocal == 43);
}

void localUninitialized(int trigger)
{
int local;
if(trigger)
{
local = 24;
}

assert(local == 24);
}
2 changes: 1 addition & 1 deletion regression/test.pl
Original file line number Diff line number Diff line change
Expand Up @@ -174,7 +174,7 @@ ($$$$$$$$$)
binmode $fh;
my $whole_file = <$fh>;
$whole_file =~ s/\r\n/\n/g;
my $is_match = $whole_file =~ /$result/;
my $is_match = $whole_file =~ /$result/m;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, but could it please be put in a separate commit? The explanation in the PR is absolutely necessary to understand this, but will never make it to the commit history at this stage.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for factoring it out into a separate commit. Could you please also add the text from the PR to body of the commit message? It's really informative, and should be preserved!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, fixed. Still getting fully used to git.

$r = ($included ? !$is_match : $is_match);
}
else
Expand Down
10 changes: 8 additions & 2 deletions src/goto-symex/goto_symex.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -22,13 +22,19 @@ void goto_symext::do_simplify(exprt &expr)
simplify(expr, ns);
}

nondet_symbol_exprt goto_symext::build_symex_nondet(typet &type)
{
irep_idt identifier = "symex::nondet" + std::to_string(nondet_count++);
nondet_symbol_exprt new_expr(identifier, type);
return new_expr;
}

void goto_symext::replace_nondet(exprt &expr)
{
if(expr.id()==ID_side_effect &&
expr.get(ID_statement)==ID_nondet)
{
irep_idt identifier="symex::nondet"+std::to_string(nondet_count++);
nondet_symbol_exprt new_expr(identifier, expr.type());
nondet_symbol_exprt new_expr = build_symex_nondet(expr.type());
new_expr.add_source_location()=expr.source_location();
expr.swap(new_expr);
}
Expand Down
2 changes: 2 additions & 0 deletions src/goto-symex/goto_symex.h
Original file line number Diff line number Diff line change
Expand Up @@ -376,6 +376,8 @@ class goto_symext
exprt &code,
const irep_idt &identifier);

nondet_symbol_exprt build_symex_nondet(typet &type);

// exceptions
void symex_throw(statet &);
void symex_catch(statet &);
Expand Down
6 changes: 6 additions & 0 deletions src/goto-symex/symex_builtin_functions.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -184,6 +184,12 @@ void goto_symext::symex_allocate(
else
throw "failed to zero initialize dynamic object";
}
else
{
exprt nondet = build_symex_nondet(object_type);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point of interest: you've copied the case for x = NONDET(int) rather than the case from code_declt("x", java_int_type()) for example. Check how their behaviour differs if at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see any major deviations - the code isn't identical but the intentions seem the same and most of the code in decl is run in assign at some point too. One thing to note is that assign propagates the nondet as a constant, while the decl code flat-out disables any propagation, would that be an issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll only disable it for the time being, but it's interesting. I'm not quite well informed enough to give an authoritative answer -- I suggest running this by @peterschrammel and post his answer here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue that this is what #35 is about, and you're doing parts of that here, even though the full value only unfolds once you break down nondet symbols into their individual fields.

code_assignt assignment(value_symbol.symbol_expr(), nondet);
symex_assign(state, assignment);
}

exprt rhs;

Expand Down
12 changes: 12 additions & 0 deletions src/goto-symex/symex_goto.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -425,10 +425,22 @@ void goto_symext::phi_function(

exprt rhs;

// Don't add a conditional to the assignment when:
// 1. Either guard is false, so we can't follow that branch.
// 2. Either identifier is of generation zero, and so hasn't been
// initialized and therefor an invalid target.
if(dest_state.guard.is_false())
rhs=goto_state_rhs;
else if(goto_state.guard.is_false())
rhs=dest_state_rhs;
else if(goto_state.level2_current_count(l1_identifier) == 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comments briefly explaining why these checks are here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment block at the top of the if.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

{
rhs = dest_state_rhs;
}
else if(dest_state.level2.current_count(l1_identifier) == 0)
{
rhs = goto_state_rhs;
}
else
{
rhs=if_exprt(diff_guard.as_expr(), goto_state_rhs, dest_state_rhs);
Expand Down