Skip to content

TG-58 Java new array data in symex #1425

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
67 changes: 16 additions & 51 deletions src/goto-programs/builtin_functions.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -569,35 +569,15 @@ void goto_convertt::do_java_new(
const side_effect_exprt &rhs,
goto_programt &dest)
{
if(lhs.is_nil())
{
error().source_location=lhs.find_source_location();
error() << "do_java_new without lhs is yet to be implemented" << eom;
throw 0;
}

PRECONDITION(!lhs.is_nil());
PRECONDITION(rhs.operands().empty());
PRECONDITION(rhs.type().id() == ID_pointer);
source_locationt location=rhs.source_location();

assert(rhs.operands().empty());

if(rhs.type().id()!=ID_pointer)
{
error().source_location=rhs.find_source_location();
error() << "do_java_new returns pointer" << eom;
throw 0;
}

typet object_type=rhs.type().subtype();

// build size expression
exprt object_size=size_of_expr(object_type, ns);

if(object_size.is_nil())
{
error().source_location=rhs.find_source_location();
error() << "do_java_new got nil object_size" << eom;
throw 0;
}
CHECK_RETURN(object_size.is_not_nil());

// we produce a malloc side-effect, which stays
side_effect_exprt malloc_expr(ID_malloc);
Expand All @@ -624,36 +604,18 @@ void goto_convertt::do_java_new_array(
const side_effect_exprt &rhs,
goto_programt &dest)
{
if(lhs.is_nil())
{
error().source_location=lhs.find_source_location();
error() << "do_java_new_array without lhs is yet to be implemented"
<< eom;
throw 0;
}
PRECONDITION(!lhs.is_nil()); // do_java_new_array without lhs not implemented
PRECONDITION(rhs.operands().size() >= 1); // one per dimension
PRECONDITION(rhs.type().id() == ID_pointer);

source_locationt location=rhs.source_location();

assert(rhs.operands().size()>=1); // one per dimension

if(rhs.type().id()!=ID_pointer)
{
error().source_location=rhs.find_source_location();
error() << "do_java_new_array returns pointer" << eom;
throw 0;
}

typet object_type=rhs.type().subtype();
PRECONDITION(ns.follow(object_type).id() == ID_struct);

// build size expression
exprt object_size=size_of_expr(object_type, ns);

if(object_size.is_nil())
{
error().source_location=rhs.find_source_location();
error() << "do_java_new_array got nil object_size" << eom;
throw 0;
}
CHECK_RETURN(!object_size.is_nil());

// we produce a malloc side-effect, which stays
side_effect_exprt malloc_expr(ID_malloc);
Expand All @@ -664,9 +626,12 @@ void goto_convertt::do_java_new_array(
t_n->code=code_assignt(lhs, malloc_expr);
t_n->source_location=location;

assert(ns.follow(object_type).id()==ID_struct);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come this assertion is no longer required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already checked by to_struct_type on the next line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and also line 613

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it is sometimes useful to pull the check out to allow you to document in this specific case why it must be a struct, but this is a preference.

const struct_typet &struct_type=to_struct_type(ns.follow(object_type));
assert(struct_type.components().size()==3);

// Ideally we would have a check for `is_valid_java_array(struct_type)` but
// `is_valid_java_array is part of the java_bytecode module and we cannot
// introduce such dependencies. We do this simple check instead:
PRECONDITION(struct_type.components().size()==3);

// Init base class:
dereference_exprt deref(lhs, object_type);
Expand Down Expand Up @@ -696,7 +661,6 @@ void goto_convertt::do_java_new_array(
// Allocate a (struct realtype**) instead of a (void**) if possible.
const irept &given_element_type=object_type.find(ID_C_element_type);
typet allocate_data_type;
exprt cast_data_member;
if(given_element_type.is_not_nil())
{
allocate_data_type=
Expand All @@ -705,7 +669,8 @@ void goto_convertt::do_java_new_array(
else
allocate_data_type=data.type();

side_effect_exprt data_java_new_expr(ID_java_new_array, allocate_data_type);
side_effect_exprt data_java_new_expr(
ID_java_new_array_data, allocate_data_type);

// The instruction may specify a (hopefully small) upper bound on the
// array size, in which case we allocate a fixed-length array that may
Expand Down
6 changes: 4 additions & 2 deletions src/goto-programs/goto_convert.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -760,8 +760,10 @@ void goto_convertt::convert_assign(

do_java_new_array(lhs, to_side_effect_expr(rhs), dest);
}
else if(rhs.id()==ID_side_effect &&
rhs.get(ID_statement)==ID_malloc)
else if(
rhs.id() == ID_side_effect &&
(rhs.get(ID_statement) == ID_malloc ||
rhs.get(ID_statement) == ID_java_new_array_data))
{
// just preserve
Forall_operands(it, rhs)
Expand Down
12 changes: 4 additions & 8 deletions src/goto-programs/goto_trace.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,9 @@ void goto_trace_stept::output(
case goto_trace_stept::typet::DEAD: out << "DEAD"; break;
case goto_trace_stept::typet::OUTPUT: out << "OUTPUT"; break;
case goto_trace_stept::typet::INPUT: out << "INPUT"; break;
case goto_trace_stept::typet::ATOMIC_BEGIN: out << "ATOMC_BEGIN"; break;
case goto_trace_stept::typet::ATOMIC_BEGIN:
out << "ATOMIC_BEGIN";
break;
case goto_trace_stept::typet::ATOMIC_END: out << "ATOMIC_END"; break;
case goto_trace_stept::typet::SHARED_READ: out << "SHARED_READ"; break;
case goto_trace_stept::typet::SHARED_WRITE: out << "SHARED WRITE"; break;
Expand Down Expand Up @@ -91,7 +93,7 @@ void goto_trace_stept::output(

out << "\n";

if(pc->is_other() || pc->is_assign())
if((pc->is_other() && lhs_object.is_not_nil()) || pc->is_assign())
{
irep_idt identifier=lhs_object.get_object_name();
out << " " << from_expr(ns, identifier, lhs_object.get_original_expr())
Expand Down Expand Up @@ -386,14 +388,8 @@ void show_goto_trace(
break;

case goto_trace_stept::typet::CONSTRAINT:
UNREACHABLE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of change probably at least requires something in the commit message as to why these are now reasonable states to reach (Someone obviously thought they were unreachable before...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UNREACHABLE is still present (line 392)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah of course as it now falls through

break;

case goto_trace_stept::typet::SHARED_READ:
case goto_trace_stept::typet::SHARED_WRITE:
UNREACHABLE;
break;

default:
UNREACHABLE;
}
Expand Down
14 changes: 11 additions & 3 deletions src/goto-programs/interpreter_evaluate.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -21,8 +21,8 @@ Author: Daniel Kroening, [email protected]
#include <util/pointer_offset_size.h>
#include <string.h>

/// reads a memory address and loads it into the dest variable marks cell as
/// read before written if cell has never been written
/// Reads a memory address and loads it into the `dest` variable.
/// Marks cell as `READ_BEFORE_WRITTEN` if cell has never been written.
void interpretert::read(
const mp_integer &address,
mp_vectort &dest) const
Expand Down Expand Up @@ -102,7 +102,12 @@ void interpretert::clear_input_flags()
}
}

/// \return Number of leaf primitive types; returns true on error
/// Count the number of leaf subtypes of `ty`, a leaf type is a type that is
/// not an array or a struct. For instance the count for a type such as
/// `struct { (int[3])[5]; int }` would be 16 = (3 * 5 + 1).
/// \param ty: a type
/// \param [out] result: Number of leaf primitive types in `ty`
/// \return returns true on error
bool interpretert::count_type_leaves(const typet &ty, mp_integer &result)
{
if(ty.id()==ID_struct)
Expand Down Expand Up @@ -299,6 +304,9 @@ bool interpretert::memory_offset_to_byte_offset(
}
}

/// Evaluate an expression
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this summary adds much, consider elaborating on what evaluating an expression means

/// \param expr: expression to evaluate
/// \param [out] dest: vector in which the result of the evaluation is stored
void interpretert::evaluate(
const exprt &expr,
mp_vectort &dest)
Expand Down
6 changes: 3 additions & 3 deletions src/goto-symex/symex_assign.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -59,9 +59,9 @@ void goto_symext::symex_assign(

throw "symex_assign: unexpected function call: "+id2string(identifier);
}
else if(statement==ID_cpp_new ||
statement==ID_cpp_new_array ||
statement==ID_java_new_array)
else if(
statement == ID_cpp_new || statement == ID_cpp_new_array ||
statement == ID_java_new_array_data)
symex_cpp_new(state, lhs, side_effect_expr);
else if(statement==ID_malloc)
symex_malloc(state, lhs, side_effect_expr);
Expand Down
7 changes: 4 additions & 3 deletions src/goto-symex/symex_builtin_functions.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -376,8 +376,9 @@ void goto_symext::symex_cpp_new(
if(code.type().id()!=ID_pointer)
throw "new expected to return pointer";

do_array=(code.get(ID_statement)==ID_cpp_new_array ||
code.get(ID_statement)==ID_java_new_array);
do_array =
(code.get(ID_statement) == ID_cpp_new_array ||
code.get(ID_statement) == ID_java_new_array_data);

dynamic_counter++;

Expand All @@ -393,7 +394,7 @@ void goto_symext::symex_cpp_new(
if(code.get(ID_statement)==ID_cpp_new_array ||
code.get(ID_statement)==ID_cpp_new)
symbol.mode=ID_cpp;
else if(code.get(ID_statement)==ID_java_new_array)
else if(code.get(ID_statement) == ID_java_new_array_data)
symbol.mode=ID_java;
else
INVARIANT_WITH_IREP(false, "Unexpected side effect expression", code);
Expand Down
1 change: 1 addition & 0 deletions src/util/irep_ids.def
Original file line number Diff line number Diff line change
Expand Up @@ -217,6 +217,7 @@ IREP_ID_TWO(cpp_new_array, cpp_new[])
IREP_ID_TWO(cpp_delete_array, cpp_delete[])
IREP_ID_ONE(java_new)
IREP_ID_ONE(java_new_array)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is java_new_array still needed?

Copy link
Contributor

@smowton smowton Oct 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously ID_java_new_array was being used to denote both the new-array-structure operation (yields something like {int length; void** data;}*) and the new-plain-array op (which yields a void**, like a C++ new[]). It was getting away with it because of the order builtin_functions.cpp looked for the different cases, but it was very fragile in case that changed. This way the two ops have distinct names (new_array and new_array_data respectively) and it's clear which one is meant.

IREP_ID_ONE(java_new_array_data)
IREP_ID_ONE(java_string_literal)
IREP_ID_ONE(printf)
IREP_ID_ONE(input)
Expand Down
1 change: 1 addition & 0 deletions unit/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ SRC += unit_tests.cpp \
analyses/does_remove_const/does_expr_lose_const.cpp \
analyses/does_remove_const/does_type_preserve_const_correctness.cpp \
analyses/does_remove_const/is_type_at_least_as_const_as.cpp \
goto-programs/goto_trace_output.cpp \
java_bytecode/java_bytecode_convert_class/convert_abstract_class.cpp \
java_bytecode/java_bytecode_parse_generics/parse_generic_class.cpp \
java_bytecode/java_object_factory/gen_nondet_string_init.cpp \
Expand Down
26 changes: 26 additions & 0 deletions unit/goto-programs/goto_trace_output.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
/*******************************************************************\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realised - this still needs to be added to the Makefile as cbmc is using Make and CMake in parallel.


Module: Unit tests for goto_trace_stept::output

Author: Diffblue Limited. All rights reserved.

\*******************************************************************/

#include <testing-utils/catch.hpp>
#include <goto-programs/goto_program_template.h>
#include <goto-programs/goto_trace.h>
#include <iostream>

SCENARIO(
"Output trace with nil lhs object",
"[core][goto-programs][goto_trace]")
{
symbol_tablet symbol_table;
namespacet ns(symbol_table);
goto_programt::instructionst instructions;
instructions.emplace_back(goto_program_instruction_typet::OTHER);
goto_trace_stept step;
step.pc = instructions.begin();
step.type = goto_trace_stept::typet::ATOMIC_BEGIN;
step.output(ns, std::cout);
}