Skip to content

Improve cprover_string_literal argument handling TG-1245 #1691

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

romainbrenguier
Copy link
Contributor

cprover_string_literal was requiring a string constant as argument,
but we now deal with if expressions as well.
This also improve solver handling of expressions of type string_typet by recording the symbols in the symbol_resolve map.

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if has_string_subtype is doing what you want to, could you clarify the intent, perhaps by giving some examples of types you expect to pass and fail has_string_subtype's test?

}
}

for(const auto &t : type.subtypes())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't mean subtype in the Java sense, but rather the arguments of a type in the algebraic datatype sense -- for example, the "subtypes" of a struct are the types of its components (therefore the struct special case above is redundant), the "subtype" of a pointer is the type it points to... and so on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smowton

therefore the struct special case above is redundant

I'm not sure it is: in a struct the subtypes appear in the components, which is a named_sub for the irept and not in sub which is what subtypes returns

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added to the documentation the precision that we mean subtype in the algebraic sense

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah so it is. No problem then.

@romainbrenguier romainbrenguier force-pushed the bugfix/getClass#TG-1245 branch 2 times, most recently from 887aa92 to 0f0cf97 Compare January 3, 2018 14:01
/// \param res: string expression for the result
/// \param arg: expression of type string typet
/// \param guard: condition under which `res` should be equal to arg
/// \return 0 if constraints were added, 1 if expression could not be handled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is with the return value of if_exprt ? This reads as if only 1 or 0 can be returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in case of if_expr we return the maximum of the true case or false case, so it will always be 0 or 1

{
if(has_string_subtype(t, ns))
return true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would for the above two loops using std::any_of be easier to understand?

@romainbrenguier romainbrenguier force-pushed the bugfix/getClass#TG-1245 branch 2 times, most recently from c595a2d to 781ecdc Compare January 4, 2018 15:53
Copy link
Contributor

@thk123 thk123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions for reducing code duplication, but I think the important thing is there should be some tests.

@@ -312,7 +312,11 @@ bool is_char_pointer_type(const typet &type)

/// \param type: a type
/// \param ns: name space
/// \return true if a subtype is an pointer of characters
/// \return true if a subtype of `type` is an pointer of characters.
/// The meaning of "subtype" is in the algebraic datatype sense:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be being dense, but I don't completely get this, perhaps an example (e.g. a pointer to a char or struct with a char component)

@@ -336,6 +340,39 @@ static bool has_char_pointer_subtype(const typet &type, const namespacet &ns)
return false;
}

/// \param type: a type
/// \param ns: name space
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: no space between name and space

/// etc...
static bool has_string_subtype(const typet &type, const namespacet &ns)
{
if(type == string_typet())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you pull the common code between this method and has_char_pointer_subtype into a separate method (has_type_have_subtype(const typet & type, const namespacet &ns, const typet &required_subtype))

/// for example, the subtypes of a struct are the types of its
/// components, the subtype of a pointer is the type it points to,
/// etc...
static bool has_string_subtype(const typet &type, const namespacet &ns)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to add a couple of unit tests for this if you don't mind.

}
else if(const auto constant_expr = expr_try_dynamic_cast<constant_exprt>(arg))
return add_axioms_for_constant(res, constant_expr->get_value(), guard);
else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment explaining why in this case a 1 expression is a reasonable return might be good.

const exprt &arg = args[2];
irep_idt sval=to_constant_expr(arg).get_value();
return add_axioms_for_constant(res, sval);
return add_axioms_for_cprover_string(res, args[2], true_exprt());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to see a test for whatever case is now supported. I can't say the changes in this file mean much to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There will be tests in other repositories: https://github.com/diffblue/test-gen/pull/1327
But it is difficule to test on CBMC because we don't have a model for Class and the main affected method is Object.getClass.

An example that should work afterwards is the following:

public class testMaybeNull1
{
    public Object x;
    public String check()
    {
        return x.getClass().getName();
    }
}

The reason is that getClass converts a cprover_string to a java string (the @class_identifier), but since the object could be null, the argument passed to cprover_string_literal is more complicated than a constant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not be sufficent to have a:

public class simpleClaz {
  String getStr() { return "hello"; }
}
/////////////
public class testMaybeNull1
{
    public simpleClaz x;
    public String check()
    {
        return x.getStr();
    }
}

Since x may or may not be null isn't it the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No this example is already handled correctly. It is the @class_identifier field that is a problem, and as far as I now it can only be access via instanceof and getClass. And I think getClass is the only function which generate a cprover_string_literal with something else than a constant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is difficule to test on CBMC because we don't have a model for Class and the main affected method is Object.getClass.

Getting 'java core models' into the cbmc repo would help. @cesaro has started some work on that.

if(type == string_typet())
return true;

if(type.id() == ID_struct || type.id() == ID_union)
Copy link
Contributor

@thk123 thk123 Jan 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment got hidden but I think still applies:

Can you pull the common code between this method and has_char_pointer_subtype into a separate method (has_type_have_subtype(const typet & type, const namespacet &ns, const typet &required_subtype))

@romainbrenguier romainbrenguier force-pushed the bugfix/getClass#TG-1245 branch 5 times, most recently from 2d29ed2 to d5a7b70 Compare January 8, 2018 10:16
Copy link
Contributor

@thk123 thk123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - looks good, just a couple of very small nits but no need for re-review

return true;

// clang-format off
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have been so far using clang-formats style where it disagrees with the linter (And using // NOLINTNEXTLINE)

/// for example, the subtypes of a struct are the types of its
/// components, the subtype of a pointer is the type it points to,
/// etc...
/// For instance in the type `t` defined by
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 👍

// clang-format off
return has_subtype(
type,
[](const typet &subtype)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think you can just pass in is_char_pointer_type rather than wrapping in predicate

return has_subtype(type, is_char_pointer_type);

@romainbrenguier romainbrenguier force-pushed the bugfix/getClass#TG-1245 branch 4 times, most recently from 060a2f8 to d71d5dd Compare January 10, 2018 09:42
This allows to add a condition under which we want the result to be
equal to the given constant.
By default this is true, meaning that the result should always equal the constant.
cprover_string_literal was requiring a string constant as argument,
but we now deal with if expressions as well.
So expressions like:
`cprover_string_literal_func!0#0(cprover_string_length$3!0@1#2,
nondet_infinite_array$5!0@1, this!0@1#1 == &tmp_object_factory$17!0 ?
tmp_object_factory$17!0#4.@class_identifier :
this$object#0.@class_identifier)`
are correctly handled by the string solver.
We take into account expressions of type string_typet in the
symbol_resolve object, so that these values can be propagated.
This can be particularly usefull for getting class identifiers as
strings, since these are represented as expression type string_typet in
the goto program.

This also adds a has_subtype function with predicate as arg to generalize what has_char_pointer_subtype and
has_string_subtype are doing.
We reuse has_subtype to simplify the function.
Also get rid of the namespace argument that was not used.
@romainbrenguier romainbrenguier force-pushed the bugfix/getClass#TG-1245 branch 5 times, most recently from d4bf503 to 57f0999 Compare January 11, 2018 13:25
@romainbrenguier
Copy link
Contributor Author

@thk123 @mgudemann @smowton the version you reviewed was causing some performance problem on some tests (on a private repository), so I improved the code to only consider string equations that are actually used in a function. Can you have a look at the 3 last commits?

@peterschrammel can you have a look at this PR?

const exprt &arg = args[2];
irep_idt sval=to_constant_expr(arg).get_value();
return add_axioms_for_constant(res, sval);
return add_axioms_for_cprover_string(res, args[2], true_exprt());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it is difficule to test on CBMC because we don't have a model for Class and the main affected method is Object.getClass.

Getting 'java core models' into the cbmc repo would help. @cesaro has started some work on that.

class equation_symbol_mappingt
{
public:
// Record index of the equations that contain a given expressions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expression

}

/// Symbol resolution for expressions of type string typet.
/// We record equality between this expressions in the output if one of the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these


/// Symbol resolution for expressions of type string typet.
/// We record equality between this expressions in the output if one of the
/// function calls depend on them.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depends

}
}

// transitively add all equation which depends on the equations to treat
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

equations depend

class equation_symbol_mappingt
{
public:
// Record index of the equations that contain a given expressions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expression

@@ -116,4 +116,9 @@ bool has_subtype(
const typet &type,
const std::function<bool(const typet &)> &pred);

union_find_replacet string_identifiers_resolution_from_equations(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment that this is required to unit-test this function.

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of nitpicks from the new commits

strings_in_equation[i].push_back(expr);
}

std::vector<exprt> find_expressions(const std::size_t &i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little point taking a value type by const ref

{
const std::vector<exprt> strings_in_comp = extract_strings_from_lhs(
member_exprt(lhs, comp.get_name(), comp.type()));
for(const auto s : strings_in_comp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result.insert(result.end(), strings_in_comp.begin(), strings_in_comp.end())

Copy link
Contributor

@thk123 thk123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed final 3 commits - looks reasonable

@@ -479,12 +481,118 @@ static union_find_replacet generate_symbol_resolution_from_equations(
return solver;
}

/// Symbol resolution for expressions of type string typet
class equation_symbol_mappingt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be great to add a doxygen summary of what this mapping is for

return result;
}

/// \param eq: equation to add
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing summary comment

{
const std::size_t i = equations_to_treat.top();
equations_to_treat.pop();
for(const exprt &string : equation_map.find_expressions(i))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit since these for loops spread over multiple lines please add braces for clarity

@@ -620,6 +752,12 @@ decision_proceduret::resultt string_refinementt::dec_solve()

const union_find_replacet string_id_symbol_resolve =
string_identifiers_resolution_from_equations(equations, ns, debug());
#ifdef DEBUG
debug() << "symbol resolve string:" << eom;
for(const auto &pair : string_id_symbol_resolve.to_vector())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since body of loop over multiple lines please add braces for clarity

We number equations to make it easier to debug function manipulating
equations, in particular string_identifiers_resolution_from_equations.
We also output symbol resolution information for strings
For cprover strings, adding all the equations to the union_find_replace
structure could sometimes become counter performant.
Instead here we look at what string expression are need for the function
calls and in which equations they appear.
@romainbrenguier romainbrenguier force-pushed the bugfix/getClass#TG-1245 branch from d45677b to 23c3561 Compare January 11, 2018 20:48
@romainbrenguier romainbrenguier merged commit 0d7310a into diffblue:develop Jan 12, 2018
smowton added a commit to smowton/cbmc that referenced this pull request May 9, 2018
d190fd8 Merge remote-tracking branch 'upstream/develop' into pull-support-20180112
5bd5962 Merge pull request diffblue#1667 from romainbrenguier/feature/type_cast
8ecb55a Merge pull request diffblue#1717 from smowton/smowton/feature/remove_virtual_functions_per_function
0d7310a Merge pull request diffblue#1691 from romainbrenguier/bugfix/getClass#TG-1245
2064849 Use type_checked_cast in boolbv_width
c5fc351 Validate data in pointer_typet in to_pointer_typet
35905c6 Add can_cast_type, validate_type for pointer_typet
b72bacf Define type_try_dynamic_cast and type_checked_cast
23c3561 Unit test for string symbol resolution
c13e602 Adding only needed equations in symbol resolution
ae4deff Debug information for string equations
912828d JBMC: Run remove-virtual-functions as each function is converted
a711c64 Introduce mechanism for renumbering an individual GOTO program
e308a32 Merge pull request diffblue#1679 from NlightNFotis/nondet_extra_test
e6ceb91 Merge pull request diffblue#1724 from tautschnig/fix-visitor
ea74bed Add extra test for nondet-static flag and arrays
4f74896 Use irept API, not implementation-level API
c55b4a5 Merge pull request diffblue#1682 from martin-cs/fix/dependence-graph-namespace-lifespan
1a2c14b Merge pull request diffblue#1722 from diffblue/unsafe_type_constructors
957a568 Merge pull request diffblue#1677 from NlightNFotis/pb4_develop
9c5add4 remove deprecated constructors for three bitvector types
c96e02a no longer use deprecated constructors for some bitvector types
954060e Add unit test for has_subtype
3dd3877 Refactor has_char_pointer_subtype with has_subtype
4699c13 Extend symbol resolution to string_typet
74144fc Handle if_exprt in add_axioms_for_string_literal
c6c1b3f Add an optional guard to add_axioms_for_constant
933d635 Merge pull request diffblue#1716 from mgudemann/fix/null_check_for_java_instanceof
1659314 Merge pull request diffblue#1715 from smowton/smowton/cleanup/jbmc_unused_passes
9c457b7 Add regression test for null instanceof.
2080cd3 Complete instanceof for Java.
d4300d0 Merge pull request diffblue#1697 from diffblue/nondet_symbol_exprt
1c68dd4 Merge pull request diffblue#1714 from tautschnig/c-library-strcat
44b5bae Merge pull request diffblue#1698 from thomasspriggs/tg1633
c4304ba JBMC: Remove C-only passes
bb8cfaa C library: Check upper bounds in memset, memcpy, memmove
7d4984f C library: Implement strcat, strncat
2a5cea2 This introduces nondet_symbol_exprt, which is generated by symbolic execution in response to side_effect_expr_nondett
85193a0 Merge pull request diffblue#1694 from NathanJPhillips/feature/add-raw-lhs-to-trace
d9122dc Merge pull request diffblue#1710 from NathanJPhillips/feature/remove_instanceof_per_function
092df69 Switch from custom file / path routines to Boost-filesystem
c8821b2 Allow to remove instanceof when remove exceptions
94b7658 Don't pass iterators into function calls
a9c4e4f Added regression tests
76318ce Protect extended trace behind a command line option
69b0ff1 Added base_name in comments for all symbols
e86080a Add raw LHS irep field to trace output
ddd1b7a Add remove_instanceof overload to remove from a particular instruction
1c227b7 Merge pull request diffblue#1660 from smowton/smowton/fix/lazy_methods_array_parameters
ae89c94 Lazy loading: assume concrete generic parameter types are needed
80eb6a6 TG-1877: Include array pointer types in needed classes
1053e5f Fix for [TG-1633] Inner generic types have incorrect type identifier
e2cda1a Merge pull request diffblue#1704 from tautschnig/fix-copy-paste
ef4a65e Fix op1/op0 copy&paste typo
21ea31f Merge pull request diffblue#1702 from peterschrammel/goto-diff-java
c4bc953 Merge pull request diffblue#1701 from peterschrammel/allow-instrument-jdk
2811363 Java regression test for goto-diff
43d2e09 Also reset fresh temporary symbol counter
9ef28f4 Compare relative goto target offsets
eaf3a7d Get source location from symbol table
ab59659 Allow instrumentation of java.* and org.cprover.*
6fbd59c Merge pull request diffblue#1631 from tautschnig/fix-pointer-minus
7c04b5c Merge pull request diffblue#1699 from NathanJPhillips/feature/reset-main-in-tests
5e0f186 Pointer difference over void* is difference over char*
faf8f00 Merge commit 'a83b52cddbed22304372c276512c63701eb3aedb' into pull-support-20180104
8236db4 Merge pull request diffblue#1419 from peterschrammel/refactor/cover-instrument
a580e27 Merge pull request diffblue#1689 from smowton/smowton/feature/get_this
591511a Allow callers of load_java_class to pass the name of the main function
1b86b27 Merge pull request diffblue#1687 from smowton/smowton/feature/class-hierarchy-dot
fd2bf6a Merge pull request diffblue#1688 from smowton/smowton/feature/parameter_indices
f570ce5 Merge pull request diffblue#1696 from smowton/smowton/fix/identical_struct_equality
61b0d6d Merge pull request diffblue#1666 from mgudemann/bugfix/removed_required_virtual_calls
3365054 Add regression test
2b6dc8b Resolve concrete function call if no implementation is found
3f1fd64 Add code_typet::get_parameter_indices
42cf61a Fix testing for empty line in test desc file
2090000 Fix missing newline at end of desc file
e448d5f Fix unsatisfiable test line
f7f033d String smoke tests: ensure no type mismatches are seen
b627c3d Replace unsound struct-cast simplification
8fa42b3 Class hierarchy: add DOT output, unit tests
04f2faf Mark GOTO instructions with unresolved virtual calls
aac181f Pass command line options via optionst
b6fa3e8 Factorize source location initialization
8da5395 Document cover functions
a7f0c3d Introduce cover instrumenter
873627a Split cover into several files
0fc08f3 Replace cover-function-only by cover-include-pattern
1f2102c Add code_typet::get_this
2801f0f Avoid crashing when --dependence-graph is used by correcting namespace scoping.
acac776 Add a test for the same-named static functions crashing dependence graph in the goto-analyser
05f46a9 Fix the problem where two static functions with the same name would cause the dependency graph to fail.

git-subtree-dir: cbmc
git-subtree-split: d190fd8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants