-
Notifications
You must be signed in to change notification settings - Fork 273
Improve cprover_string_literal argument handling TG-1245 #1691
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve cprover_string_literal argument handling TG-1245 #1691
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if has_string_subtype
is doing what you want to, could you clarify the intent, perhaps by giving some examples of types you expect to pass and fail has_string_subtype
's test?
} | ||
} | ||
|
||
for(const auto &t : type.subtypes()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't mean subtype in the Java sense, but rather the arguments of a type in the algebraic datatype sense -- for example, the "subtypes" of a struct are the types of its components (therefore the struct special case above is redundant), the "subtype" of a pointer is the type it points to... and so on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
therefore the struct special case above is redundant
I'm not sure it is: in a struct the subtypes appear in the components, which is a named_sub
for the irept
and not in sub
which is what subtypes
returns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added to the documentation the precision that we mean subtype in the algebraic sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah so it is. No problem then.
887aa92
to
0f0cf97
Compare
/// \param res: string expression for the result | ||
/// \param arg: expression of type string typet | ||
/// \param guard: condition under which `res` should be equal to arg | ||
/// \return 0 if constraints were added, 1 if expression could not be handled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is with the return value of if_exprt
? This reads as if only 1 or 0 can be returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, in case of if_expr we return the maximum of the true case or false case, so it will always be 0 or 1
{ | ||
if(has_string_subtype(t, ns)) | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would for the above two loops using std::any_of
be easier to understand?
c595a2d
to
781ecdc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions for reducing code duplication, but I think the important thing is there should be some tests.
@@ -312,7 +312,11 @@ bool is_char_pointer_type(const typet &type) | |||
|
|||
/// \param type: a type | |||
/// \param ns: name space | |||
/// \return true if a subtype is an pointer of characters | |||
/// \return true if a subtype of `type` is an pointer of characters. | |||
/// The meaning of "subtype" is in the algebraic datatype sense: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be being dense, but I don't completely get this, perhaps an example (e.g. a pointer to a char or struct with a char component)
@@ -336,6 +340,39 @@ static bool has_char_pointer_subtype(const typet &type, const namespacet &ns) | |||
return false; | |||
} | |||
|
|||
/// \param type: a type | |||
/// \param ns: name space |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: no space between name and space
/// etc... | ||
static bool has_string_subtype(const typet &type, const namespacet &ns) | ||
{ | ||
if(type == string_typet()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you pull the common code between this method and has_char_pointer_subtype
into a separate method (has_type_have_subtype(const typet & type, const namespacet &ns, const typet &required_subtype)
)
/// for example, the subtypes of a struct are the types of its | ||
/// components, the subtype of a pointer is the type it points to, | ||
/// etc... | ||
static bool has_string_subtype(const typet &type, const namespacet &ns) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to add a couple of unit tests for this if you don't mind.
} | ||
else if(const auto constant_expr = expr_try_dynamic_cast<constant_exprt>(arg)) | ||
return add_axioms_for_constant(res, constant_expr->get_value(), guard); | ||
else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment explaining why in this case a 1
expression is a reasonable return might be good.
const exprt &arg = args[2]; | ||
irep_idt sval=to_constant_expr(arg).get_value(); | ||
return add_axioms_for_constant(res, sval); | ||
return add_axioms_for_cprover_string(res, args[2], true_exprt()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to see a test for whatever case is now supported. I can't say the changes in this file mean much to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There will be tests in other repositories: https://github.com/diffblue/test-gen/pull/1327
But it is difficule to test on CBMC because we don't have a model for Class and the main affected method is Object.getClass.
An example that should work afterwards is the following:
public class testMaybeNull1
{
public Object x;
public String check()
{
return x.getClass().getName();
}
}
The reason is that getClass converts a cprover_string to a java string (the @class_identifier
), but since the object could be null, the argument passed to cprover_string_literal
is more complicated than a constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it not be sufficent to have a:
public class simpleClaz {
String getStr() { return "hello"; }
}
/////////////
public class testMaybeNull1
{
public simpleClaz x;
public String check()
{
return x.getStr();
}
}
Since x may or may not be null isn't it the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No this example is already handled correctly. It is the @class_identifier
field that is a problem, and as far as I now it can only be access via instanceof
and getClass
. And I think getClass
is the only function which generate a cprover_string_literal
with something else than a constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it is difficule to test on CBMC because we don't have a model for Class and the main affected method is Object.getClass.
Getting 'java core models' into the cbmc repo would help. @cesaro has started some work on that.
if(type == string_typet()) | ||
return true; | ||
|
||
if(type.id() == ID_struct || type.id() == ID_union) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment got hidden but I think still applies:
Can you pull the common code between this method and has_char_pointer_subtype
into a separate method (has_type_have_subtype(const typet & type, const namespacet &ns, const typet &required_subtype)
)
2d29ed2
to
d5a7b70
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - looks good, just a couple of very small nits but no need for re-review
return true; | ||
|
||
// clang-format off |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have been so far using clang-formats style where it disagrees with the linter (And using // NOLINTNEXTLINE
)
/// for example, the subtypes of a struct are the types of its | ||
/// components, the subtype of a pointer is the type it points to, | ||
/// etc... | ||
/// For instance in the type `t` defined by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks 👍
// clang-format off | ||
return has_subtype( | ||
type, | ||
[](const typet &subtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think you can just pass in is_char_pointer_type
rather than wrapping in predicate
return has_subtype(type, is_char_pointer_type);
060a2f8
to
d71d5dd
Compare
This allows to add a condition under which we want the result to be equal to the given constant. By default this is true, meaning that the result should always equal the constant.
cprover_string_literal was requiring a string constant as argument, but we now deal with if expressions as well. So expressions like: `cprover_string_literal_func!0#0(cprover_string_length$3!0@1#2, nondet_infinite_array$5!0@1, this!0@1#1 == &tmp_object_factory$17!0 ? tmp_object_factory$17!0#4.@class_identifier : this$object#0.@class_identifier)` are correctly handled by the string solver.
We take into account expressions of type string_typet in the symbol_resolve object, so that these values can be propagated. This can be particularly usefull for getting class identifiers as strings, since these are represented as expression type string_typet in the goto program. This also adds a has_subtype function with predicate as arg to generalize what has_char_pointer_subtype and has_string_subtype are doing.
We reuse has_subtype to simplify the function. Also get rid of the namespace argument that was not used.
d4bf503
to
57f0999
Compare
@thk123 @mgudemann @smowton the version you reviewed was causing some performance problem on some tests (on a private repository), so I improved the code to only consider string equations that are actually used in a function. Can you have a look at the 3 last commits? @peterschrammel can you have a look at this PR? |
const exprt &arg = args[2]; | ||
irep_idt sval=to_constant_expr(arg).get_value(); | ||
return add_axioms_for_constant(res, sval); | ||
return add_axioms_for_cprover_string(res, args[2], true_exprt()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it is difficule to test on CBMC because we don't have a model for Class and the main affected method is Object.getClass.
Getting 'java core models' into the cbmc repo would help. @cesaro has started some work on that.
class equation_symbol_mappingt | ||
{ | ||
public: | ||
// Record index of the equations that contain a given expressions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expression
} | ||
|
||
/// Symbol resolution for expressions of type string typet. | ||
/// We record equality between this expressions in the output if one of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these
|
||
/// Symbol resolution for expressions of type string typet. | ||
/// We record equality between this expressions in the output if one of the | ||
/// function calls depend on them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
depends
} | ||
} | ||
|
||
// transitively add all equation which depends on the equations to treat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
equations depend
class equation_symbol_mappingt | ||
{ | ||
public: | ||
// Record index of the equations that contain a given expressions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expression
@@ -116,4 +116,9 @@ bool has_subtype( | |||
const typet &type, | |||
const std::function<bool(const typet &)> &pred); | |||
|
|||
union_find_replacet string_identifiers_resolution_from_equations( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment that this is required to unit-test this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of nitpicks from the new commits
strings_in_equation[i].push_back(expr); | ||
} | ||
|
||
std::vector<exprt> find_expressions(const std::size_t &i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Little point taking a value type by const ref
{ | ||
const std::vector<exprt> strings_in_comp = extract_strings_from_lhs( | ||
member_exprt(lhs, comp.get_name(), comp.type())); | ||
for(const auto s : strings_in_comp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
result.insert(result.end(), strings_in_comp.begin(), strings_in_comp.end())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed final 3 commits - looks reasonable
@@ -479,12 +481,118 @@ static union_find_replacet generate_symbol_resolution_from_equations( | |||
return solver; | |||
} | |||
|
|||
/// Symbol resolution for expressions of type string typet | |||
class equation_symbol_mappingt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be great to add a doxygen summary of what this mapping is for
return result; | ||
} | ||
|
||
/// \param eq: equation to add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing summary comment
{ | ||
const std::size_t i = equations_to_treat.top(); | ||
equations_to_treat.pop(); | ||
for(const exprt &string : equation_map.find_expressions(i)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit since these for loops spread over multiple lines please add braces for clarity
@@ -620,6 +752,12 @@ decision_proceduret::resultt string_refinementt::dec_solve() | |||
|
|||
const union_find_replacet string_id_symbol_resolve = | |||
string_identifiers_resolution_from_equations(equations, ns, debug()); | |||
#ifdef DEBUG | |||
debug() << "symbol resolve string:" << eom; | |||
for(const auto &pair : string_id_symbol_resolve.to_vector()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since body of loop over multiple lines please add braces for clarity
We number equations to make it easier to debug function manipulating equations, in particular string_identifiers_resolution_from_equations. We also output symbol resolution information for strings
For cprover strings, adding all the equations to the union_find_replace structure could sometimes become counter performant. Instead here we look at what string expression are need for the function calls and in which equations they appear.
d45677b
to
23c3561
Compare
d190fd8 Merge remote-tracking branch 'upstream/develop' into pull-support-20180112 5bd5962 Merge pull request diffblue#1667 from romainbrenguier/feature/type_cast 8ecb55a Merge pull request diffblue#1717 from smowton/smowton/feature/remove_virtual_functions_per_function 0d7310a Merge pull request diffblue#1691 from romainbrenguier/bugfix/getClass#TG-1245 2064849 Use type_checked_cast in boolbv_width c5fc351 Validate data in pointer_typet in to_pointer_typet 35905c6 Add can_cast_type, validate_type for pointer_typet b72bacf Define type_try_dynamic_cast and type_checked_cast 23c3561 Unit test for string symbol resolution c13e602 Adding only needed equations in symbol resolution ae4deff Debug information for string equations 912828d JBMC: Run remove-virtual-functions as each function is converted a711c64 Introduce mechanism for renumbering an individual GOTO program e308a32 Merge pull request diffblue#1679 from NlightNFotis/nondet_extra_test e6ceb91 Merge pull request diffblue#1724 from tautschnig/fix-visitor ea74bed Add extra test for nondet-static flag and arrays 4f74896 Use irept API, not implementation-level API c55b4a5 Merge pull request diffblue#1682 from martin-cs/fix/dependence-graph-namespace-lifespan 1a2c14b Merge pull request diffblue#1722 from diffblue/unsafe_type_constructors 957a568 Merge pull request diffblue#1677 from NlightNFotis/pb4_develop 9c5add4 remove deprecated constructors for three bitvector types c96e02a no longer use deprecated constructors for some bitvector types 954060e Add unit test for has_subtype 3dd3877 Refactor has_char_pointer_subtype with has_subtype 4699c13 Extend symbol resolution to string_typet 74144fc Handle if_exprt in add_axioms_for_string_literal c6c1b3f Add an optional guard to add_axioms_for_constant 933d635 Merge pull request diffblue#1716 from mgudemann/fix/null_check_for_java_instanceof 1659314 Merge pull request diffblue#1715 from smowton/smowton/cleanup/jbmc_unused_passes 9c457b7 Add regression test for null instanceof. 2080cd3 Complete instanceof for Java. d4300d0 Merge pull request diffblue#1697 from diffblue/nondet_symbol_exprt 1c68dd4 Merge pull request diffblue#1714 from tautschnig/c-library-strcat 44b5bae Merge pull request diffblue#1698 from thomasspriggs/tg1633 c4304ba JBMC: Remove C-only passes bb8cfaa C library: Check upper bounds in memset, memcpy, memmove 7d4984f C library: Implement strcat, strncat 2a5cea2 This introduces nondet_symbol_exprt, which is generated by symbolic execution in response to side_effect_expr_nondett 85193a0 Merge pull request diffblue#1694 from NathanJPhillips/feature/add-raw-lhs-to-trace d9122dc Merge pull request diffblue#1710 from NathanJPhillips/feature/remove_instanceof_per_function 092df69 Switch from custom file / path routines to Boost-filesystem c8821b2 Allow to remove instanceof when remove exceptions 94b7658 Don't pass iterators into function calls a9c4e4f Added regression tests 76318ce Protect extended trace behind a command line option 69b0ff1 Added base_name in comments for all symbols e86080a Add raw LHS irep field to trace output ddd1b7a Add remove_instanceof overload to remove from a particular instruction 1c227b7 Merge pull request diffblue#1660 from smowton/smowton/fix/lazy_methods_array_parameters ae89c94 Lazy loading: assume concrete generic parameter types are needed 80eb6a6 TG-1877: Include array pointer types in needed classes 1053e5f Fix for [TG-1633] Inner generic types have incorrect type identifier e2cda1a Merge pull request diffblue#1704 from tautschnig/fix-copy-paste ef4a65e Fix op1/op0 copy&paste typo 21ea31f Merge pull request diffblue#1702 from peterschrammel/goto-diff-java c4bc953 Merge pull request diffblue#1701 from peterschrammel/allow-instrument-jdk 2811363 Java regression test for goto-diff 43d2e09 Also reset fresh temporary symbol counter 9ef28f4 Compare relative goto target offsets eaf3a7d Get source location from symbol table ab59659 Allow instrumentation of java.* and org.cprover.* 6fbd59c Merge pull request diffblue#1631 from tautschnig/fix-pointer-minus 7c04b5c Merge pull request diffblue#1699 from NathanJPhillips/feature/reset-main-in-tests 5e0f186 Pointer difference over void* is difference over char* faf8f00 Merge commit 'a83b52cddbed22304372c276512c63701eb3aedb' into pull-support-20180104 8236db4 Merge pull request diffblue#1419 from peterschrammel/refactor/cover-instrument a580e27 Merge pull request diffblue#1689 from smowton/smowton/feature/get_this 591511a Allow callers of load_java_class to pass the name of the main function 1b86b27 Merge pull request diffblue#1687 from smowton/smowton/feature/class-hierarchy-dot fd2bf6a Merge pull request diffblue#1688 from smowton/smowton/feature/parameter_indices f570ce5 Merge pull request diffblue#1696 from smowton/smowton/fix/identical_struct_equality 61b0d6d Merge pull request diffblue#1666 from mgudemann/bugfix/removed_required_virtual_calls 3365054 Add regression test 2b6dc8b Resolve concrete function call if no implementation is found 3f1fd64 Add code_typet::get_parameter_indices 42cf61a Fix testing for empty line in test desc file 2090000 Fix missing newline at end of desc file e448d5f Fix unsatisfiable test line f7f033d String smoke tests: ensure no type mismatches are seen b627c3d Replace unsound struct-cast simplification 8fa42b3 Class hierarchy: add DOT output, unit tests 04f2faf Mark GOTO instructions with unresolved virtual calls aac181f Pass command line options via optionst b6fa3e8 Factorize source location initialization 8da5395 Document cover functions a7f0c3d Introduce cover instrumenter 873627a Split cover into several files 0fc08f3 Replace cover-function-only by cover-include-pattern 1f2102c Add code_typet::get_this 2801f0f Avoid crashing when --dependence-graph is used by correcting namespace scoping. acac776 Add a test for the same-named static functions crashing dependence graph in the goto-analyser 05f46a9 Fix the problem where two static functions with the same name would cause the dependency graph to fail. git-subtree-dir: cbmc git-subtree-split: d190fd8
cprover_string_literal was requiring a string constant as argument,
but we now deal with if expressions as well.
This also improve solver handling of expressions of type
string_typet
by recording the symbols in the symbol_resolve map.