-
Notifications
You must be signed in to change notification settings - Fork 273
TG-58 Complete rework of string solver to avoid using infinite arrays #1420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TG-58 Complete rework of string solver to avoid using infinite arrays #1420
Conversation
7148616
to
59fc1bd
Compare
6371f41
to
aca00ce
Compare
330d8da
to
4e6fd78
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed about a quarter of this. It's just too big. Don't have time to leaf through the rest; suggest finding another reviewer to pick up where I left off, or find a way to split the huge 3000-line commit in the middle of this into stages so each can be reviewed individually.
{ | ||
StringBuilder sb=new StringBuilder(""); | ||
for(int i=0; i<arr.length; i++) | ||
sb.append(arr[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bad indentation in this function
char arr[]=new char[s.length()]; | ||
// We limit arbitrarly the loop unfolding to 5 | ||
for(int i=0; i<length && i<5; i++) | ||
arr[i]=s.charAt(i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent
@@ -9,8 +9,10 @@ public static String main() | |||
{ | |||
s.append((String) t[0]); | |||
} | |||
assert(s.toString().equals("Hello world!")); | |||
assert(!s.toString().equals("Hello world!")); | |||
if(i==0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indents
String s = sb.toString(); | ||
System.out.println(s); | ||
assert(s.equals("abcd")); | ||
if(i==0) | ||
assert(s.equals("abcd")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indents
// We limit arbitrarly the loop unfolding to 10 | ||
for(int i=0; i<length && i<10; i++) | ||
arr[i]=s.charAt(i); | ||
return arr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent
src/util/string_expr.h
Outdated
operands().resize(2); | ||
} | ||
|
||
refined_string_exprt(const exprt &_length, const exprt &_content, typet type): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const typet &
?
@@ -234,13 +234,13 @@ exprt string_constraint_generatort::associate_array_to_pointer( | |||
const auto &length=array_expr.length(); | |||
if(length==infinity_exprt(length.type())) | |||
{ | |||
auto ret=length_of_array_.insert( | |||
length_of_array_.insert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're expecting this to be a fresh insert, check insert
's return value
std::make_pair(array_expr, fresh_symbol("string_length", length.type()))); | ||
array_expr.length()=length_of_array_[array_expr]; | ||
} | ||
|
||
// TODO should use a function for that | ||
auto ret=arrays_of_pointers_.insert(std::make_pair(pointer_expr, array_expr)); | ||
arrays_of_pointers_.insert(std::make_pair(pointer_expr, array_expr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, check insert's return value
@@ -545,11 +545,12 @@ codet initialize_nondet_string_struct( | |||
code_blockt code; | |||
|
|||
// `obj` is `*expr` | |||
// const dereference_exprt obj(expr, expr.type().subtype()); | |||
const irep_idt &class_id= | |||
const struct_typet struct_type= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always follow; take return by const ref
@@ -33,6 +33,20 @@ TEST_CASE("Generate string object") | |||
bool failed=symbol_table.add(jlo_sym); | |||
CHECK_RETURN(!failed); | |||
|
|||
|
|||
// Add java.lang.Object | |||
symbolt new_symbol; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't need to do this -- jlo is always in the class loader's queue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure we have a class loader here (did you see we are in a unit test?)
e63e2e9
to
2bf15cd
Compare
1873cf3
to
e34064f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass, I need to better understand more of the string refinement changes to comment on structure.
@@ -26,10 +26,34 @@ Author: Alberto Griggio, [email protected] | |||
#include <solvers/sat/satcheck.h> | |||
#include <solvers/refinement/string_constraint_instantiation.h> | |||
#include <java_bytecode/java_types.h> | |||
#include <util/optional.h> | |||
|
|||
#include <util/ssa_expr.h> // for char array pointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both of these appear to be unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, there clearly is a usage of optional in the file for example. Question is whether we should start removing headers if they were already included in parent file. Or make some sort of stdafx.h
precompiled header file, which would include most commonly used headers (STL utilities, containers, strings, etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went to go look in CODING_STANDARD.md
. It's not completely clear on this type of case, but it does say avoid unnecessary #includes
, which they are in the literal sense.
exprt union_find_replacet::add_symbol(const exprt &lhs, const exprt &rhs) | ||
{ | ||
const exprt lhs_root=find_expr(lhs); | ||
const exprt rhs_root=find_expr(rhs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const &
for both
return rhs_root; | ||
} | ||
|
||
static std::vector<exprt> instantiate( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be deleted, other declaration below.
@@ -87,13 +111,18 @@ static exprt instantiate( | |||
const exprt &val); | |||
|
|||
static std::vector<exprt> instantiate( | |||
messaget::mstreamt &stream, | |||
const namespacet &ns, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can delete these added args, they are unneeded in the function.
@@ -249,327 +280,315 @@ static std::vector<exprt> generate_instantiations( | |||
for(const auto &nc_axiom : axioms.not_contains) | |||
{ | |||
for(const auto &instance : | |||
instantiate(nc_axiom, index_set, generator)) | |||
instantiate(stream, ns, nc_axiom, index_set, generator)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can delete first two args.
const exprt &i=cur.op1(); | ||
|
||
bool has_quant_var=find_qvar(i, qvar); | ||
const index_exprt index_expr=to_index_expr(cur); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be &
.
@@ -1712,25 +1802,34 @@ static void initial_index_set( | |||
to_process.pop_back(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Loop could use expression iterator.
/// \return the lemmas produced through instantiation | ||
static std::vector<exprt> instantiate( | ||
messaget::mstreamt &stream, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unneeded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite the opposite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(It wasn't needed.)
@@ -1874,15 +1973,22 @@ static exprt instantiate( | |||
|
|||
/// Instantiates a quantified formula representing `not_contains` by | |||
/// substituting the quantifiers and generating axioms. | |||
/// \param stream: a message stream |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update this by removing unused arguments and updating index_set
(and removing current_index
set).
it->second); | ||
} | ||
else if(ecopy.id()==ID_struct) | ||
if(ecopy.type().id()==ID_array) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be all arrays? It used to just be character arrays. For instance, the skolem functions in not contains constraints are arrays. This code seems to assume the arrays are strings.
a7df645
to
d8d91e9
Compare
One thing I think we should address with such a big change to the string refinement process is stop the solver from treating skolem functions as strings. If I run the solver on |
01e6dd2
to
dad57d7
Compare
The problem is that it's very difficult to find any significant intermediary stage in this big commit that wouldn't break all the string tests. |
} | ||
error() << "dec_solve: current index set is empty, " | ||
<< "this should not happen" << eom; | ||
return resultt::D_ERROR; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romainbrenguier Changed the behavior of the solver here, probably not on purpose. In a later commit to revert to what was originally there for the most part, but not completely.
const exprt arr_model=simplify_expr(*arr_model_opt, ns); | ||
const exprt concretized_array=concretize_arrays_in_expression( | ||
arr_model, generator.max_string_length, ns); | ||
debug() << "get " << from_expr(ns, "", expr) << " --> " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was debugging something with Justing, and this debug output and the one below make this impossible, get is called too often. Please delete.
2fa3500
to
ef9aeaf
Compare
Remove duplicate code for add_axioms_for_insert_float. Style: indentation in add_axioms_for_insert. Remove useless comment.
Remove unused function: associate_char_array_to_char_pointers. Rename solver in symbol_resolve. Rename function in generate_symbol_resolution_from_eq. Rename function get_char_array_and_concretize. Documentation update for generate_symbol_resolution, add_lemma, substitute_function_application, is_char_type. Some documentation and style fixes.
Debug information for arrays of pointers Debug information for check_axioms steps Refactoring: put debug information out of check_axioms Replace warning by invariant
This is necessary for the negation procedure to know the existential bounds and unfold the quantification as a conjunction.
This is in the case where super_get does not anything about the array we give it. This is necessary so that the string solver returns an array of the correct size, if the size is the only thing we know about the string. Add warning message in the case where the symbol is unknown to super_get
This is done by looking at which arrays were associted to pointers in the generator.
We use this in get to distinguish between strings and arrays of characters
This removes unused functions, declaration of unimplemented function, unused constants and unused fields.
In initialisation of string objects, this prevents the created objects from getting out of scope.
The function what was there before was not correct. It was calling cprover_string_copy which is deprecated.
The check for a specific object number was removed so this does not check what we want anyway. It is difficult to make it check the original intended behaviour using only regular expressions.
This more accurately describe what this object is.
27c1f83
to
3ee3b25
Compare
@peterschrammel I squashed a lot of the correction commits with the corresponding original commit, I don't think there is much more to be done. For the cpp linter failure, this is due to a disagreement with what clang-format does for brackets in lamdbas. |
Would |
Add nolint marker on line with lambdas, which are formatted by clang in a way that is not compatible with cpp-lint.
dfc6942
to
aa94fe8
Compare
This should be replaced by #1539 |
This makes the string solver algorithm work only on finite size arrays. The input program may have pointers and infinite arrays in which case the solver associate finite arrays to them.
The type for the String class in java-bytecode is changed to contain a char pointer instead of a pointer to an infinite array, however infinite array are still present there for non-deterministic assignment.