Skip to content

Corrections to the string solver - 2 #646

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

allredj
Copy link
Contributor

@allredj allredj commented Mar 16, 2017

The code adding lemmas to the solver for each equation has been completely remastered. It has been simplified and should now cover all foreseeable cases.
We add constraints on sums for avoiding overflows in the string solver.
Many fixes have been done in the preprocessing of strings. String refinement is now more language-independent.

This replaces pull request #574.

This depends on #378 and #559.

This avoid dependencies between goto-programs and solver.

We also removed function is_unrefined_string_type which should not be needed in the string solver.

Removed also mention of java string types and unrefined types in the
solver. These mentions should
not be necessary, since we are not supposed to do anything java
specific. The solver should only have to deal with refined string
types and possibly char arrays.
This is a more appropriate location for this module since it's used
both in the preprocessing of goto-programs and the string solver.
Refined string type should be used instead as we are trying to be
language independent.
The class_name is now passed as argument so that we can reuse this
function for StringBuilder and other "String-like" classes.
@allredj allredj force-pushed the string-refine-corrections-on-preprocess branch from b65d6f8 to 03c5660 Compare March 22, 2017 15:27
@allredj allredj force-pushed the string-refine-corrections-on-preprocess branch from 03c5660 to 327c89c Compare March 23, 2017 16:29
Better signatures and string detection

Better detection of string-like types and handling of char arrays

Changing the way we deal with string initialisation from array
Factorized part of type recognition and added StringBuilder and
CharSequence to the list of java classes that should be
considered as string.

Changed the way we deal with StringBuilders: instead of having a map
we add an assignment to the instructions.

We also detect char arrays and handle them better.
We now use substring and copy functions for initialisation from char
array since the argument are always
transformed into refined strings.

For each string returned by a string function we also add into the
java_string_to_cprover_string map a string_exprt.

Corrected detection of typecast in make_cprover_string_assign

Ensuring refined_string arguments of function applications are string_exprt

Correct string_refine_preprocesst constructor.
Order of initialisation is now the same as the order of
declaration.
This was picked up by g++ and clang.

Added signatures for some StringBuilder functions.

Removed map java_to_cprover_string and adapt signature for
side effects.
The usage of a map is not correct since strings can be modified by
side effects. Signature is necessary for StringBuilders to be assigned
in the right way with methods with side effects.

Assign all string_exprt to refined string symbols in preprocessing.
This makes it then easier to debug and to find
witnesses for counter examples in the solver.

Make signatures take priority over actual type and add signature
for intern.

Linting corrections

Adding malloc for char arrays for String.toCharArray

Fixing preprocessing of string function with side effect

This fixes problems we were getting with some StringBuilder functions.
The return value should contain a pointer to the original
StringBuilder and the fields of the StringBuilder should be filled
with the result of the function call.

Corrected mistake in preprocessing of string functions with side effects

char array assignements returns a string_exprt

This is to be uniform with other preprocessing functions also
returning string_exprt

Preprocessing for StringBuilder.append on char array

Corrected update of signature and commented out some unused code

Corrected the initialization from char array

We cannot use the substring function because the meaning of the third
argument is different between `String.<init>([CII)` and
`String.substring(SII)`

Using new id for conversion between char pointer and array

Cleaning of preprocessing for strings

Removed useless functions and merged some maps

Make a copy of the function call to be modified

Removed redeclaration of location
Overhaul of string solver contributions

The code adding lemmas to the solver for each equality has been
completely remastered. It has been simplified and should
now cover all foreseable cases.

 Add overflow constraints on sums for string solver
 When creating sum expressions during string
refinement, e.g. when
 adding string lengths, we shall add axioms to prevent the solver
 from finding values that actually come from an integer
overflow.
 Therefore, we introduce a new method plus_exprt_with_overflow_check()
 that wraps around plus_exprt and generates the
relevant axioms.

Cache function applications
Cache converted function applications to avoid re-generating the axioms
if it has already been done for a given function application.
This seems to happen for function applications that are in an
assertion, and thus are treated separately by convert_bitvector.

Remove string_refinement::convert_symbol: the behaviour of
this function was to convert java strings to strings of
symbols. In hindsight, this was not a good idea, as we still refer
to the actual java string fields in some cases, e.g. to read its
length.

Replace add_axioms_for_string_expr
In all add_axioms_... methods, we replace all calls to
add_axioms_for_string_expr by the newly created function
get_string_expr, which simply returns a
string_exprt when we give it
an refined expr (such as a string_exprt or a symbol).

Update doc in add_axioms_for_index_string
Simplify constraint for is_prefix: apply distribution
law and update (and refactor) documentation.

Remove set_string_symbol_equal_to_expr()

Removed mentions of java string type in the string solver

Removing mentions of is_c_string_type
To use the string solver in C, we
should use a struct type with tag
    __CPROVER_refined_string_type

Keep non-string axioms in a list
Instead of giving axioms to
super::boolbv_set_equality_to_true, which
may contain un-substituted symbols, we keep them in a list and
substitute them in dec_solve() before giving them to the
function.

Cleaning the add_axioms_for_string_expr method and renamed it
to add_axioms_for_refined_string to signify that it should
 only be called on expression of type refined string.
An assertion was added at the beginning to ensure that.

Better get_array and string_of_array functions.
Cleaned the implementation of get_array and factorized
part of it.
Made string_of_array directly take a array_exprt, which can be
 obtained from get_array.
Improved string_of_array for non printable characters

Resolve symbol to char arrays in non-string axioms.
We introduce a mapping that expresses the
maximal traversal of symbols
of char array type.
We use that map to, either obtain a char array from a symbol,
either get a unique symbol the aliases equivalence class.
This map is used to replace all symbols of char array type in
the axioms that are added using the parent method:
        supert::boolbv_set_equality_to_true.

Defining several intermediary functions for
check_axioms and cleaning.
Check axioms is quite complicated so we splitted it in several
parts.

Avoid using same universal variable name in string comparison function.

Corrected test in index_of as characters could be signed

Corrected return type retrieval in
from_float functions.
The return type should be given as argument of the helper function
add_axioms_from_float.

Fixed type of contains in constraint generation

Introduce string_refinementt::set_to.
We now override the set_to() method
instead of set_equality_to_true().
This solves a problem where many lemmas were being dropped.

Handle array_of in set_char_array_equality.
Java assignments such as char[] str=new char[10]
involves assigning
the char array to 0, which is done using the array_of operator.
This operator is now handled when found on the rhs of a char
array assignment.

Optimized string hash code and intern functions to only
look at seen strings.
We only compare the argument to the strings on which hash_code
(resp. intern) was already called.

Add missing override keyword in declarations

Integrate strings into symbol resolution.
We now use for strings the mechanism that was
made for resolving symbols
    to char arrays.
  As a result, the symbol_to_string map has been removed.
 We introduce an unresolved_symbol map to associate symbols
to string
    expressions that were introduced during constraint generation.

 Preventing return of nil exprt in sum_over_map

Enforce -1 for witnesses when strings are unequal.
This makes it easier to see when models for string
have different length.

Correct array index in get_array
Correct a bad array access that caused a segmentation fault on
the java_delete test.

Adding option to concretize result and get method in string solver
The option to concretize makes sure that the model that we get in the
end is correct.
Overiding the get method is necessary to get the actual valuation of
string gotten by the solver.
This two additions makes the trace generation for program with strings
more likely to be actual traces of the program.

Avoid adding axioms for copy

Corrected `get` to not replace strings

Simplifying lemmas before handing them to the solver
This seems to improve performances

Ignoring char array that are not symbol or constants

Avoid creating new length variable for concat

Adding constraint on positive length for refined strings

Using get method of the parent class for the size

Add function to fill the found_length map

Signature for get_array should be constant

Corrected overflow problem when integer get to the max size

Removing overflow check in universal constraint

Enforcing witness -1 when length is not sufficient in suffix

Corrected index variable which should be universally quantified

Factorizing addition to index set and more debug information

Avoiding generating a fresh string for the empty string

Enabling string solver to treat more cases and more debug infos

Conversion from char pointer to array in the solver

Raise a warning if found length is negative

Ensure the arguments of parseInt are of the right format
For now we add axioms asking for the argument to have the right format
but ultimately we should raise an exception when it is not the case
(which is not done right now).

Corrects the specification of int to string conversion
Some problems where encountered for strings with the maximal size
possible for an int, which could cause an overflow.

Disallow + sign in string from int
Java will not add the sign for positive numbers

Use get instead of current_model in check_axioms

Streamline code of check_axioms() by calling get() insteand of relying
on the 'current_model' variable.
get() has been adapted to convert array-lists into with expressions, the
former not being handled by the string solver.
@allredj allredj force-pushed the string-refine-corrections-on-preprocess branch from 327c89c to e93cd40 Compare March 23, 2017 17:09
@forejtv forejtv requested a review from peterschrammel April 20, 2017 10:28
@kroening
Copy link
Member

Needs a rebase, but is otherwise ready to go.

@tautschnig
Copy link
Collaborator

I note that #378, which this one allegedly depends on, is still open.

@allredj
Copy link
Contributor Author

allredj commented Jun 19, 2017

Indeed, the present PR corrects some issues with PR #378. However the latter still hasn't been merged because we put a halt on integrating string support into the master branch of cbmc. As such, I believe it makes little sense to get these PRs ready to be merged in master, as we already know that this implementation has some problems. These problems have however been addressed in the test-gen-support branch, via a compete overhaul of the preprocessing, as well as many other corrections/improvements.
I would recommend completely dropping the outstanding string-related PRs on the master branch.
Introducing a viable string support in cbmc master could be done by merging it from the test-gen-support branch. It would not be completely straightforward, but would certainly be a more satisfactory solution, I believe.

@tautschnig
Copy link
Collaborator

I would recommend completely dropping the outstanding string-related PRs on the master branch.

Marking this as do-not-merge; I'll leave it to you/others to close this PR (and related ones) entirely.

@peterschrammel peterschrammel removed their request for review July 5, 2017 20:44
@peterschrammel
Copy link
Member

Closing this PR for above mentioned reasons.

@allredj allredj deleted the string-refine-corrections-on-preprocess branch June 14, 2018 16:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants