-
Notifications
You must be signed in to change notification settings - Fork 274
String solver: stop relying on array size #4626
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
String solver: stop relying on array size #4626
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed up to "Get rid of created_strings_value". No blocking request so far.
src/solvers/strings/array_pool.cpp
Outdated
@@ -32,6 +32,18 @@ exprt array_poolt::get_length(const array_string_exprt &s) const | |||
return s.length(); | |||
} | |||
|
|||
/// As opposed to get_length(), do not create a new symbol if the length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ \ref
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is part of the public interface so should be documented in the header file 🍇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\ref
is unnecessary. Doxygen understands where to link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc moved to header file
src/solvers/strings/array_pool.cpp
Outdated
@@ -21,15 +21,14 @@ operator()(const irep_idt &prefix, const typet &type) | |||
return result; | |||
} | |||
|
|||
exprt array_poolt::get_length(const array_string_exprt &s) const | |||
/// Get the length of an array_string_exprt from the array_pool. If the length | |||
/// does not yet exist, create a new symbol and add it to the pool. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🍇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc moved to header file
src/solvers/strings/array_pool.cpp
Outdated
@@ -21,15 +21,14 @@ operator()(const irep_idt &prefix, const typet &type) | |||
return result; | |||
} | |||
|
|||
exprt array_poolt::get_length(const array_string_exprt &s) const | |||
/// Get the length of an array_string_exprt from the array_pool. If the length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ typo in commit message: "accesssing"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
src/solvers/strings/array_pool.cpp
Outdated
@@ -46,8 +46,7 @@ array_poolt::get_length_if_exists(const array_string_exprt &s) const | |||
array_string_exprt | |||
array_poolt::fresh_string(const typet &index_type, const typet &char_type) | |||
{ | |||
symbol_exprt length = fresh_symbol("string_length", index_type); | |||
array_typet array_type(char_type, length); | |||
array_typet array_type(char_type, infinity_exprt(index_type)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ brace initialization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/solvers/strings/array_pool.cpp
Outdated
exprt array_poolt::get_length(const array_string_exprt &s) | ||
{ | ||
if(const auto &if_expr = expr_try_dynamic_cast<if_exprt>((exprt)s)) | ||
{ | ||
return if_exprt( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ brace initialization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/solvers/strings/array_pool.cpp
Outdated
length_of_array.emplace(array_expr, size_from_type); | ||
INVARIANT( | ||
emplace_result.second, | ||
"attempt_assign_length should only be called when no entry for the" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
attempt_assign_length_from_type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
src/solvers/strings/array_pool.cpp
Outdated
@@ -65,6 +65,10 @@ array_poolt::fresh_string(const typet &index_type, const typet &char_type) | |||
created_strings_value.insert(str); | |||
arrays_of_pointers.emplace( | |||
address_of_exprt(index_exprt(str, from_integer(0, index_type))), str); | |||
|
|||
// add length to length_of_array map | |||
get_length(str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ maybe worth defining a void function to avoid having to write that comment twice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe in a later PR.
const std::set<array_string_exprt> &array_poolt::created_strings() const | ||
/// Return a map mapping all array_string_exprt of the array_pool to their | ||
/// length. | ||
const std::unordered_map<array_string_exprt, exprt, irep_hash> & |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 exposing the whole map here seems unnecessary since we are only using the first component. An alternative would be to define a for_each_string
function with takes a function as argument and apply it to each array_string_exprt
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave it like this for now so that the PR doesn't blow up.
{ | ||
const exprt length = super_get(arr.length()); | ||
exprt length = super_get(length_from_pool.value()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to remove the const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-introduced
@@ -105,7 +105,8 @@ static optionalt<exprt> get_array( | |||
const std::function<exprt(const exprt &)> &super_get, | |||
const namespacet &ns, | |||
messaget::mstreamt &stream, | |||
const array_string_exprt &arr); | |||
const array_string_exprt &arr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ commit message doesn't say why
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added description
@@ -965,7 +965,11 @@ static optionalt<exprt> get_array( | |||
const array_string_exprt &arr, | |||
const array_poolt &array_pool) | |||
{ | |||
const exprt &size = arr.length(); | |||
const auto &size_from_pool = array_pool.get_length_if_exists(arr); | |||
exprt size = size_from_pool.has_value() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to remove the const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const re-introduced
@@ -493,8 +498,9 @@ std::pair<exprt, string_constraintst> add_axioms_for_format( | |||
is_refined_string_type(arg.type()), | |||
"arguments of format should be strings"); | |||
const array_string_exprt string_arg = get_string_expr(array_pool, arg); | |||
get_arg = [string_arg](const irep_idt &id) { | |||
return format_arg_from_string(string_arg, id); | |||
/// REMOVE get_arg below |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forgotten comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
1973e15
to
f7f8ad2
Compare
Will look at this tomorrow, but do you intend it to remove the illegal |
f7f8ad2
to
81edbb1
Compare
src/solvers/strings/array_pool.cpp
Outdated
@@ -32,6 +32,18 @@ exprt array_poolt::get_length(const array_string_exprt &s) const | |||
return s.length(); | |||
} | |||
|
|||
/// As opposed to get_length(), do not create a new symbol if the length | |||
/// of the array_string_exprt does not exist, but instead return an empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps clarify what you mean by a length not existing? Is that any non-constant length, or the length of a string we haven't fully initialised yet, or...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improved the documentation. This now states:
/// As opposed to get_length(), do not create a new symbol if the length
/// of the array_string_exprt does not have one in the array_pool, but instead
/// return an empty optional.
return it->second; | ||
} | ||
return s.length(); | ||
auto emplace_result = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this allocates a symbol even if s
already has a length, I'd suggest inserting a dummy and then replacing it with a fresh_symbol
if appropriate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in fixup commit
src/solvers/strings/array_pool.cpp
Outdated
std::unordered_map<array_string_exprt, exprt, irep_hash> &length_of_array, | ||
symbol_generatort &symbol_generator) | ||
{ | ||
if(const auto &if_expr = expr_try_dynamic_cast<if_exprt>((exprt)array_expr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the explicit cast (and to a non-reference-type?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean the (exprt)
? Curiously exprt_try_dynamic_cast
doesn't work directly on array_string_exprt
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, this is obvious -- an array_string_exprt
is certainly not an if_exprt
, this can't ever be run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's one other case with the same cast and the same unreachable code. This suggests writing tests to exercise this code!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's unreachable. array_string_exprt
s can have ID_if
AFAIK. However you make a good point with the test coverage, as we don't seem to have anything to cover these lines so for now I might just put an assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in this place i replaced the if with an assertion because I can't find any test to cover this part. However in the other place where I use the condition, I left it in because it's often the case that the array_string_exprt
is an if. Therefore that code is well covered by regression tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, my mistake, I thought array_string_exprt
constrained the expression to have a particular id, and therefore not be an if_exprt
, but actually it just constrains the type of a general expression. The implementation of the expr-casting stuff, though, assumes this sort of trait class doesn't exist, and so casts are only possible to children of the current type. That ought to be fixed, but is obviously out of scope for this PR, so approving.
src/solvers/strings/array_pool.cpp
Outdated
} | ||
else | ||
{ | ||
length_of_array.emplace( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should check this one is new, too? Therefore, deduplicate the emplace
calls and the invariant below this if block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in fixup commit
src/solvers/strings/array_pool.cpp
Outdated
@@ -65,6 +65,10 @@ array_poolt::fresh_string(const typet &index_type, const typet &char_type) | |||
created_strings_value.insert(str); | |||
arrays_of_pointers.emplace( | |||
address_of_exprt(index_exprt(str, from_integer(0, index_type))), str); | |||
|
|||
// add length to length_of_array map | |||
get_length(str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't normally expect a get_
function to have a side-effect like this, suggest renaming
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you be happy with get_or_create_length
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
@@ -76,6 +76,8 @@ class array_poolt final | |||
|
|||
private: | |||
/// Associate arrays to char pointers | |||
/// Invariant: Each array_string_exprt in this map has a corresponding entry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this enforced?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improved documentation. Now states:
/// Associate arrays to char pointers
/// Invariant: Each array_string_exprt in this map has a corresponding entry
/// in length_of_array.
/// This is enforced whenever we add an element to
/// `arrays_of_pointers`, i.e. fresh_string()
/// and make_char_array_for_char_pointer().
3723a52
to
9fb4c37
Compare
@smowton About your general comment. Yes, removing the conditional types is the idea of that PR. |
9fb4c37
to
bde860c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see those conditionals in types go away! I'd just ask for the commit history to be cleaned up with all those fixup commits. Doing so will then also re-trigger the codebuild task.
@tautschnig Thanks. Of course I'll merge the fixup commits before merge. I have some failing tests on TG, so I need to fix that first anyway. |
bde860c
to
6071d79
Compare
The length of the backing array of strings should not be read, so no need to create a length symbol for the type when creating a new string.
Lengths of array_string_exprt can be any exprt, not just symbol_exprt. In particular, we want to be able to give the length coming from the backing array, which is not necessarily a symbol.
All created strings should have an entry in arrays_of_pointers.
Function to add the length of the string backing array to the length_of_array map in the array_pool. If the size does not exist, create a symbol. Function will be used to assign an initial length to the array_string_exprt, using the type information. After that initial assignment, there will be no further reference to the length of the underlying array.
Remove handling of if_case, as I cannot find a test case that exercises it.
Because we can.
This invariant ensures that we get all the strings of the array_pool when iterating over length_of_array, allowing to get rid of the created_strings_value map.
Using the newly adding invariants, created_strings() now returns the length_of_array map, which contains all array_string_exprts of the array_pool. This map is more complete than created_strings_value.
No longer used. array_poolt.find() can be made to return the array_string_exprt directly
In string_refinement::get(), we now get the length of the string from the array_pool instead of the array type. We use get_length_if_exists() instead of get_length(), because we want to ignore if the length does not exist, instead of creating a new symbol.
String length will be queried from array_pool in future commit.
Avoid reading from array size.
Argument will be required to query string lengths.
Replace calls to str.length() by calls to array_pool.get_length(str).
We require a constant value in this case and querying the array pool gives us a symbol.
Length is now always read from array_pool
Simplifies the interface by passing that string argument directly rather than the function that retrieves it.
size exprt is not meaningful in all cases. This is necessary for upcoming refactoring.
If the size of the array is not provided by the model, get the size directly from the array. Note that the non-constant case now returns an empty optional rather than an unknown exprt.
Checks that the string literal is present in the trace even if it is not part of the decision. As a side-effect, this checks that the length is properly assigned to the string, even if it cannot be retrieved from the solver.
d622019
to
cd61e81
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✔️
Passed Diffblue compatibility checks (cbmc commit: cd61e81).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/113725722
The string's backing array size (from its type) initially informs about the length of the string. However during refinement, the length can vary and if the length is tied to the size of the array, we need to rewrite the type as a conditional, which should be avoided (both the rewriting and the conditional).
This PR refactors string refinement so that during constraint generation, we refer to the size of the string solely through the array pool.
The size of a string thus always refers to a symbol, which may or may not be constrained.
The only time we get the array size is when inserting an array_string_exprt in the array pool (i.e. at the creation of the array_string_exprt).
There shouldn't be any visible change in JBMC, but that should fix some errors that have been happening with test-generation. I will keep looking to see if I can reproduce on JBMC.
I will also check that performance is not affected.