[TG-2721] Use sparse arrays in string_refinementt::get #2009
I'm so happy to see the concretization and plus_with_overflow go!
I've just got a couple of suggestions, not even nitpicks; the code looks really good as it is.
if(entries.back().second == expr.operands()[i])
entries.back().first = i;
else if(entries.back().second.id() == ID_unknown)
entries.back() = std::make_pair(i, expr.operands()[i]);
There's a more concise way to write this:
entries.back() = { i, expr.operands()[i] };
for(const auto &pair : entries)
{
if(pair.first >= index)
return pair.second;
If it's worth it, you could rewrite entries as a std::map and then use this function to do the lookup in O(log n):
http://www.cplusplus.com/reference/map/map/upper_bound/
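A minimal sketch of the suggested lookup, assuming a simplified stand-in type (`sparse_chars`, with `char` values instead of `exprt`) that is not part of the PR. Because the loop above returns on `pair.first >= index`, the matching `std::map` call is `lower_bound` (first key not less than `index`); `upper_bound` would skip an entry whose key equals `index`.

```cpp
#include <cstddef>
#include <map>

// Hypothetical stand-in for the sparse array under review: an entry (k, v)
// means "every index after the previous key, up to and including k, holds v".
// Indexes past the last key hold default_value.
struct sparse_chars
{
  std::map<std::size_t, char> entries;
  char default_value = '\0';

  char at(std::size_t index) const
  {
    // lower_bound returns the first entry whose key is >= index, in O(log n).
    const auto it = entries.lower_bound(index);
    return it == entries.end() ? default_value : it->second;
  }
};
```

Since `std::map` keeps its keys ordered, no explicit sorting step is needed before the lookup.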

exprt return_code = from_integer(0, get_return_code_type());

if(intermediary_strings.size() == 0)
if(intermediary_strings.empty())?
{
for(; current_index <= pair.first && current_index < size; ++current_index)
array.operands().push_back(pair.second);
}
I think you could remove current_index and add an if statement in this loop:
if (array.operands().size() < size)
return array;
This should make this function a little bit more obvious.
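One way to read this suggestion, sketched on a hypothetical `concretize` helper that works on `char` values rather than `exprt`: the size of the output array replaces `current_index`. The early return here fires once the array is full (`>= size`), and padding with a default value is an assumption of the sketch, not something shown in the diff.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper: expand interval entries into a dense array of `size`
// elements. An entry (k, v) covers every index up to and including k that is
// not covered by an earlier entry.
std::vector<char> concretize(
  const std::map<std::size_t, char> &entries,
  std::size_t size,
  char default_value)
{
  std::vector<char> array;
  array.reserve(size);
  for(const auto &pair : entries)
  {
    // array.size() plays the role of current_index.
    while(array.size() <= pair.first && array.size() < size)
      array.push_back(pair.second);
    // Stop as soon as the requested size is reached.
    if(array.size() >= size)
      return array;
  }
  // Assumption: indexes after the last entry take the default value.
  array.resize(size, default_value);
  return array;
}
```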
msg << "consider reducing string-max-input-length so that no string "
<< "exceeds " << MAX_CONCRETE_STRING_SIZE << " in length and make sure"
<< " all functions returning strings are available in the classpath";
throw msg.str();
I'm not a fan of throwing strings. How about std::runtime_error(string) or std::invalid_argument(string), if it's not too outlandish of course?
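A sketch of what that could look like, assuming a hypothetical `check_concretizable` function and a made-up limit standing in for `MAX_CONCRETE_STRING_SIZE`. The message is still built with a stream; only the thrown type changes.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical limit; the real value of MAX_CONCRETE_STRING_SIZE is not shown
// in this conversation.
constexpr std::size_t max_concrete_string_size = 1 << 26;

// Hypothetical check: refuse to concretize strings above the limit.
void check_concretizable(std::size_t length)
{
  if(length <= max_concrete_string_size)
    return;
  std::ostringstream msg;
  msg << "consider reducing string-max-input-length so that no string "
      << "exceeds " << max_concrete_string_size << " in length";
  // An exception type derived from std::exception can be caught generically
  // and queried with what(), unlike a thrown std::string.
  throw std::runtime_error(msg.str());
}
```

Callers can then write `catch(const std::exception &e)` and report `e.what()`.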
{
auto set = generator.get_created_strings();
if(set.find(arr) != set.end())
exprt length = super_get(arr.length());
const exprt?
{
const auto &if_expr = expr_dynamic_cast<if_exprt>(current.get());
const exprt cond = get(if_expr.cond());
if(cond.is_true())
if(get(if_expr.cond()).is_true())
- I like oneliners.
We also use cond with .is_false afterwards.
👌
@@ -124,7 +124,19 @@ class interval_sparse_arrayt final : public sparse_arrayt
/// `extra_value`.
interval_sparse_arrayt(const array_exprt &expr, const exprt &extra_value);

/// Initialize an sparse array from an array represented by a list of
a sparse array
@@ -124,7 +124,19 @@ class interval_sparse_arrayt final : public sparse_arrayt
/// `extra_value`.
interval_sparse_arrayt(const array_exprt &expr, const exprt &extra_value);

/// Initialize an sparse array from an array represented by a list of
/// index-value pairs, and setting the default to `extra_value`.
/// Indexes must be constant expressions, and negative indexes are ignored.
When can negative indices occur?
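For reference, the documented behaviour (constant indexes, negative ones ignored) can be illustrated with a small sketch. The function name `entries_of_array_list` and the use of `long`/`char` in place of expressions are assumptions made for the example; it does not say when negative indexes actually arise.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical analogue of initializing a sparse array from a list of
// index-value pairs: entries with a negative index are skipped.
std::map<std::size_t, char>
entries_of_array_list(const std::vector<std::pair<long, char>> &pairs)
{
  std::map<std::size_t, char> entries;
  for(const auto &pair : pairs)
  {
    if(pair.first < 0)
      continue; // negative indexes are ignored
    entries[static_cast<std::size_t>(pair.first)] = pair.second;
  }
  return entries;
}
```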
return interval_sparse_arrayt(*array_expr, extra_value);
if(const auto &with_expr = expr_try_dynamic_cast<with_exprt>(expr))
return interval_sparse_arrayt(*with_expr);
if(expr.id() == "array-list")
A proper irep_id should be added for this
If that's OK, I'd prefer to delay that to another PR (prepared here romainbrenguier#7) because it will require other reviewers, as it touches other parts of CBMC as well.
@@ -648,20 +648,24 @@ decision_proceduret::resultt string_refinementt::dec_solve()
output_equations(debug(), equations, ns);
#endif

// Dependencies is also used by get, so we have to use it as a class member
dependencies?
comment for commit message: more constraints added and decreased performance
return {};
std::ostringstream msg;
msg << "consider reducing string-max-input-length so that no string "
<< "exceeds " << MAX_CONCRETE_STRING_SIZE << " in length and make sure"
std::to_string should be sufficient
Yes, but I don't see the advantage. I think streams are nicer in this kind of situation.
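The two options under discussion produce the same text; a small sketch with hypothetical function names and a shortened message makes the comparison concrete.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Stream-based formatting, as in the PR.
std::string message_with_stream(std::size_t limit)
{
  std::ostringstream msg;
  msg << "exceeds " << limit << " in length";
  return msg.str();
}

// The std::to_string alternative suggested in the review.
std::string message_with_to_string(std::size_t limit)
{
  return "exceeds " + std::to_string(limit) + " in length";
}
```

The choice is mostly stylistic: the stream version chains several fragments without explicit conversions, while the `std::to_string` version avoids the `std::ostringstream` object.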
@@ -0,0 +1,11 @@
CORE
Test.class
--refine-strings --function Test.check --string-max-input-length 2000000
How long do these tests take?
On my computer 0.7s for the first one and 0.1s for the second.
This avoids having to explicitly sort the entries, and makes looking up an element more efficient: logarithmic instead of linear.
If the content is kept between calls to dec_solve, some nodes can be duplicated in the graph, which leads to more constraints being added and decreased performance.
We reduce the number of constraints by avoiding the copy of the last string.
This initializes a sparse array from the array expressions we know how to deal with.
Detects when an overflow happens in the sum of two integers. This will be used in builtin functions for dealing with the overflow case.
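As a concrete analogue of that check, here is a sketch on plain `int` values. The PR's function presumably builds a symbolic expression for the solver instead; only the name `sum_overflows` comes from the commit list, while the signature and body below are assumptions.

```cpp
#include <limits>

// Hypothetical concrete version: true when x + y does not fit in an int.
// The comparisons are arranged so that the check itself never overflows.
bool sum_overflows(int x, int y)
{
  if(y > 0)
    return x > std::numeric_limits<int>::max() - y;
  if(y < 0)
    return x < std::numeric_limits<int>::min() - y;
  return false;
}
```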
This function adds assumptions on the values of integers, which may lead to contradictions. It is better to deal with overflows at the level of the specification of the builtin functions instead.
The obtained expression can be exponentially smaller, because the sparse array representation avoids repetitions.
In the case of index_expressions this can use exponentially less memory.
This ensures that arrays from the underlying solver are interpreted in a consistent manner in the solver (always using interval_sparse_arrayt).
This makes sure the way we interpret arrays is consistent even in debugging functions.
This will allow removing fill_in_array_expr, which duplicates what concretize does.
This is now unnecessary as get_array takes care of the concretization using sparse arrays. This allows us to delete concretize_arrays_in_expression, which is now unused. Equivalent behavior to this function can be obtained using sparse arrays.
We should use sparse arrays as in the solver.
7202003 Merge branch 'develop' of github.com:diffblue/cbmc into CBMC_subtree_2018-04-10 768e8a6 Merge pull request diffblue#2009 from romainbrenguier/solvers/sparse-arrays-in-get 69f6e7b Merge pull request diffblue#2032 from tautschnig/replace-rename-performance ec43b00 Merge pull request diffblue#2026 from tautschnig/sat-clean 64b336a Refactor interval_sparse_array::concretize 5392645 Reserve size of array in concretize 8277e92 Stop using concretize_array_expr in unit tests 1a08772 Tests where model involves long strings a6c4010 Stop using concretize_arrays_in_expression 6d87233 Use concretize instead of fill_in_array 052d503 Use get_array in get_char_array_and_concretize 8ed138b Remove unused header c58ac60 Avoid copy in ranged for loops over expressions beff419 Use sparse arrays in get_array 9b286d9 Use sparse arrays in string_refinement::get 037f631 Refactor string_refinementt::get dfc584a Add concretize function for interval_sparse_array cb10550 Use sparse arrays in substitute_array_access ce4c008 Remove plus_exprt_with_overflow_check 4779b25 Get rid of calls to plus_exprt_with_overflow c40b836 Truncate string concatenations in case of overflow c52c813 Add a sum_overflows function 0d03591 Add an `at` function for access in sparse arrays 848dd95 Add an interval_sparse_array::of_expr function 5f07bf0 Initialization of sparse array from array-list 5b4d618 Reduce number of constraints in format ede2fa1 Clear string_dependencies in calls to dec_solve d483c81 Initialize sparse array from array_exprt a914153 Use map instead of vector for sparse array entries 20d2445 Remove unused rename(expr, old_id, new_id) 430d402 Use exprt::depth_iterator in rename_symbolt 54e5b85 Use a const expr to avoid unnecessary detach 6f71ff6 replace_symbolt: stop early if there is nothing to replace with 1dbb162 Clean locally built SAT solver objects 28c076b Merge pull request diffblue#2013 from LAJW/lajw/java-no-load-class e6a9127 Merge pull request diffblue#2020 from 
tautschnig/sat-cleanup 70741ff Remove support for Precosat 2aa81eb Remove support for SMVSAT a0fd3f7 Remove support for Limmat as a SAT solver 28cca9c Remove unused DIMACS parser 1d81306 Merge pull request diffblue#2015 from tautschnig/fix-smt2_solver-clean 392144d Makefiles: Place .d suffix used for dependencies in DEPEXT variable 839d32a smt2_solver.{o,d} should be removed by "make clean" 48e427a Merge pull request diffblue#1979 from romainbrenguier/regression/fix-indexOf-test c2f3726 Merge pull request diffblue#1976 from romainbrenguier/regression/activate-dependency-tests 42ecfa2 Add --java-no-load-class option 69fb74a Merge pull request diffblue#1995 from tautschnig/byte-update-soundness aa766ae Set string-max-length in indexOf test 988b818 Merge pull request diffblue#1990 from tautschnig/missing-header 8300147 Abort on byte_update(pointer) 4a8d9b4 Include missing header a695814 Merge pull request diffblue#1986 from thk123/revert/1816/overlay-classes 9933b58 Revert "Merge pull request diffblue#1816 from NathanJPhillips/feature/overlay-methods" cd9b839 Revert "Merge pull request diffblue#1982 from NathanJPhillips/bugfix/load-object-once" 58beeb4 Merge pull request diffblue#1978 from svorenova/lambda_tg2478_cont 1e3a9cd Merge pull request diffblue#1980 from NathanJPhillips/tests/irept 772b603 Merge pull request diffblue#1982 from NathanJPhillips/bugfix/load-object-once 0902ae7 Tests to demonstrate expected sharing behaviour of irept 2239ec1 Unit test of irept's public API 81ac259 Prevent test running on symex-driven lazy loading eae194f Allow incorrect paths for jar files on the classpath without crashing 6749fd0 Tests to ensure invalid values in the classpath are ignored 3c95aa1 Prevent attempting to load any class more than once c520331 Unit test to show java.lang.Object can be loaded from an explicit model 05a7b4a Merge pull request diffblue#1981 from smowton/smowton/cleanup/missing-docstring e4dc2aa Merge pull request diffblue#1946 from 
thk123/feature/TG-1811/unit-tests-for-invokedynamic-static-lambda 2b53d3b Merge pull request diffblue#1983 from svorenova/lambda_tg2478_fix f428247 Tidying up for lambdas f3b1379 Merge pull request diffblue#1975 from romainbrenguier/bugfix/literal-length-TG2878 7d4441d Regression test for lambda in a package 7366fda Format class name for lambda method handles 7db42c0 Add missing Doxygen parameter description 31fa0fe Addressing review comments 2a0927c Merge pull request diffblue#1643 from NathanJPhillips/bugfix/string-solver-function-type-mistake 229d1ee Merge pull request diffblue#1977 from romainbrenguier/bugfix/string-equals-TG-2179 97a6713 Merge pull request diffblue#1816 from NathanJPhillips/feature/overlay-methods e2d4b09 Updates for merge 674c9f0 Activate tests for StringBuffer concat in loops a997ffb Add verification test for String.equals 1ab596d Merge commit '3b8120f3a8c9ed3a343493a44ac454ae265946c1' into develop f7602af Merge commit 'bb88574aaa4043f0ebf0ad6881ccaaeb1f0413ff' into merge-develop-20180327 55d36b5 Fix code for String.equals baf33f8 Added regression tests 9984fc1 Add ability to overlay classes with new definitions of existing methods 66c529c Tidied up java_class_loader_limitt 397c14e Correcting typos and adding documentation to unit tests 54f1c54 Adding checks for the super class of the generated class df895d3 Temp checkin for checking components 44a5dcb Adding check for inheritance 28bfc37 Amending path to reflect new location e27151b Correcting typo in the scenario name d7356be Modified behaviour to find function calls ac33761 Use raw strings to avoid unnecessary escaping 4201db9 Adding tests for static lambdas f9adaa6 Adding tests for member methods 5f5994b Pull out the logic for getting the inital assignment 7f843a7 Add natural language explanation of the test's checks fa27117 Introduce tests for lambdas that are member variables 8999cf6 Added utiltity for getting this member components 3e0e12e Adding tests for the other two returning 
lambdas that don't capture f3ddee6 Adding tests to verify the return of the lambda wrapper method db6756e Adding checks for parameters of the called function d2ed92b Adding test for lambda taking array parameters d171f64 Swap finding variable values to use regex 99c21ed Extended find pointer assignment to take a regex 5610aca Added unit test for lambda assigned 7b0cee1 Refactored test method to allow reuse bea730d Adding utility for verifying a set of statements contains a function call ee2179c Introduce checks the the function body for Execute calls the correct lambda 2348d10 Adding unit test for checking local lambda conversion 46fa176 Extending require utilities to be used in test 6df8d6b Extended require_goto_statements to provide meaningful errors 39282a6 Add debug information for working directory 770eb2a Remove methods without a implementation or usage 5cbb758 Merge pull request diffblue#1937 from svorenova/lambda_tg2711 8cfd9bf Merge pull request diffblue#1968 from smowton/smowton/cleanup/remove-exceptions-clarity 6ca3272 Merge pull request diffblue#1973 from karkhaz/kk-c++2a-fixes 212da75 Making members of a test utility class non-const d57fe53 Renaming the folder with lambda unit tests 823f2a7 Adding a unit test for lambda method handles in class symbol 8b172b4 Adding a utility function for lambda method handles in struct 844bb20 Pulling out a utility function to a separate file 3585f73 Updating unit tests for parsing lambda method table 078dc0f Updating a utility function a8ac3d4 Pass lambda method handles to method instruction conversion bf04c93 Add lambda method handles during class conversion a518393 Introduce lambda method handles in java class type 37afd9a Store the full method reference of lambda method handles 0a49697 Store bootstrap method index for invokedynamic instructions 56c9c02 Add test for string literals length af958f0 Correct length field of string literals a190534 Remove-exceptions: make lambda types explicit b052a4d Fix warnings 
emitted by C++2a compiler 20eaf21 Fix type of call to forName git-subtree-dir: cbmc git-subtree-split: 7202003985a99fb6563cf4d0fb8e7f2c727cc040
This makes the interpretation of arrays coming from the underlying solver more uniform: they are always converted to interval_sparse_arrayt and then accessed using at or concretize.
Using sparse arrays can greatly reduce memory consumption in the case of long strings.
We now set a hard limit on the size of strings that are concretized by the solver (for instance to build a trace).
When this limit is exceeded, CBMC aborts with a message hinting to set --string-max-input-length to a reasonable value.
The ultimate goal would be to get rid of --string-max-length and only rely on --string-max-input-length.