
[TG-2721] Use sparse arrays in string_refinementt::get #2009

Merged

romainbrenguier
Contributor

This makes the interpretation of arrays coming from the underlying solver more uniform: they are always converted to interval_sparse_arrayt and then accessed using at or concretize.
Using sparse arrays can greatly reduce memory consumption in the case of long strings.
We now set a hard limit on the size of strings that are concretized by the solver (for instance to build a trace).
When this limit is exceeded, CBMC aborts with a message suggesting that --string-max-input-length be set to a reasonable value.
The ultimate goal would be to get rid of --string-max-length and rely only on --string-max-input-length.
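
To make the description above concrete, here is a minimal sketch of an interval-based sparse array with an `at` lookup and a size-limited `concretize`. It is not the code from this PR: the type name, the use of `char` values instead of `exprt`, and the `max_size` parameter are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for interval_sparse_arrayt, holding char values
// instead of exprt. An entry (k, v) means: every index after the previous
// key, up to and including k, holds v. Indices past the last key hold
// default_value.
struct interval_sparse_array_sketch
{
  std::map<std::size_t, char> entries;
  char default_value;

  // Value at `index`: the first entry whose key is >= index, else the default.
  char at(std::size_t index) const
  {
    const auto it = entries.lower_bound(index);
    return it == entries.end() ? default_value : it->second;
  }

  // Expand to a dense string of length `size`, refusing to build anything
  // longer than `max_size` (an assumed parameter standing in for the hard
  // limit mentioned in the description).
  std::string concretize(std::size_t size, std::size_t max_size) const
  {
    if(size > max_size)
      throw std::runtime_error(
        "string too long to concretize; "
        "consider reducing string-max-input-length");
    std::string result;
    result.reserve(size);
    for(const auto &entry : entries)
    {
      while(result.size() <= entry.first && result.size() < size)
        result.push_back(entry.second);
    }
    result.resize(size, default_value); // pad the tail with the default
    assert(result.size() == size);
    return result;
  }
};
```

With entries `{2: 'a', 4: 'b'}` and default `'x'`, only two entries are stored however long the string is, and the dense form is built only on request.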

@romainbrenguier romainbrenguier changed the title from "Use sparse arrays in string_refinementt::get" to "[TG-2721] Use sparse arrays in string_refinementt::get" Apr 4, 2018
@romainbrenguier romainbrenguier force-pushed the solvers/sparse-arrays-in-get branch from 2278fa3 to 189e161 Compare April 4, 2018 12:36
@romainbrenguier romainbrenguier requested review from allredj and LAJW April 4, 2018 12:37
@romainbrenguier romainbrenguier force-pushed the solvers/sparse-arrays-in-get branch from 639dcce to 4f2748e Compare April 5, 2018 18:42
Contributor

@LAJW LAJW left a comment

I'm so happy to see the concretization and plus_with_overflow go!

I've just got a couple suggestions, not even nitpicks, the code looks really good as it is.

if(entries.back().second == expr.operands()[i])
  entries.back().first = i;
else if(entries.back().second.id() == ID_unknown)
  entries.back() = std::make_pair(i, expr.operands()[i]);
Contributor

There's a more concise way to write this:

entries.back() = { i, expr.operands()[i] };
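
As a small illustration of the suggestion above, the two spellings below assign the same pair. The `entryt` alias and the `int` values are hypothetical stand-ins for the `exprt` operands used in the reviewed code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical entry type: int values instead of exprt operands.
using entryt = std::pair<std::size_t, int>;

// Overwrite the last entry using std::make_pair, as in the reviewed code.
void set_last_with_make_pair(
  std::vector<entryt> &entries, std::size_t i, int value)
{
  assert(!entries.empty());
  entries.back() = std::make_pair(i, value);
}

// Same effect with a braced initializer list, as the reviewer suggests.
void set_last_with_braces(
  std::vector<entryt> &entries, std::size_t i, int value)
{
  assert(!entries.empty());
  entries.back() = {i, value};
}
```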

for(const auto &pair : entries)
{
  if(pair.first >= index)
    return pair.second;
Contributor

If it's worth it, you could rewrite entries as a std::map and then use this function to do the lookup in O(log n):
http://www.cplusplus.com/reference/map/map/upper_bound/
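
A sketch of what the map-based lookup might look like, next to the linear scan it would replace. The function names and `int` values are assumptions. One detail worth noting: the reviewed loop returns the first entry with `pair.first >= index`, which corresponds to `std::map::lower_bound` (first key not less than the argument), whereas `upper_bound` returns the first key strictly greater than it.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Linear scan over sorted (last_index, value) pairs, as in the reviewed loop.
int at_linear(
  const std::vector<std::pair<std::size_t, int>> &entries,
  std::size_t index,
  int default_value)
{
  for(const auto &pair : entries)
  {
    if(pair.first >= index)
      return pair.second;
  }
  return default_value;
}

// Logarithmic lookup: lower_bound finds the first key >= index.
int at_logarithmic(
  const std::map<std::size_t, int> &entries,
  std::size_t index,
  int default_value)
{
  const auto it = entries.lower_bound(index);
  return it == entries.end() ? default_value : it->second;
}
```

A map also keeps its keys ordered, so no separate sorting step is needed before the lookup.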


exprt return_code = from_integer(0, get_return_code_type());

if(intermediary_strings.size() == 0)
Contributor

if(intermediary_strings.empty())?

{
  for(; current_index <= pair.first && current_index < size; ++current_index)
    array.operands().push_back(pair.second);
}
Contributor

I think you could remove current_index and add an if statement in this loop:

if (array.operands().size() < size)
  return array;

This should make this function a little bit more obvious.

msg << "consider reducing string-max-input-length so that no string "
    << "exceeds " << MAX_CONCRETE_STRING_SIZE << " in length and make sure"
    << " all functions returning strings are available in the classpath";
throw msg.str();
Contributor

I'm not a fan of throwing strings. How about std::runtime_error(string) or std::invalid_argument(string), if it's not too outlandish of course?
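
For illustration, a sketch of the same check throwing a standard exception type rather than a bare string. The function name and the limit value are hypothetical; the actual value of `MAX_CONCRETE_STRING_SIZE` is not shown in this conversation.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical limit, chosen only for this example.
constexpr std::size_t max_concrete_string_size_sketch = 1000;

// Throws std::runtime_error if `length` exceeds the limit.
void check_concretizable_sketch(std::size_t length)
{
  if(length <= max_concrete_string_size_sketch)
    return;
  std::ostringstream msg;
  msg << "consider reducing string-max-input-length so that no string "
      << "exceeds " << max_concrete_string_size_sketch << " in length";
  throw std::runtime_error(msg.str());
}
```

Callers can then catch `const std::exception &` and read the text through `what()`, instead of needing a separate `catch(const std::string &)` handler.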

{
auto set = generator.get_created_strings();
if(set.find(arr) != set.end())
exprt length = super_get(arr.length());
Contributor

const exprt?

{
const auto &if_expr = expr_dynamic_cast<if_exprt>(current.get());
const exprt cond = get(if_expr.cond());
if(cond.is_true())
Contributor

if(get(if_expr.cond()).is_true()) - I like oneliners.

Contributor Author

We also use cond with .is_false afterwards.

Contributor

👌

@romainbrenguier romainbrenguier force-pushed the solvers/sparse-arrays-in-get branch 4 times, most recently from 6b60801 to ea88a83 Compare April 9, 2018 15:15
@@ -124,7 +124,19 @@ class interval_sparse_arrayt final : public sparse_arrayt
/// `extra_value`.
interval_sparse_arrayt(const array_exprt &expr, const exprt &extra_value);

/// Initialize an sparse array from an array represented by a list of
Member

a sparse array

@@ -124,7 +124,19 @@ class interval_sparse_arrayt final : public sparse_arrayt
/// `extra_value`.
interval_sparse_arrayt(const array_exprt &expr, const exprt &extra_value);

/// Initialize an sparse array from an array represented by a list of
/// index-value pairs, and setting the default to `extra_value`.
/// Indexes must be constant expressions, and negative indexes are ignored.
Member

When can negative indices occur?

return interval_sparse_arrayt(*array_expr, extra_value);
if(const auto &with_expr = expr_try_dynamic_cast<with_exprt>(expr))
return interval_sparse_arrayt(*with_expr);
if(expr.id() == "array-list")
Member

A proper irep_id should be added for this

Contributor Author

If that's OK, I'd prefer to delay that to another PR (prepared here: romainbrenguier#7), because it will require other reviewers, as it touches other parts of CBMC as well.

@@ -648,20 +648,24 @@ decision_proceduret::resultt string_refinementt::dec_solve()
output_equations(debug(), equations, ns);
#endif

// Dependencies is also used by get, so we have to use it as a class member
Member

dependencies?

Member

comment for commit message: more constraints added and decreased performance

return {};
std::ostringstream msg;
msg << "consider reducing string-max-input-length so that no string "
<< "exceeds " << MAX_CONCRETE_STRING_SIZE << " in length and make sure"
Member

std::to_string should be sufficient

Contributor Author

Yes, but I don't see the advantage. I think streams are nicer in this kind of situation.
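
For comparison, the two styles under discussion can be sketched side by side. The function names and the shortened message are hypothetical; both build the same text, so the choice is mostly one of taste.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Stream-based formatting, as used in the PR.
std::string message_with_stream(std::size_t limit)
{
  std::ostringstream msg;
  msg << "no string exceeds " << limit << " in length";
  return msg.str();
}

// Concatenation with std::to_string, as the reviewer suggests.
std::string message_with_to_string(std::size_t limit)
{
  return "no string exceeds " + std::to_string(limit) + " in length";
}
```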

@@ -0,0 +1,11 @@
CORE
Test.class
--refine-strings --function Test.check --string-max-input-length 2000000
Member

How long do these tests take?

Contributor Author

On my computer 0.7s for the first one and 0.1s for the second.

@romainbrenguier romainbrenguier force-pushed the solvers/sparse-arrays-in-get branch from ea88a83 to 2ea33db Compare April 10, 2018 07:10
This avoids having to explicitly sort the entries, and makes looking up
an element more efficient: logarithmic instead of linear.

If the content is kept between calls to dec_solve, some nodes can be
duplicated in the graph, which leads to more constraints being added and
decreased performance.

We reduce the number of constraints by avoiding the copy of the last string.

This initializes a sparse array from the array expressions we know how to
deal with.

Detects when an overflow happens in the sum of two integers.
This will be used in builtin functions for dealing with the overflow case.
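
The overflow detection described in the last message above can be sketched on concrete integers as follows. This is an illustration under assumptions only: the solver presumably builds a symbolic expression rather than evaluating C++ `int`s, and both function names here are hypothetical. The truncating helper reflects one reading of the "Truncate string concatenations in case of overflow" commit title.

```cpp
#include <cassert>
#include <limits>

// True if a + b does not fit in an int. The check is done without
// performing the addition, since signed overflow is undefined in C++.
bool sum_overflows_sketch(int a, int b)
{
  if(b > 0)
    return a > std::numeric_limits<int>::max() - b;
  if(b < 0)
    return a < std::numeric_limits<int>::min() - b;
  return false;
}

// Sum of two non-negative lengths, truncated to the largest representable
// value when the exact sum would overflow.
int truncated_length_sum_sketch(int a, int b)
{
  assert(a >= 0 && b >= 0);
  return sum_overflows_sketch(a, b) ? std::numeric_limits<int>::max() : a + b;
}
```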

This function is adding assumptions on the values of integers which may
lead to contradiction. It is better to deal with overflows at the level
of the specification of the builtin functions instead.

The obtained expression can be exponentially smaller, because the sparse
array representation avoids repetitions.

In the case of index_expressions this can use exponentially less
memory.

This ensures that arrays from the underlying solver are interpreted in a
consistent manner in the solver (always using interval_sparse_arrayt).

This makes sure the way we interpret arrays is consistent even in
debugging functions.

This will allow us to remove fill_in_array_expr, which duplicates what
concretize does.

This is now unnecessary as get_array takes care of the concretization
using sparse arrays.

This allows us to delete concretize_arrays_in_expression, which is now
unused.

Equivalent behavior to this function can be obtained using sparse arrays.

We should use sparse arrays as in the solver.
@romainbrenguier romainbrenguier force-pushed the solvers/sparse-arrays-in-get branch from 2ea33db to 64b336a Compare April 10, 2018 07:25
@romainbrenguier romainbrenguier merged commit 768e8a6 into diffblue:develop Apr 10, 2018
smowton pushed a commit to smowton/cbmc that referenced this pull request May 9, 2018
7202003 Merge branch 'develop' of github.com:diffblue/cbmc into CBMC_subtree_2018-04-10
768e8a6 Merge pull request diffblue#2009 from romainbrenguier/solvers/sparse-arrays-in-get
69f6e7b Merge pull request diffblue#2032 from tautschnig/replace-rename-performance
ec43b00 Merge pull request diffblue#2026 from tautschnig/sat-clean
64b336a Refactor interval_sparse_array::concretize
5392645 Reserve size of array in concretize
8277e92 Stop using concretize_array_expr in unit tests
1a08772 Tests where model involves long strings
a6c4010 Stop using concretize_arrays_in_expression
6d87233 Use concretize instead of fill_in_array
052d503 Use get_array in get_char_array_and_concretize
8ed138b Remove unused header
c58ac60 Avoid copy in ranged for loops over expressions
beff419 Use sparse arrays in get_array
9b286d9 Use sparse arrays in string_refinement::get
037f631 Refactor string_refinementt::get
dfc584a Add concretize function for interval_sparse_array
cb10550 Use sparse arrays in substitute_array_access
ce4c008 Remove plus_exprt_with_overflow_check
4779b25 Get rid of calls to plus_exprt_with_overflow
c40b836 Truncate string concatenations in case of overflow
c52c813 Add a sum_overflows function
0d03591 Add an `at` function for access in sparse arrays
848dd95 Add an interval_sparse_array::of_expr function
5f07bf0 Initialization of sparse array from array-list
5b4d618 Reduce number of constraints in format
ede2fa1 Clear string_dependencies in calls to dec_solve
d483c81 Initialize sparse array from array_exprt
a914153 Use map instead of vector for sparse array entries
20d2445 Remove unused rename(expr, old_id, new_id)
430d402 Use exprt::depth_iterator in rename_symbolt
54e5b85 Use a const expr to avoid unnecessary detach
6f71ff6 replace_symbolt: stop early if there is nothing to replace with
1dbb162 Clean locally built SAT solver objects
28c076b Merge pull request diffblue#2013 from LAJW/lajw/java-no-load-class
e6a9127 Merge pull request diffblue#2020 from tautschnig/sat-cleanup
70741ff Remove support for Precosat
2aa81eb Remove support for SMVSAT
a0fd3f7 Remove support for Limmat as a SAT solver
28cca9c Remove unused DIMACS parser
1d81306 Merge pull request diffblue#2015 from tautschnig/fix-smt2_solver-clean
392144d Makefiles: Place .d suffix used for dependencies in DEPEXT variable
839d32a smt2_solver.{o,d} should be removed by "make clean"
48e427a Merge pull request diffblue#1979 from romainbrenguier/regression/fix-indexOf-test
c2f3726 Merge pull request diffblue#1976 from romainbrenguier/regression/activate-dependency-tests
42ecfa2 Add --java-no-load-class option
69fb74a Merge pull request diffblue#1995 from tautschnig/byte-update-soundness
aa766ae Set string-max-length in indexOf test
988b818 Merge pull request diffblue#1990 from tautschnig/missing-header
8300147 Abort on byte_update(pointer)
4a8d9b4 Include missing header
a695814 Merge pull request diffblue#1986 from thk123/revert/1816/overlay-classes
9933b58 Revert "Merge pull request diffblue#1816 from NathanJPhillips/feature/overlay-methods"
cd9b839 Revert "Merge pull request diffblue#1982 from NathanJPhillips/bugfix/load-object-once"
58beeb4 Merge pull request diffblue#1978 from svorenova/lambda_tg2478_cont
1e3a9cd Merge pull request diffblue#1980 from NathanJPhillips/tests/irept
772b603 Merge pull request diffblue#1982 from NathanJPhillips/bugfix/load-object-once
0902ae7 Tests to demonstrate expected sharing behaviour of irept
2239ec1 Unit test of irept's public API
81ac259 Prevent test running on symex-driven lazy loading
eae194f Allow incorrect paths for jar files on the classpath without crashing
6749fd0 Tests to ensure invalid values in the classpath are ignored
3c95aa1 Prevent attempting to load any class more than once
c520331 Unit test to show java.lang.Object can be loaded from an explicit model
05a7b4a Merge pull request diffblue#1981 from smowton/smowton/cleanup/missing-docstring
e4dc2aa Merge pull request diffblue#1946 from thk123/feature/TG-1811/unit-tests-for-invokedynamic-static-lambda
2b53d3b Merge pull request diffblue#1983 from svorenova/lambda_tg2478_fix
f428247 Tidying up for lambdas
f3b1379 Merge pull request diffblue#1975 from romainbrenguier/bugfix/literal-length-TG2878
7d4441d Regression test for lambda in a package
7366fda Format class name for lambda method handles
7db42c0 Add missing Doxygen parameter description
31fa0fe Addressing review comments
2a0927c Merge pull request diffblue#1643 from NathanJPhillips/bugfix/string-solver-function-type-mistake
229d1ee Merge pull request diffblue#1977 from romainbrenguier/bugfix/string-equals-TG-2179
97a6713 Merge pull request diffblue#1816 from NathanJPhillips/feature/overlay-methods
e2d4b09 Updates for merge
674c9f0 Activate tests for StringBuffer concat in loops
a997ffb Add verification test for String.equals
1ab596d Merge commit '3b8120f3a8c9ed3a343493a44ac454ae265946c1' into develop
f7602af Merge commit 'bb88574aaa4043f0ebf0ad6881ccaaeb1f0413ff' into merge-develop-20180327
55d36b5 Fix code for String.equals
baf33f8 Added regression tests
9984fc1 Add ability to overlay classes with new definitions of existing methods
66c529c Tidied up java_class_loader_limitt
397c14e Correcting typos and adding documentation to unit tests
54f1c54 Adding checks for the super class of the generated class
df895d3 Temp checkin for checking components
44a5dcb Adding check for inheritance
28bfc37 Amending path to reflect new location
e27151b Correcting typo in the scenario name
d7356be Modified behaviour to find function calls
ac33761 Use raw strings to avoid unnecessary escaping
4201db9 Adding tests for static lambdas
f9adaa6 Adding tests for member methods
5f5994b Pull out the logic for getting the inital assignment
7f843a7 Add natural language explanation of the test's checks
fa27117 Introduce tests for lambdas that are member variables
8999cf6 Added utiltity for getting this member components
3e0e12e Adding tests for the other two returning lambdas that don't capture
f3ddee6 Adding tests to verify the return of the lambda wrapper method
db6756e Adding checks for parameters of the called function
d2ed92b Adding test for lambda taking array parameters
d171f64 Swap finding variable values to use regex
99c21ed Extended find pointer assignment to take a regex
5610aca Added unit test for lambda assigned
7b0cee1 Refactored test method to allow reuse
bea730d Adding utility for verifying a set of statements contains a function call
ee2179c Introduce checks the the function body for Execute calls the correct lambda
2348d10 Adding unit test for checking local lambda conversion
46fa176 Extending require utilities to be used in test
6df8d6b Extended require_goto_statements to provide meaningful errors
39282a6 Add debug information for working directory
770eb2a Remove methods without a implementation or usage
5cbb758 Merge pull request diffblue#1937 from svorenova/lambda_tg2711
8cfd9bf Merge pull request diffblue#1968 from smowton/smowton/cleanup/remove-exceptions-clarity
6ca3272 Merge pull request diffblue#1973 from karkhaz/kk-c++2a-fixes
212da75 Making members of a test utility class non-const
d57fe53 Renaming the folder with lambda unit tests
823f2a7 Adding a unit test for lambda method handles in class symbol
8b172b4 Adding a utility function for lambda method handles in struct
844bb20 Pulling out a utility function to a separate file
3585f73 Updating unit tests for parsing lambda method table
078dc0f Updating a utility function
a8ac3d4 Pass lambda method handles to method instruction conversion
bf04c93 Add lambda method handles during class conversion
a518393 Introduce lambda method handles in java class type
37afd9a Store the full method reference of lambda method handles
0a49697 Store bootstrap method index for invokedynamic instructions
56c9c02 Add test for string literals length
af958f0 Correct length field of string literals
a190534 Remove-exceptions: make lambda types explicit
b052a4d Fix warnings emitted by C++2a compiler
20eaf21 Fix type of call to forName

git-subtree-dir: cbmc
git-subtree-split: 7202003985a99fb6563cf4d0fb8e7f2c727cc040