Skip to content

Simplify when combining string-refinement result constraints #4957

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

JohnDumbell
Copy link
Contributor

@JohnDumbell JohnDumbell commented Jul 29, 2019

Due to the improvements in constant prop we increasingly have constraints that are trivially simplified, like:

if 5 >= 2 do x else y

Also because of the way these conditions are combined - the resulting merged if gets added to all subsequent conditions and values - we end up with expressions like this:

if 5 >= 2 then (if 5 >= 2 do x ... else y ...) else (if 5 >= 2 do x ... else y ...)

Where the size of the expression increases exponentially to the complexity of the original if statement even if the entire conditional is trivial to prove right/wrong. This could likely be worked out by the SAT solver happily enough but due to the size of the resulting expressions (I've seen ones 1.34 billion nodes deep) it never gets there within a reasonable time-frame.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@JohnDumbell JohnDumbell self-assigned this Jul 29, 2019
@JohnDumbell JohnDumbell force-pushed the jd/feature/string_refine_combine_simplify branch 4 times, most recently from 0976b08 to 40a3138 Compare July 30, 2019 09:30
@JohnDumbell JohnDumbell force-pushed the jd/feature/string_refine_combine_simplify branch from 40a3138 to c70f7ef Compare July 31, 2019 10:07
@codecov-io
Copy link

codecov-io commented Jul 31, 2019

Codecov Report

Merging #4957 into develop will decrease coverage by 0.03%.
The diff coverage is 76.56%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #4957      +/-   ##
===========================================
- Coverage    69.47%   69.44%   -0.04%     
===========================================
  Files         1310     1310              
  Lines       108763   108747      -16     
===========================================
- Hits         75566    75520      -46     
- Misses       33197    33227      +30
Impacted Files Coverage Δ
src/solvers/strings/string_builtin_function.h 77.41% <ø> (ø) ⬆️
src/solvers/strings/string_constraint_generator.h 100% <ø> (ø) ⬆️
...rs/strings/string_constraint_generator_testing.cpp 100% <100%> (ø) ⬆️
...ngs/string_constraint_generator_transformation.cpp 95.03% <100%> (-0.77%) ⬇️
...rs/strings/string_constraint_generator_indexof.cpp 100% <100%> (ø) ⬆️
...trings/string_constraint_generator_code_points.cpp 100% <100%> (ø) ⬆️
.../strings/string_concatenation_builtin_function.cpp 83.9% <100%> (-0.19%) ⬇️
src/solvers/strings/string_builtin_function.cpp 66.66% <100%> (ø) ⬆️
...ers/strings/string_constraint_generator_insert.cpp 59.25% <36.84%> (+1.42%) ⬆️
.../strings/string_constraint_generator_constants.cpp 65.51% <62.5%> (+2.18%) ⬆️
... and 16 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bb6baff...8868378. Read the comment docs.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: c70f7ef).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/121394358

merge(result2.second, std::move(result1.second));
simplify(return_code, ns);

return {return_code, std::move(result2.second)};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I suggest to use return {simplify_expr(return_code, ns), std::move(result2.second)}; at which point the return_code can be made const exprt.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

merge(result2.second, std::move(result1.second));
simplify(return_code, ns);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change should be documented

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anywhere specific?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the documentation of combine_results

@JohnDumbell JohnDumbell force-pushed the jd/feature/string_refine_combine_simplify branch from c70f7ef to 46cf072 Compare August 5, 2019 16:57
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 46cf072).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/121998505

@romainbrenguier romainbrenguier removed their assignment Aug 6, 2019
@@ -78,6 +78,7 @@ add_axioms_for_empty_string(const function_application_exprt &f)
/// \param arg: expression of type string typet
/// \param guard: condition under which `res` should be equal to arg
/// \param array_pool: pool of arrays representing strings
/// \param ns: the namespace
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Maybe it will be good to say that this argument is so that we can simplify some expressions. We could pass a function simplify_expr as argument instead to make that obvious (just an idea)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe we could have a class and have some const members (such as a namespacet)? The architecture of the string solver code makes it really painful to add new capabilities.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could basically use the string_constraint_generatort class for that (either pass it as an argument of the function or make the function a member), but we should clean-up the members of the class first: the string_constraintst field should just be an output of the methods and not a field and I think the messaget field does not belong there.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what exactly the right solution looks like, but I get the impression that making everything free functions has been taken a little too far here. It severely hampers maintainability. The PR we are looking at could have been a one-line change, but instead it's currently +87/-48. We need to make sure we find a good balance between perfectly clean code and maintainability.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the clean-up I suggest #4995, you could then replace the fresh_symbol, array_pool, and ns arguments in these functions by just a string_constraint_generatort object and then further changes like this one could be small.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll happily do that. I was tempted to do it originally while doing this, but the changes were just under the acceptable threshold for me.

@smowton
Copy link
Contributor

smowton commented Aug 13, 2019

Is this good to go?

@JohnDumbell
Copy link
Contributor Author

Not yet, I'll make the required changes now to help clean it up then should be good.

@JohnDumbell JohnDumbell force-pushed the jd/feature/string_refine_combine_simplify branch 2 times, most recently from 902389c to 326d02c Compare August 13, 2019 14:28
@JohnDumbell JohnDumbell force-pushed the jd/feature/string_refine_combine_simplify branch 3 times, most recently from e1c7a85 to dad8579 Compare August 14, 2019 11:16
@JohnDumbell
Copy link
Contributor Author

JohnDumbell commented Aug 14, 2019

I've made the changes which results in the original change being one line. Right now the clean-up commits are all in separate blocks of work to make things a little easier to see what's been done, but I'll squash them together before merging.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: dad8579).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/123219879


/// \todo The specification may not be correct for the case where the
/// string is shorter than end. An actual java program should throw an
/// exception in that case.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 this looks outdated compared to the add_axioms_for_substring(const function_application_exprt) (but not related to this PR)

const function_application_exprt &f,
array_poolt &array_pool);
public:
std::pair<exprt, string_constraintst> add_axioms_for_concat(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 at some point we should remove the add_ from these method names, since it doesn't add them but return them (doesn't need to be in this PR)

auto pair = add_axioms_for_insert(
generator.fresh_symbol, result, input1, input2, args[0], array_pool);
auto pair = generator.add_axioms_for_insert(
result, input1, input2, args[0]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this commit should be squashed in the previous one if it doesn't compile without it

@JohnDumbell JohnDumbell force-pushed the jd/feature/string_refine_combine_simplify branch from dad8579 to 8868378 Compare August 14, 2019 13:54
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 8868378).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/123237780

Also removed all arguments that can now be accessed directly via
their fields.
Due to the improvements in constant prop we increasingly have constraints
that are trivially simplified, like:

if 5 >= 2 do x else y

Also because of the way these conditions are combined - the resulting
merged if gets added to all subsequent conditions and values - we end up
with expressions like this:

if 5 >= 2 then (if 5 >= 2 do x ... else y ...) else (if 5 >= 2 do x ... else y ...)

Where the size of the expression increases exponentially to the
complexity of the original if statement even if the entire conditional
is trivial to prove right/wrong. This could likely be worked out by the
SAT solver happily enough but due to the size of the resulting
expressions (I've seen ones 1.34 billion nodes deep) it never gets there
within a reasonable time-frame.
@JohnDumbell JohnDumbell force-pushed the jd/feature/string_refine_combine_simplify branch from 8868378 to 303ec4d Compare August 29, 2019 10:47
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 303ec4d).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/125072367

@JohnDumbell JohnDumbell merged commit 73e410a into diffblue:develop Sep 3, 2019
@JohnDumbell JohnDumbell deleted the jd/feature/string_refine_combine_simplify branch September 3, 2019 11:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants