Skip to content

Optimize remove_level_2 #3559

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

romainbrenguier
Copy link
Contributor

Instead of rebuilding the identifier from scratch when we remove the level 2 of the ssa_exprt, we can just strip the suffix, which is more efficient. On benchmarks I ran locally (70 methods from tika) this has improved the execution time of symex by about 8% (from 77.2 sec to 70.8).

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • [na] The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • [na] White-space or formatting changes outside the feature-related changed lines are in commits of their own.

Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this would break any work towards field sensitivity, which build_ssa_identifier_rec takes care of.

@romainbrenguier
Copy link
Contributor Author

@tautschnig

No, this would break any work towards field sensitivity, which build_ssa_identifier_rec takes care of.

What is field sensitivity?

@tautschnig
Copy link
Collaborator

What is field sensitivity?

Cf. #2574: we would treat individual members of structs or arrays as separate entities in the SSA encoding for the purpose of field-level constant propagation and smaller SAT encodings. Especially for Java it should be highly beneficial. @smowton has poked me about it several times...

@romainbrenguier
Copy link
Contributor Author

@tautschnig

Cf. #2574: we would treat individual members of structs or arrays as separate entities in the SSA encoding for the purpose of field-level constant propagation and smaller SAT encodings. Especially for Java it should be highly beneficial. @smowton has poked me about it several times...

It sounds interesting but I'm not sure this would be affected by this PR. I will investigate another way to improve how ssa_exprt's are used in symex.

@tautschnig
Copy link
Collaborator

It sounds interesting but I'm not sure this would be affected by this PR.

The code in build_ssa_identifier_rec (which update_identifier invokes) would build identifiers like a!0@1#2.some_member for a struct member expression a.some_member. The change that this PR introduces would turn such an expression into a!0@1 upon a call to remove_level_2, when the correct result is [email protected]_member.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 4935a90).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/94340800

@romainbrenguier
Copy link
Contributor Author

@tautschnig I've changed my approach to optimize the function, this version should be cleaner. I've also added a couple of commits to make the interface of these classes clearer.

@@ -82,7 +82,7 @@ bool ssa_exprt::can_build_identifier(const exprt &expr)
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message typo: "and probably shouldn't." -> "and probably shouldn't be part of it."

@smowton
Copy link
Contributor

smowton commented Dec 12, 2018

Unless I'm misreading this changes very little? Previously inline functions aren't anymore, a couple of functions are static rather than members, and a single "if-is-ssa-expr" test is eliminated? Not objecting, just checking if something has been missed out in shuffling the commits; it currently looks like a cleanup with perhaps 1% performance impact rather than an optimisation.

@romainbrenguier
Copy link
Contributor Author

@smowton

Unless I'm misreading this changes very little? Previously inline functions aren't anymore, a couple of functions are static rather than members, and a single "if-is-ssa-expr" test is eliminated? Not objecting, just checking if something has been missed out in shuffling the commits; it currently looks like a cleanup with perhaps 1% performance impact rather than an optimisation.

This is mostly true. The essential change is the remove_level_2 simplification. I would expect more than 1% improvement; I will run some benchmarks to confirm that.

@romainbrenguier
Copy link
Contributor Author

I would expect more than 1% improvement; I will run some benchmarks to confirm that.

Yes, a get more than 5% improvement on my benchmarks.

@smowton
Copy link
Contributor

smowton commented Dec 12, 2018

Just from removing that check? Impressive!

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: dfd511e).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/94495991

@romainbrenguier
Copy link
Contributor Author

@smowton

Just from removing that check? Impressive!

It's more than a check because update_identifier is rebuilding the identifier strings from scratch, and it's called very frequently during symex.

@romainbrenguier
Copy link
Contributor Author

@tautschnig are you happy with the changes?

Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, looks good to me except there may be further room for improvement by dropping one change as noted below.

@@ -114,5 +114,5 @@ void ssa_exprt::update_identifier()
void ssa_exprt::remove_level_2()
{
remove(ID_L2);
update_identifier();
set_identifier(get_l1_object_identifier());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change: shouldn't you move it back to the header file as now everything can be inlined?

This is not used outside of the ssa_exprt class and shouldn't
be part of the public interface.
This is only used in one function.
For ssa_exprt's this always remove_level_2
To reset the identifier we can just used the l1_object_identifier which
is a cached version of the level1 identifier.
@romainbrenguier
Copy link
Contributor Author

The CI failure is caused by doxygen not being able to be downloaded and installed

@romainbrenguier romainbrenguier merged commit 58d440a into diffblue:develop Dec 14, 2018
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 7114c57).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/94754125

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants