Refactor and optimise size to have approximate bound #5914
Merged
3 commits
@@ -24,16 +24,28 @@ Author: Daniel Kroening, [email protected]
 
 #include <stack>
 
-std::size_t exprt::size() const
+/// Returns the size of the exprt added to count without searching significantly
+/// beyond the supplied limit.
+std::size_t exprt::bounded_size(std::size_t count, std::size_t limit) const
 {
-  // Initial size of 1 to count self.
-  std::size_t size = 1;
-  for(const auto &op : operands())
+  const auto &ops = operands();
+  count += ops.size();
+  for(const auto &op : ops)
   {
-    size += op.size();
+    if(count >= limit)
+    {
+      return count;
+    }
+    count = op.bounded_size(count, limit);
   }
+  return count;
+}
 
-  return size;
+/// Returns the size of the exprt without significantly searching beyond the
+/// supplied limit.
+std::size_t exprt::bounded_size(std::size_t limit) const
+{
+  return bounded_size(1, limit);
 }
 
 /// Return whether the expression is a constant.
In the case of structural sharing this is (unnecessarily) exponential. Maybe use `visit`?

Wouldn't `visit` always do a full traversal, thus removing this optimisation of only doing a partial traversal?

@TGWDB Side note, please split this commit into the correctly titled optimisation and bug fix commits. I think the current title may already have misled Martin. So let's get it sorted.

@martin-cs I'm afraid I don't fully understand your comment. In particular I don't understand the terminology "structural sharing" here. There are a few things that have come up as possible, and some comments addressing them below.

- `ops` is potentially copied. I've pushed a commit to make it clear that `ops` is never changed and so there should be no issues related to copying of `ops`.
- The `size()` call is constant-time for this datatype, and then the (depth-first) walk of the exprt is linear.
- The size of `operands` is added, thus capturing more of the true size faster and so reducing function calls. Secondly, the bounded nature means this should (particularly when called on large structures) return without traversing the entire exprt.
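
To make the early-return claim above concrete, here is a minimal sketch on a toy tree. The `node` type, the `make_tree` helper and the `calls` counter are hypothetical stand-ins introduced only for this illustration; they are not CPROVER types. The two `bounded_size` overloads follow the shape of the patch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for exprt: a node that owns its operands.
struct node
{
  std::vector<node> operands;
};

// Number of calls made to bounded_size, kept for illustration only.
static std::size_t calls = 0;

// Same shape as the patch: add the operand count up front, then recurse
// only while the running count is still below the limit.
std::size_t bounded_size(const node &n, std::size_t count, std::size_t limit)
{
  ++calls;
  const auto &ops = n.operands;
  count += ops.size();
  for(const auto &op : ops)
  {
    if(count >= limit)
      return count;
    count = bounded_size(op, count, limit);
  }
  return count;
}

std::size_t bounded_size(const node &n, std::size_t limit)
{
  return bounded_size(n, 1, limit);
}

// Builds a complete binary tree of the given depth (depth 0 is a leaf).
node make_tree(unsigned depth)
{
  node n;
  if(depth > 0)
  {
    n.operands.push_back(make_tree(depth - 1));
    n.operands.push_back(make_tree(depth - 1));
  }
  return n;
}
```

On a tree built with `make_tree(10)` (2047 nodes), `bounded_size(tree, 10)` returns 11 after only 5 calls, whereas a limit larger than the tree makes it walk all 2047 nodes. Note that this toy tree has no sharing, so it says nothing about the cost on shared sub-expressions.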
@TGWDB CPROVER shares `irept`s where possible. This can result in exponential compression. Consider the following expression:

CPROVER will store a total of 5 irepts. Four of them will have the same irept appearing twice in the `sub`.

A naive traversal of the form that you had written will visit `a` 16 times rather than just once. As the sharing gets larger so the time taken becomes exponential in the size of the data structure rather than linear. This happens. It is A Problem. In the best cases the sharing is linked enough that the pass grinds to a halt and it is obvious that the code needs to be fixed. Worse is when it is an issue but no-one notices the 8x slowdown on the average case of one pass. Then at some point we feed in code and ... it just times out somewhere random and without careful and detailed profiling (which you almost certainly can't do with customer code) you will not find the problem.

I say this from bitter personal experience. @tautschnig and @peterschrammel have both done great work on making things sharing aware and sharing conscious.

So, if you want to traverse an expression for an unlimited amount PLEASE do not write for loops and recursion. Use a work queue and a set for what you have seen. Or use the visitor; that is what it is there for.

Also, in the specific use-case you want, I think you probably want the number of unique exprts, you don't want to count duplicates.
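
The sharing effect described above can be reproduced with a small sketch. Everything here is an assumption made for illustration: `node` is a toy stand-in for a shared `irept` (not the real class), and `make_shared_chain`, `naive_size` and `unique_size` are hypothetical helpers. The chain built with 4 levels has 5 distinct nodes, each non-leaf holding the same child twice.

```cpp
#include <cstddef>
#include <memory>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a shared irept: operands are held by shared
// pointer, so the same sub-expression can appear more than once.
struct node
{
  std::vector<std::shared_ptr<const node>> sub;
};
using node_ptr = std::shared_ptr<const node>;

// Builds a chain in which each level holds the level below it twice.
node_ptr make_shared_chain(unsigned levels)
{
  node_ptr current = std::make_shared<node>();
  for(unsigned i = 0; i < levels; ++i)
  {
    node parent;
    parent.sub = {current, current};
    current = std::make_shared<node>(std::move(parent));
  }
  return current;
}

// Naive recursion: follows every path, so the work doubles per level.
std::size_t naive_size(const node &n)
{
  std::size_t size = 1;
  for(const auto &op : n.sub)
    size += naive_size(*op);
  return size;
}

// Sharing-aware count: a work list plus a set of nodes already seen,
// stopping once `limit` distinct nodes have been found.
std::size_t unique_size(const node &root, std::size_t limit)
{
  std::unordered_set<const node *> seen;
  std::vector<const node *> work{&root};
  while(!work.empty() && seen.size() < limit)
  {
    const node *n = work.back();
    work.pop_back();
    if(!seen.insert(n).second)
      continue; // already counted via another path
    for(const auto &op : n->sub)
      work.push_back(op.get());
  }
  return seen.size();
}
```

For the 4-level chain, `naive_size` reports 31 while `unique_size` with a generous limit reports 5. The seen set means each node's operands are pushed at most once, so the work stays linear in the number of stored edges even when the chain is 30 levels deep.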
Thank you for this explanation. The details of this particular sharing issue are useful. However I have just been looking at the implementation details of `exprt::visit` and I do not see how it could solve the particular issue you describe. The implementation for this function can be found in `visit_pre_template`. It uses an `std::stack` as its work queue (it also uses a for loop which you object to). `std::stack` is a simple LIFO data structure. If the same `irept` is pushed onto the stack twice, then it will be popped twice and visited twice. It does not offer the duplicate removal of `std::set` or `std::unordered_set`. The use of the `std::stack` is still worthwhile here, because storing the working data set on the heap avoids storing it in the call stack as for a recursive implementation. Recursion could overflow the call stack when processing a sufficiently deeply nested data structure. Using the visitor is good, I just don't see how it could offer the particular benefit which you purport that it does. Was there a different visitor implementation which you were thinking of instead?
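
The point about `std::stack` can be checked with a small sketch. This is a simplified stand-in written for illustration, not the CPROVER `visit_pre_template` code; `node`, `pair_of`, `visit_pre` and `count_visits` are all hypothetical names.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <stack>
#include <vector>

// Hypothetical stand-in for a shared irept.
struct node
{
  std::vector<std::shared_ptr<const node>> sub;
};

// Returns a new node whose two operands are the same shared child.
std::shared_ptr<const node> pair_of(const std::shared_ptr<const node> &child)
{
  auto parent = std::make_shared<node>();
  parent->sub = {child, child};
  return parent;
}

// Pre-order visit driven by an explicit std::stack. The stack keeps no
// record of nodes already seen, so a shared node pushed twice is popped
// and visited twice.
void visit_pre(const node &root, const std::function<void(const node &)> &f)
{
  std::stack<const node *> work;
  work.push(&root);
  while(!work.empty())
  {
    const node *n = work.top();
    work.pop();
    f(*n);
    // Push in reverse so operands are visited left to right.
    for(auto it = n->sub.rbegin(); it != n->sub.rend(); ++it)
      work.push(it->get());
  }
}

// Counts how often `target` is visited when walking from `root`.
std::size_t count_visits(const node &root, const node &target)
{
  std::size_t visits = 0;
  visit_pre(root, [&](const node &n) {
    if(&n == &target)
      ++visits;
  });
  return visits;
}
```

With a leaf `a` wrapped four times by `pair_of`, `count_visits` from the outermost node reports 16 visits of `a`. The explicit stack removes the risk of call-stack overflow, but de-duplication would need a separate seen set.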
@thomasspriggs apologies. There used to be a visitor that used a set to handle sharing. I would consider this a bug in `exprt::visit`.

No problem. I just wanted to get to the bottom of what best practice would actually be.

Changing `exprt::visit` to not visit shared nodes like that could break existing usages of it. Say for example if it has been used for printing code, where we actually want to print the full tree. The `exprt::unique_depth_(begin/end/cbegin/cend)` iterators look to offer the functionality you were expecting. But they also seem to be unused.