-
Notifications
You must be signed in to change notification settings - Fork 273
Add code_typet::get_parameter_indices #1688
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
smowton
merged 1 commit into
diffblue:develop
from
smowton:smowton/feature/parameter_indices
Jan 5, 2018
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Binary file not shown.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
|
||
public class ParameterIndicesTest { | ||
|
||
public void f(ParameterIndicesTest param1, int param2) {} | ||
|
||
public static void g(float param1, ParameterIndicesTest param2) {} | ||
|
||
} |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
/*******************************************************************\ | ||
|
||
Module: Parameter indices test | ||
|
||
Author: Diffblue Limited. All rights reserved. | ||
|
||
\*******************************************************************/ | ||
|
||
#include <testing-utils/catch.hpp> | ||
#include <testing-utils/load_java_class.h> | ||
#include <util/std_types.h> | ||
|
||
void check_consistency(const symbolt &symbol) | ||
{ | ||
const auto &code_type = to_code_type(symbol.type); | ||
auto parameter_ids = code_type.parameter_identifiers(); | ||
auto parameter_indices = code_type.parameter_indices(); | ||
|
||
REQUIRE(parameter_ids.size() == parameter_indices.size()); | ||
for(std::size_t i = 0; i < parameter_ids.size(); ++i) | ||
REQUIRE(parameter_indices.at(parameter_ids.at(i)) == i); | ||
} | ||
|
||
TEST_CASE("Parmeter indices consistency", "[core][util][parameter_indices]") | ||
{ | ||
symbol_tablet symbol_table = load_java_class("ParameterIndicesTest", "util/"); | ||
check_consistency( | ||
symbol_table.lookup_ref( | ||
"java::ParameterIndicesTest.f:(LParameterIndicesTest;I)V")); | ||
check_consistency( | ||
symbol_table.lookup_ref( | ||
"java::ParameterIndicesTest.g:(FLParameterIndicesTest;)V")); | ||
} |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid this might not work as nicely as intended: hash values of
dstringt
(whichirep_idt
by default is) aren't uniformly distributed across 2^32 or 2^64. Given that the number of parameters is typically fairly small, chances are that all your elements end up in one bucket despite all the effort.If you have some time to spare, it would be really interesting if you could do some debug runs where you log the distribution across buckets that you get to see in this bit of code (at the end of the function).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't run the test, but I note that the hash value of
dstringt
is just its number, and thestd::hash
specialisations shipped with libstdc++ for all primitive types of size <= sizeof(size_t) simply cast to size_t. The authors of the standard library will surely have anticipatedstd::unordered_map
keyed by densely packed integers, so dstring seems likely to perform just as well.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The specific relevant excerpt from
bits/functional_hash.h
:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd love to believe this, had I not done the painful experiments behind the write-up at
cbmc/src/util/irep_hash.h
Line 95 in fabc99e
So maybe you are right, maybe you aren't. The only thing I'd truly trust is data across different platforms.
Note that all this isn't a big deal here for we are likely not dealing with big data sets and probably aren't on a hot path. Just saying this as data must prevail over instincts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks like the opposite problem to me -- lots of the irep's hash is in high bits, which get ignored when picking a bucket, whereas for dense integers there's lots of useful information in low bits. I believe https://tessil.github.io/2016/08/29/benchmark-hopscotch-map.html for example is testing this exact case with
unordered_map
, though tests withlibc++
and MSVC's STL implementation would be good to see too.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be very clear: this is a purely academic discussion here, given the anticipated small data and rare use.
I do agree that you've got the opposite case, but that would just imply that you observe the opposite performance penalty between GNU STL vs. MSVC. The comparison you pointed to is a good read, but is of course on completely artificial data. It is important to contrast this with the profile of actual data to be observed at the point of use. In your specific case, I'd expect many uses might have just 2 or 3 buckets. The parameter identifiers will likely have been created in immediate succession. Thus, for the 2-parameter case we'd have an even and an odd number, and you'll be doing great with MSVC. But really you'll not even be able to observe a difference over using a map that does not use hashing, or using whatever the worst performing hash map for the given hash function.
My view is: keep things simple where possible, else optimise as far as possible and back that up by data.