-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Fix tasty hash #4865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix tasty hash #4865
Conversation
Previously we where computing the hash for nameBuffer on an empty array.
So how come this didn't cause any test to fail so far (except the scripted test I guess)? And can we add a test for this? |
This has nothing to do with the scripted tests. I just noted that there were too many zeros in the hash. |
True, I'm trying to figure out why this implementation pjwHash is only returning |
Originaly it was a C `unsigned char`
b1a96ca
to
8db8ca1
Compare
Previously the constants where set to generate 32-bit ints.
8db8ca1
to
05ab4cf
Compare
5ab850b
to
d6898b9
Compare
val d = data(i) & 0xFFL // Interpret byte as unsigned byte | ||
h = (h << 8) + d | ||
val high = h & 0xFF00000000000000L | ||
if (high != 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the if, it's not necessary and branches should be avoided in tight loops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@smarter don't forget to finish the review on this PR |
The review on the algorithm can be separated to offload @smarter. I've taken two difference reference C implementations, ported them to 64 bits following Wikipedia and its DrDobbs source, and gotten 0x34545c16020230 as hash for EDIT: not sure we should use this hash, since the original argument didn't work as well as hoped:
EDIT 2: the "reference implementation" I dug out is https://github.com/Blaisorblade/general-purpose-hash-functions |
@@ -0,0 +1,25 @@ | |||
import dotty.tools.dotc.core.tasty.TastyHash.pjwHash64 | |||
|
|||
object Test { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would write a junit test instead, they're faster to run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Found by comparing testcases with the C version, and by then comparing the source code; both C versions use an unsigned right shift (or rather a right shift on unsigned integers). * Background: Right-shifting a signed number copies the sign bit into all the new "empty bits"; C uses the integer sign, while Java (and Scala) use a separate "unsigned right shift" operator.
FWIW I fixed a hash bug after @smarter's review — but the hashing details can be reviewed by non-Tasty experts. |
@Blaisorblade fix LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nicolasstucki Fix LGTM, pushed a nitpick, reassigning to you 😉
There were two bugs
nameBuffer
was not computedThe combination of the two bugs made the 128-bit UUID have
0
s in the first 96 bits.