-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Fix #2185: Add bytecode idempotency checks. #2272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -14,7 +14,7 @@ import Decorators._ | |||
class TastyPickler { | |||
|
|||
private val sections = new mutable.ArrayBuffer[(NameRef, TastyBuffer)] | |||
val uuid = UUID.randomUUID() | |||
private val uuid = UUID.fromString("3cee1b79-c03a-4125-b337-d067b5cb3a94") // TODO: use a hash of the tasty tree |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer not to merge this change as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DarkDimius, you talked about making the hash for it. Did you have something specific in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DarkDimius I changes the tasty UUID to a hash of the nameBuffer
and sections
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This idea looks good to me.
val blacklisted = Set( | ||
"pos/Map/scala/collection/immutable/Map", | ||
"pos/Map/scala/collection/immutable/AbstractMap", | ||
"pos/t1203a/NodeSeq" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there a blacklist? If these files fail the test, shoudn't issues be filed for them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There will. I just haven't had time to do it yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added them to #2274.
compileFilesInDir("../tests/pos", opt) | ||
} | ||
def idempotency2() = { | ||
compileList("dotty1", compilerSources ++ backendSources ++ backendJvmSources, opt) + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be interesting to combine this with a bootstrap test: Compile the compiler, then compile the compiler with the compiled compiler, and check equality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That will be interesting, but I would add it as a second step in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
5b1dd94
to
6e3b580
Compare
* @see [[http://code.google.com/p/smhasher]] | ||
*/ | ||
object MurmurLongHash3 extends MurmurLongHash3 { | ||
final val arraySeed = 0x3c074a61 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are seeds ints?
* objects. | ||
* | ||
* This algorithm is designed to generate well-distributed non-cryptographic | ||
* hashes. It is designed to hash data in 32 bit chunks (ints). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
64 now, right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
|
||
import java.lang.Long.{ rotateLeft => rotl } | ||
|
||
private[util] class MurmurLongHash3 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I think this should be contributed upstream to scala-library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. But I need to reimplement the algorithm from the source. @DarkDimius showed me some issues with my simple port. I will try to make it compatible MurmurHash3
to be able to just add it inside in the future.
As discussed in person, using murmur3 to emmit 64-bit hashes is not evidenced to be a good hash. I propose we don't use murmur family for this one, and instead use FarmHash / CityHash, which have been open-sourced by google and are used to hash entries in maps(our use-case). |
Interesting, why? |
37ae02d
to
9f57381
Compare
Updated hashing |
@@ -0,0 +1,186 @@ | |||
package dotty.tools.dotc.util | |||
|
|||
/* Ported from https://github.com/google/cityhash/blob/master/src/city.cc */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should preserve the copyright/license header of the original file, and you should update AUTHORS.md to note this.
|
||
private final def fetch64(idx: Int)(implicit data: Array[Byte]): Long = { | ||
var x: Long = data(idx) | ||
x = data(idx + 1) | (x << 8) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Careful with sign extension (because Byte
is signed). If I recall correctly, you need to add& 0xff
everywhere you convert from a Byte
to an Int
to avoid that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should add some unit tests for this class, to make sure it behaves like the reference implementation.
9f57381
to
eb4be1f
Compare
@DarkDimius the hash function was updated. |
While real cryptographic hashes are expected to remain "good" if you take a subsection of it, you don't have this guarantee for non-cryptograhic hashes in general. That's why I've proposed FarmHash / CityHash that have native 64-bit variants. @nicolasstucki, I've lost context, we did we decide to use Elf hashing instead of CityHash? |
The context for using Elf hashing instead of CityHash was because the implementation is trivial and it is simpler to be sure that it is correctly implemented. But, if we want to use CityHash we can just revert 426a33a. |
eb4be1f
to
000b432
Compare
@DarkDimius - is this good to go now? |
Add
bytecodeIdemporency
to compilation tests. This tests compiles any compilation test twice and compares the resulting bytecodes (currentlydotty-compiler
andtests/pos
).