Add a generic deserializer for Java/Scala 2.12 lambdas #37

retronym · 2015-05-13T06:48:43Z

Java supports serialization of lambdas by using the serialization
proxy pattern. Deserialization of a lambda uses LambdaMetafactory
to create a new anonymous subclass.

More details of the scheme are documented at:

https://docs.oracle.com/javase/8/docs/api/java/lang/invoke/SerializedLambda.html

From those docs:

> SerializedLambda has a readResolve method that looks for a
> (possibly private) static method called $deserializeLambda$
> in the capturing class, invokes that with itself as the first
> argument, and returns the result. Lambda classes implementing
> $deserializeLambda$ are responsible for validating that the
> properties of the SerializedLambda are consistent with a lambda
> actually captured by that class.

The Java compiler generates code in $deserializeLambda$ that
switches on the implementation method name and signature to locate
an invokedynamic instruction generated for the particular lambda
expression. Then, the SerializedLambda is further unpacked,
validating that this implementation method still represents the
same functional interface as it did when it was serialized.
(The source may have been recompiled in the interim.)
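
As a small illustration of the scheme described above, the following sketch round-trips a serializable Java lambda through object streams and checks that the compiler added a `$deserializeLambda$` method to the capturing class. The class name `LambdaRoundTrip`, the `SerFunction` interface, and the helper methods are hypothetical names chosen for this example; the presence of the synthetic method is what javac is expected to emit, and other compilers may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Method;
import java.util.function.Function;

class LambdaRoundTrip {
    // Hypothetical helper: a functional interface that is also Serializable,
    // so lambdas targeting it are compiled as serializable lambdas.
    interface SerFunction<A, B> extends Function<A, B>, Serializable {}

    // Serialize to a byte array and read the object back.
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value); // writes a SerializedLambda proxy
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject(); // readResolve calls $deserializeLambda$
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Looks for the synthetic method that the compiler adds to the capturing class.
    static boolean hasDeserializeHook() {
        for (Method m : LambdaRoundTrip.class.getDeclaredMethods()) {
            if (m.getName().equals("$deserializeLambda$")) return true;
        }
        return false;
    }

    @SuppressWarnings("unchecked")
    static String demo() {
        String prefix = "hello, ";
        SerFunction<String, String> f = s -> prefix + s; // captures prefix
        SerFunction<String, String> g = (SerFunction<String, String>) roundTrip(f);
        return g.apply("world");
    }

    public static void main(String[] args) {
        System.out.println(demo());
        System.out.println(hasDeserializeHook());
    }
}
```

The captured value travels inside the proxy, so the deserialized copy behaves like the original even though it is a different object.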

In Java, serializable lambda expressions are the exception rather than
the rule. In Scala, however, the serializability of FunctionN means
that we would end up generating a large amount of code to support
deserialization.

Instead, we are pursuing an alternative approach in which the
$deserializeLambda$ method is a simple forwarder to the generic
deserializer added here.

This is capable of deserializing lambdas created by the Java compiler,
although this is not its intended use case. The enclosed tests use
Java lambdas.

This generic deserializer also works by calling LambdaMetafactory,
but it does so explicitly, rather than implicitly during linkage
of the invokedynamic instruction.
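
To make the idea of calling the metafactory explicitly more concrete, here is a minimal sketch of a generic deserializer. It is not the `LambdaDeserializer` added in this PR: the class and method names are hypothetical, it only handles lambdas whose implementation is a static method of the lookup class, and it obtains the `SerializedLambda` proxy by reflectively calling `writeReplace` purely to keep the example short.

```java
import java.io.Serializable;
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Function;

class GenericDeserializerSketch {
    interface SerFunction<A, B> extends Function<A, B>, Serializable {}

    // Illustration only: fetch the serialization proxy without using a stream.
    static SerializedLambda proxyOf(Object lambda) throws ReflectiveOperationException {
        Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
        writeReplace.setAccessible(true);
        return (SerializedLambda) writeReplace.invoke(lambda);
    }

    // Assumption: the implementation method is a static method of lookup.lookupClass().
    static Object deserialize(MethodHandles.Lookup lookup, SerializedLambda s) throws Throwable {
        ClassLoader loader = lookup.lookupClass().getClassLoader();
        MethodType implType =
            MethodType.fromMethodDescriptorString(s.getImplMethodSignature(), loader);
        MethodHandle impl;
        try {
            impl = lookup.findStatic(lookup.lookupClass(), s.getImplMethodName(), implType);
        } catch (ReflectiveOperationException e) {
            // The name or signature of the implementation method no longer matches.
            throw new IllegalArgumentException("Illegal lambda deserialization", e);
        }
        Class<?> functionalInterface =
            Class.forName(s.getFunctionalInterfaceClass().replace('/', '.'), true, loader);
        // Captured values are the leading parameters of the implementation method.
        int captured = s.getCapturedArgCount();
        MethodType factoryType = MethodType.methodType(
            functionalInterface, implType.parameterList().subList(0, captured));
        CallSite site = LambdaMetafactory.altMetafactory(
            lookup,
            s.getFunctionalInterfaceMethodName(),
            factoryType,
            MethodType.fromMethodDescriptorString(s.getFunctionalInterfaceMethodSignature(), loader),
            impl,
            MethodType.fromMethodDescriptorString(s.getInstantiatedMethodType(), loader),
            LambdaMetafactory.FLAG_SERIALIZABLE);
        Object[] args = new Object[captured];
        for (int i = 0; i < captured; i++) args[i] = s.getCapturedArg(i);
        return site.getTarget().invokeWithArguments(args);
    }

    @SuppressWarnings("unchecked")
    static String demo() {
        try {
            String prefix = "hello, ";
            SerFunction<String, String> original = s -> prefix + s;
            Object copy = deserialize(MethodHandles.lookup(), proxyOf(original));
            return ((Function<String, String>) copy).apply("world");
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the sketch trusts the functional interface recorded in the proxy, which mirrors the "previous functional interface" limitation discussed in this description.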

We have to mimic the caching property of the invokedynamic instruction
to ensure that we reuse the generated classes on repeated
deserialization. The cache here uses weak references to keys and
values to avoid retention of Class or ClassLoader instances.
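
The caching idea can be sketched as follows. This is a simplification under stated assumptions, not the cache from this PR: it uses a plain `HashMap` owned by the caller instead of weak references, keys the cache on the implementation method name only, and is not thread-safe. The names `LambdaFactoryCache`, `spinFactory`, and `spins` are hypothetical.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class LambdaFactoryCache {
    // Stand-in for a lambda body that a compiler lifted into a static method.
    private static String greeting() { return "hi"; }

    // Factory handles keyed by implementation method name (an assumption of this sketch).
    private final Map<String, MethodHandle> factories = new HashMap<>();

    // Counts how many times a lambda class was spun up.
    int spins = 0;

    private MethodHandle spinFactory(String implName) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodHandle impl = lookup.findStatic(
            LambdaFactoryCache.class, implName, MethodType.methodType(String.class));
        CallSite site = LambdaMetafactory.metafactory(
            lookup,
            "get",
            MethodType.methodType(Supplier.class),
            MethodType.methodType(Object.class),
            impl,
            MethodType.methodType(String.class));
        spins++;
        return site.getTarget();
    }

    // Reuses the cached factory, so the lambda class is created at most once per key.
    Supplier<?> instantiate(String implName) {
        try {
            MethodHandle factory = factories.get(implName);
            if (factory == null) {
                factory = spinFactory(implName);
                factories.put(implName, factory);
            }
            Object lambda = factory.invoke();
            return (Supplier<?>) lambda;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        LambdaFactoryCache cache = new LambdaFactoryCache();
        Supplier<?> a = cache.instantiate("greeting");
        Supplier<?> b = cache.instantiate("greeting");
        System.out.println(a.get() + " " + (a.getClass() == b.getClass()) + " " + cache.spins);
    }
}
```

Without the map, every call would go back to `LambdaMetafactory` and define a fresh class, which is the behaviour the cache is meant to avoid.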

If the name or signature of the implementation method has changed,
we fail during deserialization with an IllegalArgumentException.

However, we do not fail fast in a few cases that Java would, as we
cannot reflect on the "current" functional interface supported by
this implementation method. We just instantiate using the "previous"
functional interface class/method.

This might:

1. Fail inside LambdaMetafactory if the new implementation
   method is not compatible with the old functional interface.
2. Pass through LambdaMetafactory by chance, but fail
   when instantiating the class in other cases. For example:

% tail sandbox/test{1,2}.scala
==> sandbox/test1.scala <==
class C {
  def test: (String => String) = {
    val s: String = ""
    (t) => s + t
  }
}

==> sandbox/test2.scala <==
class C {
  def test: (String, String) => String = {
    (s, t) => s + t
  }
}
% (for i in 1 2; do scalac -Ydelambdafy:method -Xprint:delambdafy sandbox/test$i.scala 2>&1 ; done) | grep 'def $anon'
    final <static> <artifact> private[this] def $anonfun$1(t: String, s$1: String): String = s$1.+(t);
    final <static> <artifact> private[this] def $anonfun$1(s: String, t: String): String = s.+(t);

3. Silently create an instance of the old functional interface.
   For example, imagine switching from FuncInterface1 to
   FuncInterface2 where these were identical other than the name.

I don't believe that these are showstoppers.
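
To illustrate the first failure mode listed above (an implementation method that no longer fits the old functional interface), the sketch below hands `LambdaMetafactory` a two-parameter method while describing the "previous" shape as a one-argument `Function` with no captured values. The class and method names are hypothetical, and the exact exception message is not relied upon.

```java
import java.lang.invoke.LambdaConversionException;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Function;

class IncompatibleImplDemo {
    // Stands in for a recompiled implementation method that now takes two parameters.
    private static String concat(String s, String t) { return s + t; }

    // Returns true if the metafactory rejects the stale functional interface shape.
    static boolean rejectedByMetafactory() {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        try {
            MethodHandle impl = lookup.findStatic(
                IncompatibleImplDemo.class,
                "concat",
                MethodType.methodType(String.class, String.class, String.class));
            // "Previous" shape: Function.apply with one argument and nothing captured.
            LambdaMetafactory.metafactory(
                lookup,
                "apply",
                MethodType.methodType(Function.class),
                MethodType.methodType(Object.class, Object.class),
                impl,
                MethodType.methodType(String.class, String.class));
            return false;
        } catch (LambdaConversionException expected) {
            // Arity mismatch: two implementation parameters, one interface parameter.
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectedByMetafactory());
    }
}
```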

retronym · 2015-05-13T06:49:00Z

Review by @lrytz

lrytz · 2015-05-13T10:05:12Z

src/main/java/scala/compat/java8/runtime/LambdaDeserializer.scala

+   *    not stored in `SerializedLambda`, so we can't reconstitute them.
+   *  - No additional bridge methods are passed to `altMetafactory`. Again, these are not stored.
+   *
+   * Note: The Java compiler


incomplete ℹ️

lrytz · 2015-05-13T14:11:27Z

The code and the approach LGTM, really nice.

retronym · 2015-05-14T06:01:48Z

Ready for re-review, @lrytz

To support serialization, we use the alternative lambda metafactory, which lets us specify that our anonymous functions should extend the marker interface `scala.Serializable`. They will also have a `writeReplace` method added that implements the serialization proxy pattern using `j.l.invoke.SerializedLambda`.

To support deserialization, we synthesize a `$deserializeLambda$` method in each class. This will be called reflectively by `SerializedLambda#readResolve`. This method in turn delegates to `LambdaDeserializer`, currently defined [1] in `scala-java8-compat`, which uses `LambdaMetafactory` to spin up the anonymous class and instantiate it with the deserialized environment.

Note: `LambdaDeserializer` reuses the anonymous class on subsequent deserializations of a given lambda, in the same way that an invokedynamic call site spins up the class only the first time it is run.

`LambdaDeserializer` will be moved into our standard library in the 2.12.x branch, where we can introduce dependencies on the Java 8 standard library.

The enclosed test cases must be manually run with indylambda enabled. Once we enable indylambda by default on 2.12.x, the test will actually test the new feature.

```
% echo $INDYLAMBDA
-Ydelambdafy:method -Ybackend:GenBCode -target:jvm-1.8 -classpath .:scala-java8-compat_2.11-0.5.0-SNAPSHOT.jar
% qscala $INDYLAMBDA -e "println((() => 42).getClass)"
class Main$$anon$1$$Lambda$1/1183231938
% qscala $INDYLAMBDA -e "assert(classOf[scala.Serializable].isInstance(() => 42))"
% qscalac $INDYLAMBDA test/files/run/lambda-serialization.scala && qscala $INDYLAMBDA Test
```

[1] scala/scala-java8-compat#37

lrytz · 2015-05-15T07:56:05Z

src/main/java/scala/compat/java8/runtime/LambdaDeserializer.scala

+ * This class is only intended to be called by synthetic `$deserializeLambda$` method that the Scala 2.12
+ * compiler will add to classes hosting lambdas.
+ *
+ * It is intended to be consumed directly.


Not sure what message this phrase is conveying :) - or does it miss a "not"?

Correct 😄

retronym · 2015-05-15T09:41:34Z

Yep, that's right. Note that JFunctionN doesn't implement Serializable, either.

retronym · 2015-05-15T09:42:41Z

s/faithfully unknown/faithfully deserialize unknown/ above.

retronym · 2015-05-15T22:20:18Z

See also:

http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs.devel/4895
https://bugs.openjdk.java.net/browse/JDK-6493635

retronym · 2015-05-15T22:25:33Z

One thing we could consider is synthesizing a static cache per-class. We could then reason about the object lifetimes by analogy to the reflection caches for structural type invocations.

To support serialization, we use the alternative lambda metafactory, which lets us specify that our anonymous functions should extend the marker interface `scala.Serializable`. They will also have a `writeReplace` method added that implements the serialization proxy pattern using `j.l.invoke.SerializedLambda`.

To support deserialization, we synthesize a `$deserializeLambda$` method in each class with lambdas. This will be called reflectively by `SerializedLambda#readResolve`. This method in turn delegates to `LambdaDeserializer`, currently defined [1] in `scala-java8-compat`, which uses `LambdaMetafactory` to spin up the anonymous class and instantiate it with the deserialized environment.

Note: `LambdaDeserializer` can reuse the anonymous class on subsequent deserializations of a given lambda, in the same way that an invokedynamic call site spins up the class only the first time it is run. But first we'll need to host a cache in a static field of each lambda-hosting class. This is noted as a TODO and a failing test, and will be updated in the next commit.

`LambdaDeserializer` will be moved into our standard library in the 2.12.x branch, where we can introduce dependencies on the Java 8 standard library.

The enclosed test cases must be manually run with indylambda enabled. Once we enable indylambda by default on 2.12.x, the test will actually test the new feature.

```
% echo $INDYLAMBDA
-Ydelambdafy:method -Ybackend:GenBCode -target:jvm-1.8 -classpath .:scala-java8-compat_2.11-0.5.0-SNAPSHOT.jar
% qscala $INDYLAMBDA -e "println((() => 42).getClass)"
class Main$$anon$1$$Lambda$1/1183231938
% qscala $INDYLAMBDA -e "assert(classOf[scala.Serializable].isInstance(() => 42))"
% qscalac $INDYLAMBDA test/files/run/lambda-serialization.scala && qscala $INDYLAMBDA Test
```

This commit contains a few minor refactorings to the code that generates the invokedynamic instruction to use more meaningful names and to reuse Java signature generation code in ASM rather than the DIY approach.

[1] scala/scala-java8-compat#37

Java supports serialization of lambdas by using the serialization proxy pattern. Deserialization of a lambda uses `LambdaMetafactory` to create a new anonymous subclass.

More details of the scheme are documented at:

https://docs.oracle.com/javase/8/docs/api/java/lang/invoke/SerializedLambda.html

From those docs:

> SerializedLambda has a readResolve method that looks for a
> (possibly private) static method called $deserializeLambda$
> in the capturing class, invokes that with itself as the first
> argument, and returns the result. Lambda classes implementing
> $deserializeLambda$ are responsible for validating that the
> properties of the SerializedLambda are consistent with a lambda
> actually captured by that class.

The Java compiler generates code in `$deserializeLambda$` that switches on the implementation method name and signature to locate an invokedynamic instruction generated for the particular lambda expression. Then, the `SerializedLambda` is further unpacked, validating that this implementation method still represents the same functional interface as it did when it was serialized. (The source may have been recompiled in the interim.)

In Java, serializable lambda expressions are the exception rather than the rule. In Scala, however, the serializability of `FunctionN` means that we would end up generating a large amount of code to support deserialization.

Instead, we are pursuing an alternative approach in which the `$deserializeLambda$` method is a simple forwarder to the generic deserializer added here.

This is capable of deserializing lambdas created by the Java compiler, although this is not its intended use case. The enclosed tests use Java lambdas.

This generic deserializer also works by calling `LambdaMetafactory`, but it does so explicitly, rather than implicitly during linkage of the `invokedynamic` instruction.

We have to mimic the caching property of the `invokedynamic` instruction to ensure that we reuse the generated classes on repeated deserialization.

I originally tried using a central cache, but wasn't able to come up with a scheme to avoid potential classloader memory leaks. Instead, I now allow the caller to provide a cache. The Scala compiler will host an instance of this cache in each class that hosts a lambda. This is analogous to the `MethodCache` used by reflective calls.

If the name or signature of the implementation method has changed, we fail during deserialization with an `IllegalArgumentException`.

However, we do not fail fast in a few cases that Java would, as we cannot reflect on the "current" functional interface supported by this implementation method. We just instantiate using the "previous" functional interface class/method.

This might:

1. Fail inside `LambdaMetafactory` if the new implementation method is not compatible with the old functional interface.
2. Pass through `LambdaMetafactory` by chance, but fail when instantiating the class in other cases. For example:

```
% tail sandbox/test{1,2}.scala
==> sandbox/test1.scala <==
class C {
  def test: (String => String) = {
    val s: String = ""
    (t) => s + t
  }
}

==> sandbox/test2.scala <==
class C {
  def test: (String, String) => String = {
    (s, t) => s + t
  }
}
% (for i in 1 2; do scalac -Ydelambdafy:method -Xprint:delambdafy sandbox/test$i.scala 2>&1 ; done) | grep 'def $anon'
    final <static> <artifact> private[this] def $anonfun$1(t: String, s$1: String): String = s$1.+(t);
    final <static> <artifact> private[this] def $anonfun$1(s: String, t: String): String = s.+(t);
```

3. Silently create an instance of the old functional interface. For example, imagine switching from `FuncInterface1` to `FuncInterface2` where these were identical other than the name.

I don't believe that these are showstoppers.

Failing test case demonstrating overly weak cache

LambdaMetafactory returns a ConstantCallSite bound to a shared instance of a lambda, rather than a reference to the no-arg constructor. This is a technique to avoid unnecessary allocations. This test checks that we preserve this property when deserializing.
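
The shared-instance property mentioned above can be observed directly with a small sketch. It only exercises `LambdaMetafactory` for a non-capturing lambda and does not go through deserialization; the class and method names are hypothetical, and the identity check reflects the OpenJDK behaviour described in this comment rather than a documented guarantee.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.Supplier;

class SharedInstanceDemo {
    // Stand-in for the body of a lambda that captures nothing.
    private static Integer answer() { return 42; }

    // Invokes the call site target twice and compares the results by identity.
    static boolean nonCapturingLambdaIsShared() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle impl = lookup.findStatic(
                SharedInstanceDemo.class, "answer", MethodType.methodType(Integer.class));
            CallSite site = LambdaMetafactory.metafactory(
                lookup,
                "get",
                MethodType.methodType(Supplier.class), // no captured arguments
                MethodType.methodType(Object.class),
                impl,
                MethodType.methodType(Integer.class));
            Object first = site.getTarget().invoke();
            Object second = site.getTarget().invoke();
            return first == second && ((Supplier<?>) first).get().equals(42);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(nonCapturingLambdaIsShared());
    }
}
```

A deserializer that caches the call site target, rather than a constructor reference, keeps this allocation-free behaviour for non-capturing lambdas.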

lrytz · 2015-05-21T19:37:34Z

LGTM!

Add a generic deserializer for Java/Scala 2.12 lambdas

lrytz reviewed May 13, 2015

retronym force-pushed the topic/lambda-deserialize branch from 8c5d4ee to 6472976 Compare May 14, 2015 06:00

retronym mentioned this pull request May 15, 2015

[indylambda] Support lambda {de}serialization scala/scala#4501

Merged

lrytz reviewed May 15, 2015

retronym force-pushed the topic/lambda-deserialize branch from 6472976 to 921b212 Compare May 20, 2015 23:06

retronym added a commit that referenced this pull request May 22, 2015

Merge pull request #37 from retronym/topic/lambda-deserialize

aa0908b

Add a generic deserializer for Java/Scala 2.12 lambdas

retronym merged commit aa0908b into scala:master May 22, 2015

retronym added this to the 0.5.0 milestone May 22, 2015

retronym modified the milestones: 0.5.0, 0.6.0 Aug 13, 2015
