Skip to content

Associative, commutative, Java-friendly intersection erasure #11808

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 22, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion compiler/src/dotty/tools/dotc/core/TypeComparer.scala
Original file line number Diff line number Diff line change
Expand Up @@ -2128,7 +2128,7 @@ class TypeComparer(@constructorOnly initctx: Context) extends ConstraintHandling
else {
val t2 = distributeAnd(tp2, tp1)
if (t2.exists) t2
else if (isErased) erasedGlb(tp1, tp2, isJava = false)
else if (isErased) erasedGlb(tp1, tp2)
else liftIfHK(tp1, tp2, op, original, _ | _)
// The ` | ` on variances is needed since variances are associated with bounds
// not lambdas. Example:
Expand Down
113 changes: 87 additions & 26 deletions compiler/src/dotty/tools/dotc/core/TypeErasure.scala
Original file line number Diff line number Diff line change
Expand Up @@ -394,32 +394,93 @@ object TypeErasure {
}

/** The erased greatest lower bound of two erased type picks one of the two argument types.
* It prefers, in this order:
* - arrays over non-arrays
* - subtypes over supertypes, unless isJava is set
* - real classes over traits
*
* This operation has the following the properties:
* - Associativity and commutativity, because this method acts as the minimum
* of the total order induced by `compareErasedGlb`.
* - Java compatibility: intersections that would be valid in Java code are
* erased like javac would erase them (a Java intersection is composed of
* exactly one class and one or more interfaces and always erases to the
* class).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our last discussion, hadn't we come to the conclusion that we should never return an Interface as the erasure of an intersection type (unless an interface dominates everything else, of course)? IIRC the issue was that Object & T1 & T2 would otherwise have a different erasure than T1 & T2, even though Object & T1 & T2 can simplify to T1 & T2 and though Java would erase to Object.

Copy link
Member Author

@smarter smarter Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our last discussion, hadn't we come to the conclusion that we should never return an Interface as the erasure of an intersection type

Yes, I went with this version originally but decided it wasn't worth it in the end:

  • Many more types end up being erased to Object which can lead to overloading clashes (it happened in the community build for example), I fear this could be a big problem in the wild making cross-compilation difficult.
  • Admittedly, the fact that Object & T1 & T2 has a different erasure than T1 & T2 isn't great, but then so is the fact
    Object & T1 has a different erasure than T1.
  • It turns out that intersection of traits in Java signatures actually work.
  • Having erasedGlb(A, B) always return either A or B allows us to implement it as the minimum of a total order, which gives us commutativity and associativity for free and in general makes it easier to reason about what this method does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. IIUC, this is the critical aspect that allows this approach to work at all:

It turns out that intersection of traits in Java signatures actually work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's helpful, but if those Java signatures weren't allowed, we could also emit a signature containing just the first trait as an approximation, we already have to use approximated signatures in various situations where our type system is more expressive than Java's.

*/
def erasedGlb(tp1: Type, tp2: Type, isJava: Boolean)(using Context): Type = tp1 match {
case JavaArrayType(elem1) =>
tp2 match {
case JavaArrayType(elem2) => JavaArrayType(erasedGlb(elem1, elem2, isJava))
case _ => tp1
}
case _ =>
tp2 match {
case JavaArrayType(_) => tp2
case _ =>
val tsym1 = tp1.typeSymbol
val tsym2 = tp2.typeSymbol
if (!tsym2.exists) tp1
else if (!tsym1.exists) tp2
else if (!isJava && tsym1.derivesFrom(tsym2)) tp1
else if (!isJava && tsym2.derivesFrom(tsym1)) tp2
else if (tp1.typeSymbol.isRealClass) tp1
else if (tp2.typeSymbol.isRealClass) tp2
else tp1
}
}
def erasedGlb(tp1: Type, tp2: Type)(using Context): Type =
if compareErasedGlb(tp1, tp2) <= 0 then tp1 else tp2

/** Overload of `erasedGlb` to compare more than two types at once. */
def erasedGlb(tps: List[Type])(using Context): Type =
tps.min(using (a,b) => compareErasedGlb(a, b))

/** A comparison function that induces a total order on erased types,
* where `A <= B` implies that the erasure of `A & B` should be A.
*
* This order respects the following properties:
* - ErasedValueTypes <= non-ErasedValueTypes
* - arrays <= non-arrays
* - primitives <= non-primitives
* - real classes <= traits
* - subtypes <= supertypes
*
* Since this isn't enough to order to unrelated classes, we use
* lexicographic ordering of the class symbol full name as a tie-breaker.
* This ensure that `A <= B && B <= A` iff `A =:= B`.
*
* @see erasedGlb
*/
private def compareErasedGlb(tp1: Type, tp2: Type)(using Context): Int =
// this check is purely an optimization.
if tp1 eq tp2 then
return 0

val isEVT1 = tp1.isInstanceOf[ErasedValueType]
val isEVT2 = tp2.isInstanceOf[ErasedValueType]
if isEVT1 && isEVT2 then
return compareErasedGlb(tp1.asInstanceOf[ErasedValueType].tycon, tp2.asInstanceOf[ErasedValueType].tycon)
else if isEVT1 then
return -1
else if isEVT2 then
return 1

val isArray1 = tp1.isInstanceOf[JavaArrayType]
val isArray2 = tp2.isInstanceOf[JavaArrayType]
if isArray1 && isArray2 then
return compareErasedGlb(tp1.asInstanceOf[JavaArrayType].elemType, tp2.asInstanceOf[JavaArrayType].elemType)
else if isArray1 then
return -1
else if isArray2 then
return 1

val sym1 = tp1.classSymbol
val sym2 = tp2.classSymbol
def compareClasses: Int =
if sym1.isSubClass(sym2) then
-1
else if sym2.isSubClass(sym1) then
1
// Intentionally compare Strings and not Names, since the ordering on
// Names depends on implementation details like `NameKind#tag`.
else
sym1.fullName.toString.compareTo(sym2.fullName.toString)

val isPrimitive1 = sym1.isPrimitiveValueClass
val isPrimitive2 = sym2.isPrimitiveValueClass
if isPrimitive1 && isPrimitive2 then
return compareClasses
else if isPrimitive1 then
return -1
else if isPrimitive2 then
return 1

val isRealClass1 = sym1.isRealClass
val isRealClass2 = sym2.isRealClass
if isRealClass1 && isRealClass2 then
return compareClasses
else if isRealClass1 then
return -1
else if isRealClass2 then
return 1

compareClasses
end compareErasedGlb

/** Does the (possibly generic) type `tp` have the same erasure in all its
* possible instantiations?
Expand Down Expand Up @@ -522,7 +583,7 @@ class TypeErasure(sourceLanguage: SourceLanguage, semiEraseVCs: Boolean, isConst
if sourceLanguage.isScala2 then
this(Scala2Erasure.intersectionDominator(Scala2Erasure.flattenedParents(tp)))
else
erasedGlb(this(tp1), this(tp2), isJava = sourceLanguage.isJava)
erasedGlb(this(tp1), this(tp2))
case OrType(tp1, tp2) =>
if isSymbol && sourceLanguage.isScala2 && ctx.settings.scalajs.value then
// In Scala2Unpickler we unpickle Scala.js pseudo-unions as if they were
Expand Down
124 changes: 83 additions & 41 deletions compiler/src/dotty/tools/dotc/transform/GenericSignatures.scala
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ import core.Flags._
import core.Names.{DerivedName, Name, SimpleName, TypeName}
import core.Symbols._
import core.TypeApplications.TypeParamInfo
import core.TypeErasure.{erasure, isUnboundedGeneric}
import core.TypeErasure.{erasedGlb, erasure, isUnboundedGeneric}
import core.Types._
import core.classfile.ClassfileConstants
import ast.Trees._
Expand All @@ -19,6 +19,7 @@ import TypeUtils._
import java.lang.StringBuilder

import scala.annotation.tailrec
import scala.collection.mutable.ListBuffer

/** Helper object to generate generic java signatures, as defined in
* the Java Virtual Machine Specification, §4.3.4
Expand Down Expand Up @@ -65,18 +66,79 @@ object GenericSignatures {

def boxedSig(tp: Type): Unit = jsig(tp.widenDealias, primitiveOK = false)

/** The signature of the upper-bound of a type parameter.
*
* @pre none of the bounds are themselves type parameters.
* TODO: Remove this restriction so we can support things like:
*
* class Foo[A]:
* def foo[S <: A & Object](...): ...
*
* Which should emit a signature `S <: A`. See the handling
* of `AndType` in `jsig` which already supports `def foo(x: A & Object)`.
*/
def boundsSig(bounds: List[Type]): Unit = {
val (isTrait, isClass) = bounds partition (_.typeSymbol.is(Trait))
isClass match {
case Nil => builder.append(':') // + boxedSig(ObjectTpe)
case x :: _ => builder.append(':'); boxedSig(x)
}
isTrait.foreach { tp =>
val (repr :: _, others) = splitIntersection(bounds)
builder.append(':')

// According to the Java spec
// (https://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.4),
// intersections erase to their first member and must start with a class.
// So, if our intersection erases to a trait, in theory we should emit
// just that trait in the generic signature even if the intersection type
// is composed of multiple traits. But in practice Scala 2 has always
// ignored this restriction as intersections of traits seem to be handled
// correctly by javac, we do the same here since type soundness seems
// more important than adhering to the spec.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wouldn't be a problem at all if our intersection erasure was only a trait when that trait dominates the intersection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true, but I don't think that's a good enough reason to use that scheme as outlined in my other comment.

if repr.classSymbol.is(Trait) then
builder.append(':')
boxedSig(repr)
// If we wanted to be compliant with the spec, we would `return` here.
else
boxedSig(repr)
others.filter(_.classSymbol.is(Trait)).foreach { tp =>
builder.append(':')
boxedSig(tp)
}
}

/** The parents of this intersection where type parameters
* that cannot appear in the signature have been replaced
* by their upper-bound.
*/
def flattenedIntersection(tp: AndType)(using Context): List[Type] =
val parents = ListBuffer[Type]()

def collect(parent: Type, parents: ListBuffer[Type]): Unit = parent.widenDealias match
case AndType(tp1, tp2) =>
collect(tp1, parents)
collect(tp2, parents)
case parent: TypeRef =>
if parent.symbol.isClass || isTypeParameterInSig(parent.symbol, sym0) then
parents += parent
else
collect(parent.superType, parents)
case parent =>
parents += parent

collect(tp, parents)
parents.toList
end flattenedIntersection

/** Split the `parents` of an intersection into two subsets:
* those whose individual erasure matches the overall erasure
* of the intersection and the others.
*/
def splitIntersection(parents: List[Type])(using Context): (List[Type], List[Type]) =
val erasedParents = parents.map(erasure)
val erasedCls = erasedGlb(erasedParents).classSymbol
parents.zip(erasedParents)
.partitionMap((parent, erasedParent) =>
if erasedParent.classSymbol eq erasedCls then
Left(parent)
else
Right(parent))

def paramSig(param: LambdaParam): Unit = {
builder.append(sanitizeName(param.paramName))
boundsSig(hiBounds(param.paramInfo.bounds))
Expand Down Expand Up @@ -263,8 +325,20 @@ object GenericSignatures {
builder.append(')')
methodResultSig(restpe)

case AndType(tp1, tp2) =>
jsig(intersectionDominator(tp1 :: tp2 :: Nil), primitiveOK = primitiveOK)
case tp: AndType =>
// Only intersections appearing as the upper-bound of a type parameter
// can be preserved in generic signatures and those are already
// handled by `boundsSig`, so here we fallback to picking a parent of
// the intersection to determine its overall signature. We must pick a
// parent whose erasure matches the erasure of the intersection
// because javac relies on the generic signature to determine the
// bytecode signature. Additionally, we prefer picking a type
// parameter since that will likely lead to a more precise type.
val parents = flattenedIntersection(tp)
val (reprParents, _) = splitIntersection(parents)
val repr =
reprParents.find(_.typeSymbol.is(TypeParam)).getOrElse(reprParents.head)
jsig(repr, primitiveOK = primitiveOK)

case ci: ClassInfo =>
def polyParamSig(tparams: List[TypeParamInfo]): Unit =
Expand Down Expand Up @@ -308,38 +382,6 @@ object GenericSignatures {

private class UnknownSig extends Exception

/** The intersection dominator (SLS 3.7) of a list of types is computed as follows.
*
* - If the list contains one or more occurrences of scala.Array with
* type parameters El1, El2, ... then the dominator is scala.Array with
* type parameter of intersectionDominator(List(El1, El2, ...)). <--- @PP: not yet in spec.
* - Otherwise, the list is reduced to a subsequence containing only the
* types that are not supertypes of other listed types (the span.)
* - If the span is empty, the dominator is Object.
* - If the span contains a class Tc which is not a trait and which is
* not Object, the dominator is Tc. <--- @PP: "which is not Object" not in spec.
* - Otherwise, the dominator is the first element of the span.
*/
private def intersectionDominator(parents: List[Type])(using Context): Type =
if (parents.isEmpty) defn.ObjectType
else {
val psyms = parents map (_.typeSymbol)
if (psyms contains defn.ArrayClass)
// treat arrays specially
defn.ArrayType.appliedTo(intersectionDominator(parents.filter(_.typeSymbol == defn.ArrayClass).map(t => t.argInfos.head)))
else {
// implement new spec for erasure of refined types.
def isUnshadowed(psym: Symbol) =
!(psyms exists (qsym => (psym ne qsym) && (qsym isSubClass psym)))
val cs = parents.iterator.filter { p => // isUnshadowed is a bit expensive, so try classes first
val psym = p.classSymbol
psym.ensureCompleted()
psym.isClass && !psym.is(Trait) && isUnshadowed(psym)
}
(if (cs.hasNext) cs else parents.iterator.filter(p => isUnshadowed(p.classSymbol))).next()
}
}

/* Drop redundant types (ones which are implemented by some other parent) from the immediate parents.
* This is important on Android because there is otherwise an interface explosion.
*/
Expand Down
Loading