Unification across intersection of conflicting base types is asymmetrical #4214


Closed

propensive opened this issue Mar 29, 2018 · 15 comments

@propensive
Contributor

propensive commented Mar 29, 2018

I noticed this in real code in Scala 2.x, and it seems to exist in Dotty (0.7.0-RC1), too, though I haven't explored whether it can lead to a genuine unsoundness issue... but it at least breaks an invariant I wouldn't expect:

trait F[T]
def foo[T](f: F[T])(g: T) = g

We can then successfully call,

foo(null.asInstanceOf[F[String] & F[Int]])("")

but not,

foo(null.asInstanceOf[F[String] & F[Int]])(42)

It appears that when trying to instantiate the type parameter T from the intersection type, the unification algorithm simply takes the first match it finds without checking for subsequent matches in the intersection, rather than inferring the union type, String | Int, which I think would be the only consistent answer. The LUB of String and Int might be better, but it's still wrong if F is invariant in T.

Note that it is impossible to construct an instance of F[String] & F[Int], but I don't think that's sufficient protection...
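
If the "first match" reading above is right, swapping the operands of the intersection should flip which call is accepted. A quick sketch of what that predicts (I have only verified the original order):

foo(null.asInstanceOf[F[Int] & F[String]])(42)  // presumably compiles: F[Int] is now found first
foo(null.asInstanceOf[F[Int] & F[String]])("")  // presumably rejected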

@Blaisorblade Blaisorblade self-assigned this Mar 29, 2018
@Blaisorblade
Contributor

First, rejecting the second call is indeed an instance of type inference being incomplete; with an explicit type argument it is accepted in dotr:

scala> foo[Int](null.asInstanceOf[F[String] & F[Int]])(42)
val res1: Int = 42

I convinced myself both calls are acceptable, because F[String] & F[Int] <: F[String] and F[String] & F[Int] <: F[Int]. Let me show where the upcasts would likely be inserted:

scala> foo[String](null.asInstanceOf[F[String] & F[Int]]: F[String])("")
val res6: String = ""
scala> foo[Int](null.asInstanceOf[F[String] & F[Int]]: F[Int])(42)
val res7: Int = 42

@Blaisorblade
Contributor

On type soundness

On soundness worries: we know null opens some holes (both here and in Java), and you need null tracking to close them, but that isn't trivial. OTOH, I'm not sure why this example would trigger worries about soundness. F[String] & F[Int] is equal to itself and to F[Int] & F[String], but I think to no other normal-form type, so there's no solution for T giving F[T] = F[String] & F[Int]; luckily, function application only requires F[String] & F[Int] <: F[?T], that is, you need to be able to upcast the function argument to F[?T], where ?T is a unification variable to solve for.

On type inference completeness

The question which stands is "why is type inference incomplete here?" In short, IIUC it's unclear how to ensure completeness of unification (really, subtype constraint solving) without requiring backtracking and exponential worst-case performance, especially in the presence of intersection and union types. For covariant and contravariant types, you might be able to avoid backtracking by simplifying the intersection. That trick doesn't work here, but I'm not sure how much effort is warranted to support inference on uninhabited types.
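
To illustrate the covariant case, here is a sketch of my own (not compiler-verified): with a covariant constructor the intersection can be simplified to a single type argument, so the constraint stays solvable for both calls and no backtracking is needed.

trait G[+T]
def bar[T](g: G[T])(t: T): T = t

// G[String] & G[Int] simplifies to G[String & Int],
// leaving only the open lower-bound constraint String & Int <: T
bar(null.asInstanceOf[G[String] & G[Int]])("")  // T can be instantiated to String
bar(null.asInstanceOf[G[String] & G[Int]])(42)  // ... or to Int; both should be accepted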

I think this might be one of the extensively documented cases of incompleteness: Martin pointed me at some comments, and I collected pointers in #4029 (which is about another concerning consequence). To be sure, I'd be tempted to do some (limited) amount of backtracking, but I'm not sure it's a good idea, and I'm not sure this is a compelling example of the need for backtracking.

Martin also explained to me that he doesn't expect completeness, and that "add annotations when inference fails" is for him the expected usage mode; he claimed this is unavoidable with local type inference. How long should you wait before solving a constraint? Trying to solve constraints globally has its own share of issues with subtyping. But Dotty's solver can wait longer than Scalac's.
(I have seen papers describing some version of "complete" local type inference, for some definition of complete, but I don't think they'd support Scala directly or be easy to extend; https://arxiv.org/abs/1306.6032 and https://arxiv.org/abs/1601.05106 are the best I know of.)

@Blaisorblade
Contributor

@propensive What do you think? Did I miss something (I might have), or is there some issue I didn't see? If you care about completeness, can you find an example with code you'd want to write, rather than code you'd want to forbid?

And is there a reason to keep this issue open, or should it be closed?

@Blaisorblade Blaisorblade removed their assignment Mar 29, 2018
@propensive
Contributor Author

Thanks for the quick feedback, @Blaisorblade - it's provided a useful clarification for me: it appears to be more a question of the imperfect "leastness" of the upper bound than of correctness, whereas I had assumed Dotty would choose the unique least upper bound. Am I right in saying that Dotty will never infer a union type?

In my real-world usage in Scala, I have a Try-like monad which tracks the types of captured exceptions using a type formed from the intersection of some number of types wrapped in an invariant type constructor. I can form these intersections automatically, and use a subtype check to make sure that my handler handles all the cases, so that I have totality. The trick works surprisingly well, and the only errors I've seen in Scala have been in reporting that one specific type was expected when any of several types would be acceptable, which is a manifestation of exactly this issue. In Scalac, I'm using the types to track this, but under the hood it's typed as Any and implemented with a runtime type test and an asInstanceOf.
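
Roughly, the totality check looks like this (a simplified sketch with made-up names, not my actual library code):

trait Captured[E] // invariant marker, one conjunct per tracked exception type

// evidence that a handler covering Handled suffices for a computation tracking Tracked
def covers[Handled, Tracked](implicit ev: Handled <:< Tracked): Unit = ()

type IOE = java.io.IOException
type NFE = NumberFormatException

covers[Captured[IOE] & Captured[NFE], Captured[IOE] & Captured[NFE]] // total: compiles
// covers[Captured[IOE], Captured[IOE] & Captured[NFE]] // missing the NFE case: rejected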

In Dotty, that would probably just be replaced with a straight union type, so that shouldn't be a problem. Though as far as I can tell, Dotty's exhaustivity checking is completely absent at the moment, even for sealed traits... do you know if this is being tracked somewhere?

@propensive
Contributor Author

And I'll close the issue on the basis of your answer, and because I appreciate the potentially-exponential cost of calculating a precise LUB, and can be reasonably confident that won't change. ;)

@smarter
Member

smarter commented Mar 29, 2018

Am I right in saying that Dotty will never infer a union type?

When instantiating a type variable, the rule currently is https://github.com/lampepfl/dotty/blob/b6dbe42fbccf111becfdaed1806075d798621a7c/compiler/src/dotty/tools/dotc/core/ConstraintHandling.scala#L288-L292

When inferring the type of a definition, we simplify unions by https://github.com/lampepfl/dotty/blob/b6dbe42fbccf111becfdaed1806075d798621a7c/compiler/src/dotty/tools/dotc/core/TypeOps.scala#L116-L127
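
For illustration, here's the observable effect (a sketch; exact results depend on the compiler version):

val cond: Boolean = true
val x = if (cond) "42" else 42                 // union simplified away at the definition: inferred as Any
val y: String | Int = if (cond) "42" else 42   // an explicit annotation keeps the union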

I've been toying with the idea of a precise keyword so you could do precise val x = ... and def foo[precise T](...) and be guaranteed that no simplification took place, but I don't know if it's a useful enough feature to include in the language.

@smarter
Member

smarter commented Mar 29, 2018

Though as far as I can tell, Dotty's exhaustivity checking is completely absent at the moment, even for sealed traits...

I don't understand what you mean here: Dotty has extremely powerful exhaustivity checking, better than scalac's in many cases. Bug reports welcome.

@propensive
Contributor Author

Thanks for the explanation, and in particular the enlightenment that these comments exist in the source!

And I think precise would be a fantastic feature!

Is the exhaustivity checker off by default in 0.7.0-RC1 or in the REPL? I'm just not seeing any warnings...

@smarter
Member

smarter commented Mar 29, 2018

Is the exhaustivity checker off by default in 0.7.0-RC1 or in the REPL?

Looks like a REPL bug.

@smarter
Member

smarter commented Mar 29, 2018

Opened #4217.

@smarter smarter reopened this Mar 29, 2018
@smarter smarter closed this as completed Mar 29, 2018
@Blaisorblade
Contributor

Blaisorblade commented Mar 29, 2018

Thanks for the quick feedback, @Blaisorblade - it's provided a useful clarification for me: it appears to be more a question of the imperfect "leastness" of the upper bound than of correctness, whereas I had assumed Dotty would choose the unique least upper bound.

(EDITED)
@propensive A bit confused here: are you suggesting (above and in the OP) that F[String] & F[Int] <: F[String | Int] when you talk about upper bounds? That's my best understanding of your point, but it sounds wrong: while String <: String | Int, there's no relation between F[String] and F[String | Int], because F is invariant.

ADDED: I suspect the unique least upper bound of F[String] and F[Int] (excluding their union) is AnyRef, because it's the parent of F[T], so it's an upper bound, and because F[String] <: F[?X] and F[Int] <: F[?X] lead to the unsatisfiable String = ?X = Int by invariance.

ADDITION 2: For similar reasons, the GLB of F[String] and F[Int] (excluding their intersection) should be Nothing. Not sure that's relevant, but right now I only see one possible simplification that would treat intersections the way @smarter's code treats unions: replacing an intersection A & B by a GLB of A and B. I'm not sure Dotty does that simplification either (but I wouldn't know, yet); IIUC, unions and intersections are handled in non-dual ways in inference, for various reasons.
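
A quick check of the non-relation claimed above (a sketch, reusing the F from the OP):

val fsi = null.asInstanceOf[F[String] & F[Int]]
val ok: F[String] = fsi         // fine: F[String] & F[Int] <: F[String]
// val bad: F[String | Int] = fsi  // rejected: F is invariant, so no such relation holds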

@propensive
Contributor Author

That was my intuition speaking overconfidently... but it might be worth going over my "logic" step by step, so I can get a better understanding. Everything I know about types comes from experimentation, so I appreciate any feedback!

  1. F[T] represents all values which satisfy a set of properties which are written in terms of T, for any T.
  2. F[A] & F[B] represents all values with the same set of properties, each of which should be satisfied both when T = A and when T = B.
  3. This appears to say "when T = A | B"; however, in that logical leap I ignored that each property is written in terms of a single type T, and functional dependencies may exist between properties of that same type (in both covariant and contravariant positions), where substituting A everywhere or B everywhere will work, but A | B in general will not (see the sketch after this list).
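
Here's a sketch of that last point, with hypothetical members:

trait Box[T] {
  def produce: T          // T in covariant position
  def consume(t: T): Unit // T in contravariant position, tied to produce
}

def roundTrip[T](b: Box[T]): Unit = b.consume(b.produce)

// Substituting T = String everywhere, or T = Int everywhere, keeps produce and
// consume consistent with each other; T = String | Int does not, because consume
// would then have to accept values the underlying box may never produce.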

That was a useful exercise - thanks for instigating it!

The asymmetry in LUBs/GLBs in inference is particularly interesting to me, as I'm often using LUBs and GLBs to construct new types, and I rely heavily on them working differently in Scala. Having union types ("the union of the set of all the types") as well as supertype LUBs ("the intersection of the set of all the supertypes") confuses me, and admits a number of in-between types formed from some hybrid of the two. In the dual world, we only use intersection types for the GLB (i.e. "the intersection of the set of all the types"), rather than subtype GLBs ("the union of the set of all the subtypes"). But we construct types from the top down, with the interesting types towards the bottom, so we don't usually get to work with many types which have useful common subtypes. Anyway... I'll carry on thinking about this to myself. :)

@smarter
Member

smarter commented Mar 29, 2018

replacing an intersection A & B by a GLB of A and B. I'm not sure Dotty does that simplification either (but I wouldn't know, yet)

No, it doesn't. We approximate unions because they're generally too precise and not what you want to work with (just like we widen singleton types); no such problem arises with intersections.

@smarter
Member

smarter commented Mar 29, 2018

And I think precise would be a fantastic feature!

I'd be interested in seeing use cases for such a thing; let me know if you come up with some.

@propensive
Contributor Author

I think we discussed on a different ticket that final val causes a singleton type to be inferred, even though this has nothing to do with finality. So I would suggest restricting final to mean "impossible to override" and requiring precise if you want a singleton type to be inferred.

I think that particular issue of @liufengyun's (which got closed, IIRC) was a potential use case itself; there I proposed final as the keyword to indicate "do not widen", even though that's not what "final" means.

Secondly, I would like to have,

precise def foo(x: Boolean) = if(x) "42" else 42

infer String | Int. That one is maybe the most obvious case, where a choice needs to be made between the supertype LUB and the union LUB.

It might also be nice (or at least consistent) to have enum constructors, which I believe are now (unlike in Scala) typed as the enum's supertype, infer the precise enum type when labelled precise, so I'd like to see:

val x = Some("") // infer Option[String]
precise val x = Some("") // infer Some[String]

As I understand it, in Scalac widening happens only during type inference, and only at the point where a signature needs to be calculated (i.e. not during the instantiation of a type parameter in the middle of typechecking an expression), so the desired behavior could be as simple as skipping the widening operation when inferring a signature marked precise. The thing I like about it is that it would be absolutely clear what it does.
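
For example (a sketch of the Scala 2 behavior as I understand it):

final val a = 42 // inferred as the constant type 42, purely because of `final`
val b = 42       // the constant type is widened to Int when the signature is computed
// under this proposal, `precise val b = 42` would keep the constant type instead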

These aren't exactly real-world use cases, though... I have some in my Scala code, but next time I try to convert it to Dotty, I'll try to think about it in terms of this feature being a possibility.
