
Matches on union types do not infer most precise return type with wildcard matches #5420


Closed
propensive opened this issue Nov 10, 2018 · 6 comments

Comments

propensive (Contributor) commented Nov 10, 2018

Considering a match on a union type, such as,

def unify(x: Int | String) = x match {
  case i: Int => "one"
  case other => other
}

the compiler will infer the return type as Any, being the LUB of String (from the first case clause) and Int | String (from the wildcard).

But it could calculate the precise return type, String, as it already knows the types being handled by the wildcard match, which it would report as failures for an exhaustivity warning. It could union these with the return types of each other case clause.

Note that if there is no wildcard case in the match block, the typechecker handles this correctly: by design it still infers a LUB, but it will typecheck the more precise union type when it is ascribed explicitly.
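
For instance, something like this typechecks today once the union is ascribed explicitly, even though plain inference would pick a LUB (a sketch; describe is just an illustrative name):

def describe(x: Int | String | Double): Int | String = x match {
  case i: Int    => i
  case s: String => s
  case _: Double => "a double"
}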

I see a particularly strong potential use case for this in error handling. For example, this,

def aggregateErrs(data: Either[FileError | DiskError | ValidationError, String]) =
  data.left.map {
    case FileError => IoError()
    case DiskError => IoError()
    case other => other
  }

could infer the type Either[IoError | ValidationError, String].
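
For comparison, the precise type can already be recovered by spelling out the remaining alternatives instead of a wildcard and ascribing the union result. A self-contained sketch, with stand-in error classes in place of the ones above:

case class FileError()
case class DiskError()
case class ValidationError()
case class IoError()

def aggregateErrs(data: Either[FileError | DiskError | ValidationError, String])
    : Either[IoError | ValidationError, String] =
  data.left.map {
    case _: FileError       => IoError()
    case _: DiskError       => IoError()
    case v: ValidationError => v
  }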

(The choice between automatically inferring a LUB or a union type, in the absence of an explicit ascription, is relevant in this example, but is not the issue I'm raising here. However in this example, my preference would be for the union type to be inferred.)

smarter (Member) commented Nov 12, 2018

But it could calculate the precise return type, String, as it already knows the types being handled by the wildcard match, which it would report as failures for an exhaustivity warning

This is true assuming that exhaustivity checking is perfect, which is what we strive for, but it's not clear that this is achievable in the presence of GADTs: #4029

Of course, exhaustivity checking is easier when the selector is a union type, but then we'd be introducing a special-case in the language where the semantics of a wildcard pattern depend on what you're pattern matching against. Personally, I don't think that's worth it.

(The choice between automatically inferring a LUB or a union type, in the absence of an explicit ascription, is relevant in this example, but is not the issue I'm raising here. However in this example, my preference would be for the union type to be inferred.)

Yeah, we still haven't figured out a good rule here; see the "Type Inference" section of http://dotty.epfl.ch/docs/reference/union-types-spec.html for a recap of the situation.
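
For reference, a minimal sketch of the current behaviour that section recaps, as I understand it (the exact widened type depends on the compiler version):

val b: Boolean = true
val widened = if (b) 42 else "forty-two"            // inferred type is the union's join, not Int | String
val kept: Int | String = if (b) 42 else "forty-two" // an explicit ascription keeps the union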

propensive (Contributor, Author) commented

Thanks for the feedback, @smarter! I think there would be pretty big advantages to being able to treat union types as a direct type-level representation of a set, with support for removals. (And the last part is something that Scala 2 can't do.)

But I get the point about GADTs. As I understand it, though, the special casing would be the same special-casing that already applies to the exhaustivity warnings. There would always be the fallback of the current behavior.

Anyway, I'll try to find a moment to ask @odersky about it at SBTB this week.

Blaisorblade (Contributor) commented

As I understand it, though, the special casing would be the same special-casing that already applies to the exhaustivity warnings. There would always be the fallback of the current behavior.

Not quite. The options are:

  1. keep current behavior: other has the scrutinee's type. OK!
  2. always compute the type of other through the exhaustivity checker: that risks unsoundness if the exhaustivity checker is incomplete. NOT OK!
  3. have a different set of exhaustivity-checking-like rules to compute the type of other, that only work for union types but follow current rules for GADTs. CONFUSING?! (And IIUC that's @smarter's point).
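
To make the difference concrete, here is the original example again, annotated with the type other would get under each option (a sketch, not actual compiler output for options 2 and 3):

def unify(x: Int | String) = x match {
  case i: Int => "one"
  case other  => other
  // option 1 (today): other: Int | String, so the result is inferred as Any (the LUB)
  // options 2 and 3:  other: String,       so the result could be inferred as String
}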

OTOH, if you look at languages with union types where your example works, maybe option 3 seems more attractive, tho all the issues stand.

There is a fourth option, MUCH harder to do in general (and pretty unlikely to happen): make subtype checking detect when it loses completeness. That could affect both exhaustivity warnings and typechecking in the same way. But it would both complicate typechecking (in a performance-critical method, of all places) and increase the risks to soundness.

LPTK (Contributor) commented Nov 14, 2018

Here's what to strive for IMHO: the way TypeScript handles these things is really powerful but also really natural. It makes handling strongly-typed JSON data a breeze, modelling it as intersections and unions of structural types with literal types:

type Doc = FeatureDoc & Feature2Doc
type FeatureDoc =
  { has_feature: true; feature_settings: FeatureSetting } | { has_feature: false }
type Feature2Doc =
  { has_feature2: true; feature_settings2: FeatureSetting } | { has_feature2: false }
type FeatureSetting = { key: string }

function test(d: Doc) {
    switch (d.has_feature) {
        case true:
            console.log(d.feature_settings.key)
            break
        case false:
            // console.log(d.feature_settings.key)
            // ^ type error: Property 'feature_settings' does not exist on this type
            break
    }
}

test({ has_feature: true, feature_settings: { key: "ok" }, has_feature2: false })

// test({ has_feature: true, feature_settings: { key: "ok" }, has_feature2: true })
// ^ type error: Argument is not assignable to parameter of type 'Doc'

Blaisorblade (Contributor) commented

@LPTK I remember you mentioned TypeScript (tho I wasn't sure of the details). What this feature actually asks to support, though, is closer to:

function test(d: Doc) {
    switch (d.has_feature) {
        case false:
            // console.log(d.feature_settings.key)
            // ^ type error: Property 'feature_settings' does not exist on this type
            break
        default:
            console.log(d.feature_settings.key)
    }
}

(and I can easily believe TypeScript could support this).

My answer to all this is:

OTOH, if you look at languages with union types where your example works, maybe option 3 seems more attractive, tho all the issues stand.

In any case, I'm just analyzing tradeoffs, not making a call.

==

Also, we don't have the basic "dependent typing" you describe, and that surely belongs to a separate issue, or more likely a Pre-SIP on the contributors forum. It seems potentially cool for JSON, but it also seems to overlap even more with enums and case classes, and it's not obviously easy to add to Dotty (it could be comparable to GADTs in complexity).
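
To show what I mean by the overlap: the closest thing one can write today is probably an enum (or sealed hierarchy) with ordinary pattern matching, which already gives the narrowing. A rough, hypothetical analogue of the FeatureDoc part of the example:

case class FeatureSetting(key: String)

enum FeatureDoc {
  case WithFeature(featureSettings: FeatureSetting)
  case WithoutFeature
}

def test(d: FeatureDoc): Unit = d match {
  case FeatureDoc.WithFeature(settings) => println(settings.key)
  case FeatureDoc.WithoutFeature        => () // no feature settings to access here
}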

I wonder a bit if it's interesting for @OlivierBlanvillain and @gsps's explorations.

abgruszecki (Contributor) commented

Something to take into consideration: assume, for the sake of argument, that we do special-case the type of a "trailing wildcard" when matching on a union. What happens if we match on a case class that has a union field? On a tuple that contains a union? On a tuple that contains multiple unions? It may be that special-casing a typing rule when matching on a union will be confusing and inconsistent, and solving the general case without losing too much performance will be non-trivial.

Could this be better solved with a helper method instead? The best I can come up with on the spot is something like:

val union: Int | String | Float | Double = ???
union
  .extract[Int](i => "int")
  .extract[Float](f => "float") : Either[String | Double, String]

But there may be a better way to write this.
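
For what it's worth, here is roughly what a non-generic version of such a helper looks like for the union above; the hard part is making extract generic in the extracted member, since that needs a type-level way to subtract one alternative from a union (extractInt is a hypothetical name, not an existing API):

def extractInt[B](x: Int | String | Float | Double)(f: Int => B): Either[String | Float | Double, B] =
  x match {
    case i: Int        => Right(f(i))
    case other: String => Left(other)
    case other: Float  => Left(other)
    case other: Double => Left(other)
  }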
