Skip to content

Commit 6d16acf

Browse files
authored
Merge pull request scala#2 from sjrd/spec-union-types
Integrate the spec of union and intersection types.
2 parents d1bce3a + 52ed944 commit 6d16acf

File tree

8 files changed

+178
-362
lines changed

8 files changed

+178
-362
lines changed

docs/_spec/03-types.md

Lines changed: 90 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -8,13 +8,9 @@ chapter: 3
88

99
```ebnf
1010
Type ::= FunctionArgTypes ‘=>’ Type
11-
| InfixType [ExistentialClause]
11+
| InfixType
1212
FunctionArgTypes ::= InfixType
1313
| ‘(’ [ ParamType {‘,’ ParamType } ] ‘)’
14-
ExistentialClause ::= ‘forSome’ ‘{’ ExistentialDcl
15-
{semi ExistentialDcl} ‘}’
16-
ExistentialDcl ::= ‘type’ TypeDcl
17-
| ‘val’ ValDcl
1814
InfixType ::= CompoundType {id [nl] CompoundType}
1915
CompoundType ::= AnnotType {‘with’ AnnotType} [Refinement]
2016
| Refinement
@@ -325,12 +321,64 @@ An _infix type_ ´T_1´ `op` ´T_2´ consists of an infix operator `op` which ge
325321
The type is equivalent to the type application `op`´[T_1, T_2]´.
326322
The infix operator `op` may be an arbitrary identifier.
327323

328-
All type infix operators have the same precedence; parentheses have to be used for grouping.
329-
The [associativity](06-expressions.html#prefix,-infix,-and-postfix-operations) of a type operator is determined as for term operators: type operators ending in a colon ‘:’ are right-associative; all other operators are left-associative.
324+
Type operators follow the same [precedence and associativity as term operators](06-expressions.html#prefix-infix-and-postfix-operations).
325+
For example, `A + B * C` parses as `A + (B * C)` and `A | B & C` parses as `A | (B & C)`.
326+
Type operators ending in a colon ‘:’ are right-associative; all other operators are left-associative.
330327

331328
In a sequence of consecutive type infix operations ´t_0 \, \mathit{op} \, t_1 \, \mathit{op_2} \, ... \, \mathit{op_n} \, t_n´, all operators ´\mathit{op}\_1, ..., \mathit{op}\_n´ must have the same associativity.
332329
If they are all left-associative, the sequence is interpreted as ´(... (t_0 \mathit{op_1} t_1) \mathit{op_2} ...) \mathit{op_n} t_n´, otherwise it is interpreted as ´t_0 \mathit{op_1} (t_1 \mathit{op_2} ( ... \mathit{op_n} t_n) ...)´.
333330

331+
The type operators `|` and `&` are not really special.
332+
Nevertheless, unless shadowed, they resolve to `scala.|` and `scala.&`, which represent [union and intersection types](#union-and-intersection-types), respectively.
333+
334+
### Union and Intersection Types
335+
336+
Syntactically, the types `S | T` and `S & T` are infix types, where the infix operators are `|` and `&`, respectively (see above).
337+
338+
However, in this specification, ´S | T´ and ´S & T´ refer to the underlying core concepts of *union and intersection types*, respectively.
339+
340+
- The type ´S | T´ represents the set of values that are represented by *either* ´S´ or ´T´.
341+
- The type ´S & T´ represents the set of values that are represented by *both* ´S´ and ´T´.
342+
343+
From the [conformance rules](#conformance) rules on union and intersection types, we can show that ´&´ and ´|´ are *commutative* and *associative*.
344+
Moreover, `` is distributive over ``.
345+
For any type ´A´, ´B´ and ´C´, all of the following relationships hold:
346+
347+
- ´A & B \equiv B & A´,
348+
- ´A | B \equiv B | A´,
349+
- ´(A & B) & C \equiv A & (B & C)´,
350+
- ´(A | B) | C \equiv A | (B | C)´, and
351+
- ´A & (B | C) \equiv (A & B) | (A & C)´.
352+
353+
If ´C´ is a type constructor, then ´C[A] & C[B]´ can be simplified using the following three rules:
354+
355+
- If ´C´ is covariant, ´C[A] & C[B] \equiv C[A & B]´
356+
- If ´C´ is contravariant, ´C[A] & C[B] \equiv C[A | B]´
357+
- If ´C´ is invariant, emit a compile error
358+
359+
From the above rules, we can derive the following conformance relationships:
360+
361+
- When ´C´ is covariant, ´C[A & B] <: C[A] & C[B]´.
362+
- When ´C´ is contravariant, ´C[A | B] <: C[A] & C[B]´.
363+
364+
#### Join of a union type
365+
366+
In some situations, a union type might need to be widened to a non-union type.
367+
For this purpose, we define the _join_ of a union type ´T_1 | ... | T_n´ as the smallest intersection type of base class instances of ´T_1, ..., T_n´.
368+
Note that union types might still appear as type arguments in the resulting type, this guarantees that the join is always finite.
369+
370+
For example, given
371+
372+
```scala
373+
trait C[+T]
374+
trait D
375+
trait E
376+
class A extends C[A] with D
377+
class B extends C[B] with D with E
378+
```
379+
380+
The join of ´A | B´ is ´C[A | B] & D´
381+
334382
### Function Types
335383

336384
```ebnf
@@ -574,6 +622,8 @@ If ´T´ is a possibly parameterized class type, where ´T´'s class is defined
574622
1. The _member bindings_ of a type ´T´ are
575623
1. all bindings ´d´ such that there exists a type instance of some class ´C´ among the base types of ´T´ and there exists a definition or declaration ´d'´ in ´C´ such that ´d´ results from ´d'´ by replacing every type ´T'´ in ´d'´ by ´T'´ in ´C´ seen from ´T´, and
576624
2. all bindings of the type's [refinement](#compound-types), if it has one.
625+
2. The member bindinds of ´S & T´ are all the binds of ´S´ *and* all the bindins of ´T´.
626+
3. The member bindings of ´S | T´ are the member bindings of its [join](#join-of-a-union-type).
577627

578628
The _definition_ of a type projection `S#T` is the member binding ´d_T´ of the type `T` in `S`.
579629
In that case, we also say that `S#T` _is defined by_ ´d_T´.
@@ -630,6 +680,10 @@ The conformance relation ´(<:)´ is the smallest transitive relation that satis
630680
1. If the ´i´'th type parameter of ´T´ is declared neither covariant nor contravariant, then ´U_i \equiv T_i´.
631681
- A compound type `´T_1´ with ... with ´T_n´ {´R\,´}` conforms to each of its component types ´T_i´.
632682
- If ´T <: U_i´ for ´i \in \{ 1, ..., n \}´ and for every binding ´d´ of a type or value ´x´ in ´R´ there exists a member binding of ´x´ in ´T´ which subsumes ´d´, then ´T´ conforms to the compound type `´U_1´ with ... with ´U_n´ {´R\,´}`.
683+
- If ´T <: U´, then ´T <: U | W´ and ´T <: W | U´.
684+
- If ´T <: W´ and ´U <: W´, then ´T | U <: W´.
685+
- If ´T <: U´ and ´T <: W´, then ´T <: U & W´.
686+
- If ´T <: W´, then ´T & U <: W´ and ´U & T <: W´.
633687
- If ´T_i \equiv T_i'´ for ´i \in \{ 1, ..., n\}´ and ´U´ conforms to ´U'´ then the method type ´(p_1:T_1, ..., p_n:T_n) U´ conforms to ´(p_1':T_1', ..., p_n':T_n') U'´.
634688
- The polymorphic type ´[a_1 >: L_1 <: U_1, ..., a_n >: L_n <: U_n] T´ conforms to the polymorphic type ´[a_1 >: L_1' <: U_1', ..., a_n >: L_n' <: U_n'] T'´ if, assuming ´L_1' <: a_1 <: U_1', ..., L_n' <: a_n <: U_n'´ one has ´T <: T'´ and ´L_i <: L_i'´ and ´U_i' <: U_i´ for ´i \in \{ 1, ..., n \}´.
635689
- Type constructors ´T´ and ´T'´ follow a similar discipline.
@@ -648,28 +702,15 @@ A declaration or definition in some compound type of class type ´C´ _subsumes_
648702
- A type declaration `type ´t´[´T_1´, ..., ´T_n´] >: ´L´ <: ´U´` subsumes a type declaration `type ´t´[´T_1´, ..., ´T_n´] >: ´L'´ <: ´U'´` if ´L' <: L´ and ´U <: U'´.
649703
- A type or class definition that binds a type name ´t´ subsumes an abstract type declaration `type t[´T_1´, ..., ´T_n´] >: L <: U` if ´L <: t <: U´.
650704

651-
652705
#### Least upper bounds and greatest lower bounds
706+
653707
The ´(<:)´ relation forms pre-order between types, i.e. it is transitive and reflexive.
654708
This allows us to define _least upper bounds_ and _greatest lower bounds_ of a set of types in terms of that order.
655-
The least upper bound or greatest lower bound of a set of types does not always exist.
656-
For instance, consider the class definitions:
657-
658-
```scala
659-
class A[+T] {}
660-
class B extends A[B]
661-
class C extends A[C]
662-
```
663-
664-
Then the types `A[Any], A[A[Any]], A[A[A[Any]]], ...` form a descending sequence of upper bounds for `B` and `C`.
665-
The least upper bound would be the infinite limit of that sequence, which does not exist as a Scala type.
666-
Since cases like this are in general impossible to detect, a Scala compiler is free to reject a term which has a type specified as a least upper or greatest lower bound, and that bound would be more complex than some compiler-set limit [^4].
667709

668-
The least upper bound or greatest lower bound might also not be unique.
669-
For instance `A with B` and `B with A` are both greatest lower bounds of `A` and `B`.
670-
If there are several least upper bounds or greatest lower bounds, the Scala compiler is free to pick any one of them.
710+
- the _least upper bound_ of `A` and `B` is the smallest type `L` such that `A` <: `L` and `B` <: `L`.
711+
- the _greatest lower bound_ of `A` and `B` is the largest type `G` such that `G` <: `A` and `G` <: `B`.
671712

672-
[^4]: The current Scala compiler limits the nesting level of parameterization in such bounds to be at most two deeper than the maximum nesting level of the operand types
713+
By construction, for all types `A` and `B`, the least upper bound of `A` and `B` is `A | B`, and their greatest lower bound is `A & B`.
673714

674715
### Weak Conformance
675716

@@ -747,7 +788,29 @@ The erasure mapping is defined as follows.
747788
- The erasure of a singleton type `´p´.type` is the erasure of the type of ´p´.
748789
- The erasure of a type projection `´T´#´x´` is `|´T´|#´x´`.
749790
- The erasure of a compound type `´T_1´ with ... with ´T_n´ {´R\,´}` is the erasure of the intersection dominator of ´T_1, ..., T_n´.
791+
- The erasure of a union type ´S | T´ is the _erased least upper bound_ (_elub_) of the erasures of ´S´ and ´T´.
792+
- The erasure of an intersection type ´S & T´ is the _eglb_ (erased greatest lower bound) of the erasures of ´S´ and ´T´.
793+
794+
The erased LUB is computed as follows:
795+
796+
- if both argument are arrays of objects, an array of the erased LUB of the element types
797+
- if both arguments are arrays of same primitives, an array of this primitive
798+
- if one argument is array of primitives and the other is array of objects, [`Object`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Object.html)
799+
- if one argument is an array, [`Object`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Object.html)
800+
- otherwise a common superclass or trait S of the argument classes, with the following two properties:
801+
- S is minimal: no other common superclass or trait derives from S, and
802+
- S is last: in the linearization of the first argument type ´|A|´ there are no minimal common superclasses or traits that come after S.
803+
The reason to pick last is that we prefer classes over traits that way, which leads to more predictable bytecode and (?) faster dynamic dispatch.
750804

751-
The _intersection dominator_ of a list of types ´T_1, ..., T_n´ is computed as follows.
752-
Let ´T_{i_1}, ..., T_{i_m}´ be the subsequence of types ´T_i´ which are not supertypes of some other type ´T_j´.
753-
If this subsequence contains a type designator ´T_c´ that refers to a class which is not a trait, the intersection dominator is ´T_c´. Otherwise, the intersection dominator is the first element of the subsequence, ´T_{i_1}´.
805+
The rules for ´eglb(A, B)´ are given below in pseudocode:
806+
807+
```
808+
eglb(scala.Array[A], JArray[B]) = scala.Array[eglb(A, B)]
809+
eglb(scala.Array[T], _) = scala.Array[T]
810+
eglb(_, scala.Array[T]) = scala.Array[T]
811+
eglb(A, B) = A if A extends B
812+
eglb(A, B) = B if B extends A
813+
eglb(A, _) = A if A is not a trait
814+
eglb(_, B) = B if B is not a trait
815+
eglb(A, _) = A // use first
816+
```

docs/_spec/08-pattern-matching.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -113,10 +113,10 @@ Taking XML as an example
113113
implicit class XMLinterpolation(s: StringContext) = {
114114
object xml {
115115
def apply(exprs: Any*) =
116-
// parse ‘s’ and build an XML tree with ‘exprs’
116+
// parse ‘s’ and build an XML tree with ‘exprs’
117117
//in the holes
118118
def unapplySeq(xml: Node): Option[Seq[Node]] =
119-
// match `s’ against `xml’ tree and produce
119+
// match `s’ against `xml’ tree and produce
120120
//subtrees in holes
121121
}
122122
}
@@ -503,7 +503,7 @@ Each case consists of a (possibly guarded) pattern ´p_i´ and a block ´b_i´.
503503
Each ´p_i´ might be complemented by a guard `if ´e´` where ´e´ is a boolean expression.
504504
The scope of the pattern variables in ´p_i´ comprises the pattern's guard and the corresponding block ´b_i´.
505505

506-
Let ´T´ be the type of the selector expression ´e´ and let ´a_1, ..., a_m´ be the type parameters of all methods enclosing the pattern matching expression.
506+
Let ´T´ be the type of the selector expression ´e´ and let ´a_1, ..., a_m´ be the type parameters of all methods enclosing the pattern matching expression.
507507
For every ´a_i´, let ´L_i´ be its lower bound and ´U_i´ be its higher bound.
508508
Every pattern ´p \in \{p_1,, ..., p_n\}´ can be typed in two ways.
509509
First, it is attempted to type ´p´ with ´T´ as its expected type.
@@ -537,7 +537,7 @@ In the interest of efficiency the evaluation of a pattern matching expression ma
537537
This might affect evaluation through side effects in guards.
538538
However, it is guaranteed that a guard expression is evaluated only if the pattern it guards matches.
539539

540-
If the selector of a pattern match is an instance of a [`sealed` class](05-classes-and-objects.html#modifiers), the compilation of pattern matching can emit warnings which diagnose that a given set of patterns is not exhaustive, i.e. that there is a possibility of a `MatchError` being raised at run-time.
540+
If the selector of a pattern match is an instance of a [`sealed` class](05-classes-and-objects.html#modifiers), a [union type](03-types#union-and-intersection-types), or a combination thereof, the compilation of pattern matching can emit warnings which diagnose that a given set of patterns is not exhaustive, i.e. that there is a possibility of a `MatchError` being raised at run-time.
541541

542542
###### Example
543543

@@ -569,7 +569,7 @@ def eval[T](t: Term[T]): T = t match {
569569

570570
Note that the evaluator makes crucial use of the fact that type parameters of enclosing methods can acquire new bounds through pattern matching.
571571

572-
For instance, the type of the pattern in the second case, `Succ(u)`, is `Int`.
572+
For instance, the type of the pattern in the second case, `Succ(u)`, is `Int`.
573573
It conforms to the selector type `T` only if we assume an upper and lower bound of `Int` for `T`.
574574
Under the assumption `Int <: T <: Int` we can also verify that the type right hand side of the second case, `Int` conforms to its expected type, `T`.
575575

docs/_spec/A2-scala-2-compatibility.md

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -32,4 +32,9 @@ def f(): Unit = { ... }
3232
Scala 3 accepts the old syntax under the `-source:3.0-migration` option.
3333
If the `-migration` option is set, it can even rewrite old syntax to new.
3434
The [Scalafix](https://scalacenter.github.io/scalafix/) tool also
35-
can rewrite procedure syntax to make it Scala 3 compatible.
35+
can rewrite procedure syntax to make it Scala 3 compatible.
36+
37+
## Compound Types (`with`)
38+
39+
Intersection types `A & B` replace compound types `A with B` in Scala 2.
40+
For the moment, the syntax `A with B` is still allowed and interpreted as `A & B`, but its usage as a type (as opposed to in a `new` or `extends` clause) will be deprecated and removed in the future.

docs/_spec/TODOreference/new-types/intersection-types.md renamed to docs/_spec/APPLIEDreference/new-types/intersection-types.md

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -63,6 +63,3 @@ to give at that point a definition of a `children` method with the required type
6363
class C extends A, B:
6464
def children: List[A & B] = ???
6565
```
66-
67-
68-
[More details](./intersection-types-spec.md)
Lines changed: 77 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,77 @@
1+
---
2+
layout: doc-page
3+
title: "Union Types"
4+
nightlyOf: https://docs.scala-lang.org/scala3/reference/new-types/union-types.html
5+
---
6+
7+
A union type `A | B` has as values all values of type `A` and also all values of type `B`.
8+
9+
10+
```scala
11+
case class UserName(name: String)
12+
case class Password(hash: Hash)
13+
14+
def help(id: UserName | Password) =
15+
val user = id match
16+
case UserName(name) => lookupName(name)
17+
case Password(hash) => lookupPassword(hash)
18+
...
19+
```
20+
21+
Union types are duals of intersection types. `|` is _commutative_:
22+
`A | B` is the same type as `B | A`.
23+
24+
The compiler will assign a union type to an expression only if such a
25+
type is explicitly given. This can be seen in the following [REPL](https://docs.scala-lang.org/overviews/repl/overview.html) transcript:
26+
27+
```scala
28+
scala> val password = Password(123)
29+
val password: Password = Password(123)
30+
31+
scala> val name = UserName("Eve")
32+
val name: UserName = UserName(Eve)
33+
34+
scala> if true then name else password
35+
val res2: Object = UserName(Eve)
36+
37+
scala> val either: Password | UserName = if true then name else password
38+
val either: Password | UserName = UserName(Eve)
39+
```
40+
41+
The type of `res2` is `Object & Product`, which is a supertype of
42+
`UserName` and `Password`, but not the least supertype `Password |
43+
UserName`. If we want the least supertype, we have to give it
44+
explicitly, as is done for the type of `either`.
45+
46+
## Type inference
47+
48+
When inferring the result type of a definition (`val`, `var`, or `def`) and the
49+
type we are about to infer is a union type, then we replace it by its join.
50+
Similarly, when instantiating a type argument, if the corresponding type
51+
parameter is not upper-bounded by a union type and the type we are about to
52+
instantiate is a union type, we replace it by its join. This mirrors the
53+
treatment of singleton types which are also widened to their underlying type
54+
unless explicitly specified. The motivation is the same: inferring types
55+
which are "too precise" can lead to unintuitive typechecking issues later on.
56+
57+
**Note:** Since this behavior limits the usability of union types, it might
58+
be changed in the future. For example by not widening unions that have been
59+
explicitly written down by the user and not inferred, or by not widening a type
60+
argument when the corresponding type parameter is covariant.
61+
62+
See [PR #2330](https://github.com/lampepfl/dotty/pull/2330) and
63+
[Issue #4867](https://github.com/lampepfl/dotty/issues/4867) for further discussions.
64+
65+
### Example
66+
67+
```scala
68+
import scala.collection.mutable.ListBuffer
69+
val x = ListBuffer(Right("foo"), Left(0))
70+
val y: ListBuffer[Either[Int, String]] = x
71+
```
72+
73+
This code typechecks because the inferred type argument to `ListBuffer` in the
74+
right-hand side of `x` was `Left[Int, Nothing] | Right[Nothing, String]` which
75+
was widened to `Either[Int, String]`. If the compiler hadn't done this widening,
76+
the last line wouldn't typecheck because `ListBuffer` is invariant in its
77+
argument.

0 commit comments

Comments
 (0)