Skip to content

Commit a01d23f

Browse files
committed
Add some overview notes on the GADT implementation
Authored by Aleksander Boruch-Gruszecki. The intent here is to start to have some more detailed information on the implementation, even in an initially rough form that can be refined over time.
1 parent e2e2885 commit a01d23f

File tree

2 files changed

+206
-0
lines changed

2 files changed

+206
-0
lines changed

docs/_docs/internals/gadts.md

Lines changed: 205 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,205 @@
1+
# GADTs - Broad overview
2+
3+
There are multiple levels to the implementation. They deal with slightly different problems. The most important levels are the following ones:
4+
5+
1. Figuring out which relationships are necessary (PatternTypeConstrainer)
6+
2. Breaking down the relationships (TypeComparer)
7+
3. Recording and bookkeeping of relationships (GadtConstraint)
8+
4. Looking up the information (TypeComparer, GadtConstraint)
9+
10+
Out of the levels above, 4. is by far the simplest one.
11+
12+
There are also other parts to supporting GADTs. Roughly in order of importance, they are:
13+
14+
1. At specific points, abstract types are added to GadtConstraint. For instance, when entering a method, we add all type parameters to GadtConstraint.
15+
2. Casts need to be inserted when GADTs were used for type comparison.
16+
1. `TypeComparer` keeps track of whether a GADT constraint was used in a mutable variable `usedGadt`.
17+
3. GADT constraints need to be stored in attachments to be used in PostTyper.
18+
1. Attachment key is named `inferredGadtConstraints`.
19+
4. When we select members on a type that may have GADT constraints, we perform special "healing" by approximating the type using those constraints. We cannot take the constraints into account because member lookup is cached, and GADT constraints are only valid for specific scopes.
20+
21+
# Useful widgets
22+
23+
## Expr
24+
25+
This is the classical GADT example:
26+
27+
```scala
28+
enum Expr[T] {
29+
case IntLit(value: Int) extends Expr[Int]
30+
case BoolLit(value: Boolean) extends Expr[Boolean]
31+
case IfExpr(
32+
cond: Expr[Boolean],
33+
when: Expr[T],
34+
otherwise: Expr[T],
35+
)
36+
}
37+
```
38+
39+
## EQ
40+
41+
The following enum will result in an equality constraint between `S` and `T` if we match on it:
42+
43+
```scala
44+
enum EQ[S, T] {
45+
case Refl[U]() extends EQ[U, U]
46+
}
47+
```
48+
49+
## SUB
50+
51+
The following enum will result in a subtyping constraint `S <: T` if we match on it:
52+
53+
```scala
54+
enum SUB[-S, +T] {
55+
case Refl[U]() extends SUB[U, U]
56+
}
57+
```
58+
59+
# Details of above
60+
61+
## What abstract types can have GADT constraints
62+
63+
Right now, we record GADT constraints for:
64+
65+
- function/method type parameters
66+
- class type parameters
67+
68+
There is a branch on the way which will also record them for type members (so path-dependent types) and singleton types. It has a paper associated: "Implementing path-depepdent GADTs for Scala 3".
69+
70+
## What are necessary relationships? Any examples?
71+
72+
### Covariance means no constraint is necessary
73+
74+
Standard (non-case) classes allow "strange" inheritance which means that we cannot infer any information from covariant type parameters.
75+
76+
```scala
77+
class Expr[+A]
78+
class IntList extends Expr[List[Int]]
79+
80+
def foo[T](e: Expr[List[T]]): T =
81+
e match {
82+
case _ : IntList =>
83+
// e : Expr[List[Int]]
84+
// T <: Int
85+
0
86+
}
87+
88+
class Weird(list: List[String]) extends IntList with Expr[Nothing]
89+
```
90+
91+
Case classes have a special check which disallows inheritance like `Weird`. This means we can infer extra information from them.
92+
93+
## Breaking down the constraints
94+
95+
```scala
96+
class Expr[A]
97+
class IntList extends Expr[List[Int]]
98+
99+
def foo[T](e: Expr[List[T]]): T =
100+
e match {
101+
case _ : IntList =>
102+
// Level 1:
103+
// We start with e : Expr[List[T]]
104+
// We check that e : IntList <: Expr[List[Int]
105+
// Expr is invariant,
106+
// so we have List[Int] <: List[T] , List[T] <: List[Int]
107+
// Level 2:
108+
// We compare List[Int] <: List[T]
109+
// We record Int <: T
110+
// We compare List[T] <: List[Int]
111+
// We record T <: Int
112+
0
113+
}
114+
```
115+
116+
## Relation betweeen GadtConstraint and OrderingConstraint
117+
118+
### Internal and external types
119+
120+
GadtConstraint uses OrderingConstraint as the datastructure to record information about GADT constraints.
121+
122+
OrderingConstraint only supports working with TypeParamRefs.
123+
124+
GadtConstraint wants to record information about things other than TypeParamRefs.
125+
126+
To solve this, GadtConstraint internally creates TypeParamRefs which it adds to the OrderingConstraint it keeps internally. Whenever a GADT constraint is added, we "internalize" the type by replacing all the external types with the internal TypeParamRefs. Whenever we take bound information out of the GADT constraint, we need to "externalize" the types by replacing the internal TypeParamRefs with their external versions. To implement this, GadtConstraint keeps a bidirectional mapping between the external types and the internal TypeParamRefs.
127+
128+
The TypeParamRefs and TypeVars registered in one constraint cannot ever be present in types mentioned in the other type constraint. The internal TypeParamRefs and TypeVars cannot ever leak out of the GadtConstraint. We cannot ever record a bound in GadtConstraint which mentions TypeParamRefs used for type inference. (That part is ensured by the way TypeComparer is organised &#x2013; we will always try to record bounds in the "normal" constraint before recording a GADT bound.)
129+
130+
# Other details
131+
132+
## TypeComparer approximations
133+
134+
TypeComparer sometimes approximates the types it compares. Let's see an example based on these definitions:
135+
136+
```scala
137+
class Expr[T]
138+
class IntList extends Expr[Int]
139+
```
140+
141+
when comparing if `IntList <: Expr[Int]`, `TypeComparer` will approximate `IntList` to `Expr[Int]`. Then it will compare `Expr[Int] <: Expr[Int]` with appropriate variables set.
142+
143+
The variables which TypeComparer sets are `approxState` and `frozenGadt`.
144+
145+
## Necessary/sufficient either
146+
147+
TypeComparer sometimes needs to approximate some constraints, specifically when dealing with intersection and union types. The way this approximation works changes if we're currently inferring GADT constraints. This is hopefully documented well in TypeComparer in doc comments for `necessaryEither` and `sufficientEither`.
148+
149+
## Types bound in patterns
150+
151+
```scala
152+
(list : List[Int]) match {
153+
case lst : List[a] =>
154+
// a is a new type bound in the pattern.
155+
// We do not record any information about a.
156+
// We should know that a <: Int.
157+
// (Or it's fine to just have a =:= Int.)
158+
// We would not have this issue if we used a custom unapply.
159+
// Type case patterns create a fresh symbol even if they shouldn't.
160+
// (See indexPattern in Typer.)
161+
}
162+
```
163+
164+
## Internal structure of OrderingConstraint
165+
166+
Imagine we have two type parameters in scope, `A` and `B`.
167+
168+
We could record the following in `ctx.gadt`:
169+
170+
```text
171+
A <: Int
172+
B <: A
173+
```
174+
175+
Then, we expect that calling ``ctx.gadt.bounds(`B`)`` will return `` `<: Int` ``.
176+
177+
In order to handle this, `GadtConstraint` relies on `OrderingConstraint`. Internally, it will represent the above constraints as follows:
178+
179+
```text
180+
A <: Int
181+
B <: Int
182+
B <: A
183+
```
184+
185+
The first two constraints are "entries" &#x2013; they are easy to look up whenever we ask for bounds of `A` or `B`. The third constraint is an ordering &#x2013; it helps with correctly propagating the bounds we record.
186+
187+
# Possible broad improvements
188+
189+
## Allow OrderingConstraint to record bounds for things other than TypeParamRefs
190+
191+
This would mean we no longer need to keep the bidirectional mapping in GadtConstraint.
192+
193+
## Not mixing OrderingConstraint and ConstraintHandling in GadtConstraint
194+
195+
GadtConstraint right now mixes OrderingConstraint and ConstraintHandling. The first one is supposed to be the immutable constraint datastructure. The second one implements mutable functionality around a variable containing the immutable datastructure.
196+
197+
GadtConstraint mixes them both. Things would be better organised if GadtConstraint was split like the normal constraint.
198+
199+
## Creating a separate TypeComparer for breaking down types into GADT constraints
200+
201+
TypeComparer is biased towards one specific way of approximating constraints. When we infer types, it's ok to be "optimistic". When inferring GADT constraints, we should be as pessimistic as possible, in order to only infer constraints which are necessary.
202+
203+
TypeComparer was originally written to support type inference and GADTs were only added later on. This means that the "default" way TypeComparer approximates wasn't even noticeable before, but when inferring GADT constraints could result in inferring unsound information.
204+
205+
We could potentially fix this by creating a special TypeComparer which would *only* "break down" subtype relationships to record GADT constraints.

docs/sidebar.yml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -193,5 +193,6 @@ subsection:
193193
- page: internals/type-system.md
194194
- page: internals/dotty-internals-1-notes.md
195195
- page: internals/debug-macros.md
196+
- page: internals/gadts.md
196197
- page: release-notes-0.1.2.md
197198
hidden: true

0 commit comments

Comments
 (0)