| 1 | +--- |
| 2 | +layout: doc-page |
| 3 | +title: "Explicit Nulls" |
| 4 | +--- |
| 5 | + |
| 6 | +The "explicit nulls" feature (enabled via a flag) changes the Scala type hierarchy |
| 7 | +so that reference types (e.g. `String`) are non-nullable. We can still express nullability |
| 8 | +with union types: e.g. `val x: String|Null = null`. |
| 9 | + |
| 10 | +The implementation of the feature in dotty can be conceptually divided into four parts: |
| 11 | + 1. changes to the type hierarchy so that `Null` is only a subtype of `Any` |
| 12 | + 2. a "translation layer" for Java interop that exposes the nullability in Java APIs |
| 13 | + 3. a "magic" `JavaNull` type (an alias for `Null`) that is recognized by the compiler and |
| 14 | + allows unsound member selections (trading soundness for usability) |
| 15 | + 4. a module for "flow typing", so we can work more naturally with nullable values |
| 16 | + |
| 17 | +Feature Flag |
| 18 | +------------ |
| 19 | +Explicit nulls are disabled by default. They can be enabled via `-Yexplicit-nulls` defined in |
| 20 | +`ScalaSettings.scala`. All of the explicit-nulls-related changes should be gated behind the flag. |
| 21 | + |
| 22 | +Type Hierarchy |
| 23 | +-------------- |
| 24 | +We change the type hierarchy so that `Null` is only a subtype of `Any` by: |
| 25 | + - modifying the notion of what is a nullable class (`isNullableClass`) in `SymDenotations` |
| 26 | + to include _only_ `Null` and `Any` |
| 27 | + - changing the parent of `Null` in `Definitions` to point to `Any` and not `AnyRef` |
| 28 | + - changing `isBottomType` and `isBottomClass` in `Definitions` |
| 29 | + |
| 30 | +Java Interop |
| 31 | +------------ |
| 32 | +The problem we're trying to solve here is: if we see a Java method `String foo(String)`, |
| 33 | +what should that method look like to Scala? |
| 34 | + - since we should be able to pass `null` into Java methods, the argument type should be `String|JavaNull` |
| 35 | + - since Java methods might return `null`, the return type should be `String|JavaNull` |
| 36 | + |
| 37 | +`JavaNull` here is a type alias for `Null` with "magic" properties (see below). |
| 38 | + |
| 39 | +At a high-level: |
| 40 | + - we track the loading of Java fields and methods as they're loaded by the compiler |
| 41 | + - we do this in two places: `Namer` (for Java sources) and `ClassFileParser` (for bytecode) |
| 42 | + - whenever we load a Java member, we "nullify" its argument and return types |
| 43 | + |
| 44 | +The nullification logic lives in `JavaNullInterop.scala`, a new file. |
| 45 | + |
| 46 | +The entry point is the function `def nullifyMember(sym: Symbol, tp: Type)(implicit ctx: Context): Type` |
| 47 | +which, given a symbol and its "regular" type, produces what the type of the symbol should be in the |
| 48 | +explicit nulls world. |
| 49 | + |
| 50 | +In order to nullify a member, we first pass it through a "whitelist" of symbols that need |
| 51 | +special handling (e.g. constructors, which never return `null`). If none of the "policies" in the |
| 52 | +whitelist apply, we then process the symbol with a `TypeMap` that implements the following nullification |
| 53 | +function `n`: |
| 54 | + 1. n(T) = T|JavaNull if T is a reference type |
| 55 | + 2. n(T) = T if T is a value type |
| 56 | + 3. n(C[T]) = C[T]|JavaNull if C is Java-defined |
| 57 | + 4. n(C[T]) = C[n(T)]|JavaNull if C is Scala-defined |
| 58 | + 5. n(A|B) = n(A)|n(B)|JavaNull |
| 59 | + 6. n(A&B) = n(A) & n(B) |
| 60 | + 7. n((A1, ..., Am)R) = (n(A1), ..., n(Am))n(R) for a method with arguments (A1, ..., Am) and return type R |
| 61 | + 8. n(T) = T otherwise |
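The eight rules above can be sketched on a toy type model. This is an illustration only: the `Ty` enum and its case names are hypothetical stand-ins for the compiler's `Type` hierarchy, and the real logic is a `TypeMap` in `JavaNullInterop.scala`.

```scala
// Hypothetical toy model of types; NOT the compiler's real `Type` classes.
enum Ty:
  case Ref(name: String)                                 // reference type, e.g. String
  case Val(name: String)                                 // value type, e.g. Int
  case App(ctor: String, javaDefined: Boolean, arg: Ty)  // applied type C[T]
  case Or(l: Ty, r: Ty)                                  // A|B
  case And(l: Ty, r: Ty)                                 // A&B
  case Method(params: List[Ty], result: Ty)              // (A1, ..., Am)R
  case JavaNull
  case Other(name: String)                               // anything else

import Ty.*

// The nullification function `n`, one case per rule in the list above.
def n(t: Ty): Ty = t match
  case Ref(_)             => Or(t, JavaNull)                      // rule 1
  case Val(_)             => t                                    // rule 2
  case App(_, true, _)    => Or(t, JavaNull)                      // rule 3
  case App(c, false, arg) => Or(App(c, false, n(arg)), JavaNull)  // rule 4
  case Or(a, b)           => Or(Or(n(a), n(b)), JavaNull)         // rule 5
  case And(a, b)          => And(n(a), n(b))                      // rule 6
  case Method(ps, r)      => Method(ps.map(n), n(r))              // rule 7
  case _                  => t                                    // rule 8
```

For the Java method `String foo(String)`, `n(Method(List(Ref("String")), Ref("String")))` produces a method type whose parameter and result are both `String|JavaNull`.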
| 62 | + |
| 63 | +JavaNull |
| 64 | +-------- |
| 65 | +`JavaNull` is just an alias for `Null`, but with magic power. `JavaNull`'s magic (anti-)power is that |
| 66 | +it's unsound. |
| 67 | + |
| 68 | +```scala |
| 69 | +val s: String|JavaNull = "hello" |
| 70 | +s.length // allowed, but might throw NPE |
| 71 | +``` |
| 72 | + |
| 73 | +`JavaNull` is defined as `JavaNullAlias` in `Definitions`. |
| 74 | +The logic to allow member selections is defined in `findMember` in `Types.scala`: |
| 75 | + - if we're finding a member in a type union |
| 76 | + - and the union contains `JavaNull` on the r.h.s. after normalization (see below) |
| 77 | + - then we can continue with `findMember` on the l.h.s. of the union (as opposed to failing) |
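The member-selection rule above can be sketched on a toy model. Everything here is hypothetical (the `Ty` enum, `hasMember`, and in particular the rule used for ordinary unions, which is assumed for contrast); the real logic lives in `findMember` in `Types.scala`.

```scala
// Hypothetical toy model; not the compiler's data structures.
enum Ty:
  case Cls(name: String, members: Set[String])  // a class and its member names
  case NullTy                                   // plain `Null`
  case JavaNull                                 // the "magic" alias
  case Or(lhs: Ty, rhs: Ty)                     // union, assumed already normalized

import Ty.*

def hasMember(tp: Ty, name: String): Boolean = tp match
  case Cls(_, members)   => members.contains(name)
  // Unsound shortcut: ignore `JavaNull` on the r.h.s. and look only at the l.h.s.
  case Or(lhs, JavaNull) => hasMember(lhs, name)
  // Assumed rule for ordinary unions: the member must exist on both sides.
  case Or(lhs, rhs)      => hasMember(lhs, name) && hasMember(rhs, name)
  case NullTy | JavaNull => false
```

With `val str = Cls("String", Set("length"))`, selecting `length` succeeds on `Or(str, JavaNull)` but fails on `Or(str, NullTy)`.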
| 78 | + |
| 79 | +Working with Nullable Unions |
| 80 | +---------------------------- |
| 81 | +Within `Types.scala`, we defined a few utility methods to work with nullable unions. All of these |
| 82 | +are methods of the `Type` class, so call them with `this` as a receiver: |
| 83 | + - `isNullableUnion` determines whether `this` is a nullable union. Here, what constitutes |
| 84 | + a nullable union is determined purely syntactically: |
| 85 | + 1. first we "normalize" `this` (see below) |
| 86 | + 2. if the result is of the form `T | Null`, then the type is considered a nullable union. |
| 87 | + Otherwise, it isn't. |
| 88 | + - `isJavaNullableUnion` determines whether `this` is syntactically a union of the form `T|JavaNull` |
| 89 | + - `normNullableUnion` normalizes `this` as follows: |
| 90 | + 1. if `this` is not a nullable union, it's returned unchanged. |
| 91 | + 2. if `this` is a union, then it's re-arranged so that all the `Null`s are to the right of all |
| 92 | + the non-`Null`s. |
| 93 | + - `stripNull` syntactically strips nullability from `this`: e.g. `String|Null => String`. Notice this |
| 94 | + works only at the "top level": e.g. if we have an `Array[String|Null]|Null` and we call `stripNull` |
| 95 | + we'll get `Array[String|Null]` (only the outermost nullable union was removed). |
| 96 | + - `stripAllJavaNull` is like `stripNull` but removes _all_ nullable unions in the type (and only works |
| 97 | + for `JavaNull`). This is needed when we want to "revert" the Java nullification function. |
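A minimal sketch of how the utilities above might behave, written against a hypothetical toy `Ty` enum rather than the compiler's `Type` class. The helper `members` and the exact re-association of the union are assumptions made for illustration.

```scala
// Hypothetical toy model of types.
enum Ty:
  case Named(name: String)
  case NullTy
  case Or(l: Ty, r: Ty)
  case Arr(elem: Ty)

import Ty.*

// Flattens a (possibly nested) union into its top-level members.
def members(t: Ty): List[Ty] = t match
  case Or(l, r) => members(l) ++ members(r)
  case _        => List(t)

// Moves all top-level `NullTy`s to the right of the non-null members.
def normNullableUnion(t: Ty): Ty =
  val (nulls, others) = members(t).partition(_ == NullTy)
  if nulls.isEmpty || others.isEmpty then t
  else (others ++ nulls).reduceLeft[Ty]((a, b) => Or(a, b))

// Purely syntactic check: after normalization, is the type of the form `T | Null`?
def isNullableUnion(t: Ty): Boolean = normNullableUnion(t) match
  case Or(_, NullTy) => true
  case _             => false

// Strips nullability at the top level only; nested unions are untouched.
def stripNull(t: Ty): Ty =
  val others = members(t).filter(_ != NullTy)
  if others.isEmpty then t else others.reduceLeft[Ty]((a, b) => Or(a, b))
```

In this model, stripping `Array[String|Null]|Null` gives `Array[String|Null]`, matching the top-level behaviour described above.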
| 98 | + |
| 99 | +Flow Typing |
| 100 | +----------- |
| 101 | +Flow typing is needed so we can work with nullable unions in a more natural way. |
| 102 | +The following is a common idiom that should work without additional casts: |
| 103 | +```scala |
| 104 | +val x: String|Null = ??? |
| 105 | +if (x != null && x.length < 10) println(x) |
| 106 | +``` |
| 107 | +This is implemented as a "must be non-null in the current scope" analysis on stable paths: |
| 108 | + - we add additional state to the `Context` in `Contexts.scala`. |
| 109 | + Specifically, we add a set of `FlowFacts` (right now just a set of `TermRef`s), which |
| 110 | + are the paths known to be non-nullable in the current scope. |
| 111 | + - the bulk of the flow typing logic lives in a new `FlowTyper.scala` file. |
| 112 | + |
| 113 | + There are four entry points to `FlowTyper`: |
| 114 | + 1. `inferFromCond(cond: Tree): Inferred`: given a tree representing a condition such as |
| 115 | + `x != null && x.length < 10`, return the `Inferred` facts. |
| 116 | + |
| 117 | + In turn, `Inferred` is defined as `case class Inferred(ifTrue: FlowFacts, ifFalse: FlowFacts)`. |
| 118 | + That is, `Inferred` contains the paths that _must_ be non-null if the condition is true and, |
| 119 | + separately, the paths that must be non-null if the condition is false. |
| 120 | + |
| 121 | + e.g. for `x != null` we'd get `Inferred({x}, {})`, but only if `x` is stable. |
| 122 | + However, if we had `x == null` we'd get `Inferred({}, {x})`. |
| 123 | + |
| 124 | + 2. `inferWithinCond(cond: Tree): FlowFacts`: given a condition of the form `lhs && rhs` or |
| 125 | + `lhs || rhs`, calculate the paths that must be non-null for the rhs to execute (given |
| 126 | + that these operations are short-circuiting). |
| 127 | + |
| 128 | + 3. `inferWithinBlock(stat: Tree): FlowFacts`: if `stat` is a statement within a block, calculate |
| 129 | + which paths must be non-null when the statement that _follows_ `stat` in the block executes. |
| 130 | + This is so we can handle things like |
| 131 | + ```scala |
| 132 | + val x: String|Null = ??? |
| 133 | + if (x == null) return |
| 134 | + val y = x.length |
| 135 | + ``` |
| 136 | + Here, `inferWithinBlock(if (x == null) return)` gives back `{x}`, because we can tell that |
| 137 | + the next statement will execute only if `x` is non-null. |
| 138 | + |
| 139 | + 4. `refineType(tpe: Type): Type`: given a type, refine it if possible using flow-sensitive type |
| 140 | + information. This uses a `NonNullTermRef` (see below). |
| 141 | + |
| 142 | + - Each of the public APIs in `FlowTyper` is used to do flow typing in a different scenario |
| 143 | + (but all the use sites of `FlowTyper` are in `Typer.scala`): |
| 144 | + * `refineType` is used in `typedIdent` and `typedSelect` |
| 145 | + * `inferFromCond` is used for typing if statements |
| 146 | + * `inferWithinCond` is used when typing "applications" (which is how "&&" and "||" are encoded) |
| 147 | + * `inferWithinBlock` is used when typing blocks |
| 148 | + |
| 149 | + For example, to do flow typing on `if` expressions: |
| 150 | + * we type the condition |
| 151 | + * we give the typed condition to the FlowTyper and obtain a pair of sets of paths `(ifTrue, ifFalse)`. |
| 152 | + We type the `then` branch with the `ifTrue` facts, and the else branch with the `ifFalse` facts. |
| 153 | + * profit |
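The shape of `inferFromCond` can be sketched on a toy condition tree. `Inferred` mirrors the case class described above; the `Cond` enum, the use of plain strings for paths, and the rules for `&&`, `||` and `!` are assumptions derived from ordinary boolean reasoning, not a transcription of `FlowTyper.scala`.

```scala
type FlowFacts = Set[String]  // paths modelled as plain names (assumption)
final case class Inferred(ifTrue: FlowFacts, ifFalse: FlowFacts)

// Hypothetical condition tree standing in for a typed `Tree`.
enum Cond:
  case NotNull(path: String)   // x != null, with x assumed stable
  case IsNull(path: String)    // x == null, with x assumed stable
  case And(l: Cond, r: Cond)   // l && r
  case Or(l: Cond, r: Cond)    // l || r
  case Not(c: Cond)            // !c
  case Opaque                  // any condition we learn nothing from

import Cond.*

def inferFromCond(cond: Cond): Inferred = cond match
  case NotNull(p) => Inferred(Set(p), Set.empty)
  case IsNull(p)  => Inferred(Set.empty, Set(p))
  case And(l, r) =>
    val (a, b) = (inferFromCond(l), inferFromCond(r))
    // Both sides hold if true; if false, only facts common to both failures survive.
    Inferred(a.ifTrue ++ b.ifTrue, a.ifFalse & b.ifFalse)
  case Or(l, r) =>
    val (a, b) = (inferFromCond(l), inferFromCond(r))
    // Either side may hold if true; both sides failed if false.
    Inferred(a.ifTrue & b.ifTrue, a.ifFalse ++ b.ifFalse)
  case Not(c) =>
    val a = inferFromCond(c)
    Inferred(a.ifFalse, a.ifTrue)
  case Opaque => Inferred(Set.empty, Set.empty)
```

For `x != null && x.length < 10`, modelled as `And(NotNull("x"), Opaque)`, the `then` branch would be typed knowing that `x` is non-null, and the `else` branch would learn nothing.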
| 154 | + |
| 155 | +Flow typing also introduces two new abstractions: `NonNullTermRef` and `ValDefInBlockCompleter`. |
| 156 | + |
| 157 | +#### NonNullTermRef |
| 158 | +This is a new type of `TermRef` (path-dependent type) that, whenever its denotation is updated, makes sure |
| 159 | +that the underlying widened type is non-null. It's defined in `Types.scala`. A `NonNullTermRef` is identified by `computeDenot` whenever the denotation is updated, and then we call `stripNull` on the widened type. |
| 160 | + |
| 161 | +To use the flow-typing information, whenever we see a path that we know must be non-null (in `typedIdent` or |
| 162 | +`typedSelect`), we replace its `TermRef` by a `NonNullTermRef`. |
| 163 | + |
| 164 | +#### ValDefInBlockCompleter |
| 165 | +This is a new type of completer defined in `Namer.scala` that completes itself using the completion context, as opposed to the creation context. |
| 166 | + |
| 167 | +The problem we're trying to solve here is the following: |
| 168 | +```scala |
| 169 | +val x: String|Null = ??? |
| 170 | +if (x == null) return |
| 171 | +val y = x.length |
| 172 | +``` |
| 173 | +The block is usually typed as follows: |
| 174 | + 1. first, we scan the block to create symbols for the new definitions (`val x`, `val y`) |
| 175 | + 2. then, we type statement by statement |
| 176 | + 3. the completers for the symbols created in 1. are _all_ invoked in step 2. However, |
| 177 | + regular completers use the _creation_ context, so that means that `val y` is completed |
| 178 | + with a context that doesn't contain the new flow fact "x != null". |
| 179 | + |
| 180 | +To fix this, whenever we're inside a block and we create completers for `val`s, we use a |
| 181 | +`ValDefInBlockCompleter` instead of a regular completer. This new completer uses the completion context, |
| 182 | +which is aware of the new flow fact "x != null". |
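The difference between the two kinds of completer can be sketched as follows. All names here (`Ctx`, `Completer`, `RegularCompleter`, `InBlockCompleter`) are hypothetical simplifications; the real completers in `Namer.scala` do far more than pick a context.

```scala
// Toy context: just the set of paths known to be non-null.
final case class Ctx(nonNull: Set[String])

trait Completer:
  // Returns the context that would be used to type the definition.
  def complete(completionCtx: Ctx): Ctx

// Captures the context at creation time and ignores the one supplied later.
final class RegularCompleter(creationCtx: Ctx) extends Completer:
  def complete(completionCtx: Ctx): Ctx = creationCtx

// Uses whatever context is current when completion is forced.
final class InBlockCompleter extends Completer:
  def complete(completionCtx: Ctx): Ctx = completionCtx

val atScan      = Ctx(Set.empty)  // context when symbols are created (step 1)
val afterReturn = Ctx(Set("x"))   // context after `if (x == null) return`
```

In this model, `RegularCompleter(atScan).complete(afterReturn)` does not know that `x` is non-null, while `InBlockCompleter().complete(afterReturn)` does.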