|
| 1 | +--- |
| 2 | +layout: blog-detail |
| 3 | +post-type: blog |
| 4 | +by: Martin Odersky and Nicolas Stucki |
| 5 | +title: "Macros: the Plan for Scala 3" |
| 6 | +--- |
| 7 | + |
| 8 | +## Or: Scala in a (Tasty) Nutshell |
| 9 | + |
| 10 | +One of the biggest open questions for migrating to [Scala 3](https://www.scala-lang.org/blog/2018/04/19/scala-3.html) is what to |
| 11 | +do about macros. In this blog post we give our current thoughts. The |
| 12 | +gist is that we are trying to achieve full alignment between macros |
| 13 | +and Tasty. |
| 14 | + |
| 15 | +## What is Tasty? |
| 16 | + |
| 17 | +Tasty is the high-level interchange format for Scala 3. It is based on |
| 18 | +<i>t</i>yped <i>a</i>bstract <i>s</i>yntax <i>t</i>rees. These trees |
| 19 | +contain in a sense all the information present in a Scala |
| 20 | +program. They represent the syntactic structure of programs and also |
| 21 | +contain the complete information about types and positions. The Tasty |
| 22 | +"snapshot" of a code file is taken after type checking (so that all |
| 23 | +types are present and all implicits are elaborated) but before any |
| 24 | +transformations (so that no information is lost or changed). The file |
| 25 | +representation of these trees is heavily optimized for compactness, |
| 26 | +which means that we can generate full Tasty trees on every compiler |
| 27 | +run and rely on nothing else for supporting separate compilation. |
| 28 | + |
| 29 | +The information present in Tasty trees can be used for many purposes. |
| 30 | + |
| 31 | + - The compiler uses it to support separate compilation. |
| 32 | + - Our [LSP-based language server](http://dotty.epfl.ch/docs/usage/ide-support.html) uses it to support hyperlinking, command completion, documentation, |
| 33 | + and also for global operations such as find-references and renaming. |
| 34 | + - A build tool can use it to cross-build on different platforms and migrate code from one binary |
| 35 | + version to another. |
| 36 | + - Optimizers and analyzers can use it for deep code analysis and advanced code generation. |
| 37 | + |
| 38 | +Among these use cases, the first two work today. The other two are |
| 39 | +very interesting possibilities to pursue in the future. |
| 40 | + |
| 41 | +OK, but what is Tasty _exactly_? An up-to-date version of the Tasty |
| 42 | +file format is described in file |
| 43 | +[TastyFormat.scala](https://github.com/lampepfl/dotty/blob/master/compiler/src/dotty/tools/dotc/core/tasty/TastyFormat.scala) |
| 44 | +of the `dotc` compiler for Scala 3. |
| 45 | + |
| 46 | +## What Does Tasty Have to Do with Macros? |
| 47 | + |
| 48 | +It turns out that Tasty also makes an excellent foundation for a new |
| 49 | +generation of reflection-based macros, with the potential to solve |
| 50 | +many of the problems in the current version. |
| 51 | + |
| 52 | +The first problem with the current [Def Macros](https://docs.scala-lang.org/overviews/macros/overview.html) is that they |
| 53 | +are completely dependent on the current Scala compiler (internally |
| 54 | +named `nsc`). In fact, def macros are nothing but a thin |
| 55 | +veneer on top of `nsc` internals. This makes them very powerful but |
| 56 | +also fragile and hard to use. Because of this, they have had |
| 57 | +"experimental" status for their whole lifetime. Since Scala 3 uses a |
| 58 | +different compiler (`dotc`), the old reflect-based macro system cannot |
| 59 | +be ported to it, so we need something different, and hopefully better. |
| 60 | + |
| 61 | +Another criticism of the current macros is that they |
| 62 | +lack _foundations_. Scala 3 already has a meta |
| 63 | +programming facility with particularly well-explored foundations. [Principled Meta |
| 64 | +Programming](http://dotty.epfl.ch/docs/reference/principled-meta-programming.html) |
| 65 | +is a way to support _staging_ (in the sense of runtime code-generation) |
| 66 | +by adding just two operators to the |
| 67 | +language: quote (`'`) to represent code expressions, and splice (`~`) |
| 68 | +to insert one piece of code in another. The inspiration for our |
| 69 | +approach [comes from temporal logic](https://dl.acm.org/citation.cfm?id=3011069). A |
| 70 | +somewhat related system is used for staging in |
| 71 | +[MetaOCaml](http://okmij.org/ftp/ML/MetaOCaml.html). We obtain a very |
| 72 | +high-level _macro system_ by combining the two temporal operators `'` |
| 73 | +and `~` with Scala 3's `inline` feature. In a nutshell: |
| 74 | + |
| 75 | + - `inline` copies code from definition site to call site, |
| 76 | + - `(')` turns code into syntax trees, |
| 77 | + - `(~)` embeds syntax trees in other code. |
| 78 | + |
| 79 | +This approach to macros is very elegant, and has surprising expressive |
| 80 | +power. But it might be a little bit too principled. There are still |
| 81 | +many bread and butter tasks one cannot do with it. In particular: |
| 82 | + |
| 83 | + - Syntax trees are opaque: we are missing a way to decompose them and analyze their structure and contents. |
| 84 | + - We can only quote and splice expressions, but not other program structures such as definitions or parameters. |
| 85 | + |
| 86 | +We have long been looking for ways to augment principled meta |
| 87 | +programming with means to decompose and flexibly reconstruct trees. The |
| 88 | +main problem here is choice paralysis: there is basically an infinite |
| 89 | +number of ways to expose the underlying structure. Quasi-quotes or |
| 90 | +syntax trees? Which constructs should be exposed exactly? What are the |
| 91 | +auxiliary types and operations? |
| 92 | + |
| 93 | +If we make some choice here, how do we know that this will be the |
| 94 | +right choice for users today? How do we guarantee stability of the APIs |
| 95 | +in the future? This embarrassment of riches was essentially what |
| 96 | +plagued def macros. To solve this dilemma, we plan to go |
| 97 | +"bottom-up" instead of "top-down". We establish the following |
| 98 | +principle: |
| 99 | + |
| 100 | + _The reflective layer of macros will be isomorphic to Tasty._ |
| 101 | + |
| 102 | +This has several benefits: |
| 103 | + |
| 104 | + - **Completeness**. Tasty is Scala 3's interchange format, so basing the reflection API on it means no information is lost. |
| 105 | + - **Stability**. As an interchange format, Tasty will be kept stable. Its evolution will be carefully managed with a strict versioning system. So the reflection API can be evolved in a controlled way. |
| 106 | + - **Compiler Independence**. Tasty is designed to be independent of the actual Scala compilers supporting it. Besides the Dotty implementation there is now also a proof-of-concept system that shows that `scalac` can generate Tasty trees, and it is even conceivable to generate them from Java. This means that the reflection API can be easily ported to new compilers. If a compiler supports Tasty as the interchange format, it can be made to support the reflection API at the same time. |
| 107 | + |
| 108 | +## Scala in a Nutshell |
| 109 | + |
| 110 | +As a first step towards this goal, we are working on a representation |
| 111 | +of Tasty in terms of a suite of compiler-independent data |
| 112 | +structures. The [current |
| 113 | +status](https://github.com/lampepfl/dotty/blob/master/tests/pos/tasty/definitions.scala) |
| 114 | +gives high-level data structures for all aspects of a Tasty file. In |
| 115 | +about 200 lines of data definitions, it currently reflects every piece of |
| 116 | +information that is contained in a Scala program after type |
| 117 | +checking. 200 lines is larger than a definition of mini-Lisp, but |
| 118 | +much, much smaller than the 30,000 lines or so of a full-blown |
| 119 | +compiler frontend! |
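
To give a feel for what such data definitions look like, here is a minimal sketch of a typed tree for a toy expression language. All names (`ToyTasty`, `Term`, `IntLit`, `size`, and so on) are hypothetical and chosen for illustration; they are not taken from the linked `definitions.scala`, which covers the whole of Scala rather than four node kinds.

```scala
// Hypothetical, heavily simplified model of typed trees.
// It is NOT the actual Tasty definitions, only an illustration of the idea.
object ToyTasty {
  sealed trait Type
  case object IntType extends Type
  case object BooleanType extends Type

  // Every term carries its type, as it would after type checking.
  sealed trait Term { def tpe: Type }
  final case class IntLit(value: Int) extends Term { def tpe: Type = IntType }
  final case class BoolLit(value: Boolean) extends Term { def tpe: Type = BooleanType }
  final case class Plus(lhs: Term, rhs: Term) extends Term { def tpe: Type = IntType }
  final case class If(cond: Term, thenp: Term, elsep: Term) extends Term {
    def tpe: Type = thenp.tpe
  }

  // Because the trees are plain data, they can be analyzed by pattern matching.
  def size(t: Term): Int = t match {
    case IntLit(_) | BoolLit(_) => 1
    case Plus(l, r)             => 1 + size(l) + size(r)
    case If(c, a, b)            => 1 + size(c) + size(a) + size(b)
  }
}
```

The point of the sketch is only that a closed family of data types plus pattern matching already supports the decomposition and analysis that opaque quoted expressions do not.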
| 120 | + |
| 121 | +## Next Steps |
| 122 | + |
| 123 | +The next step, [currently under way](https://github.com/lampepfl/dotty/pull/4279), is to connect these definitions to the Tasty file format. We do this by rewriting them as |
| 124 | +[extractors](https://docs.scala-lang.org/tour/extractor-objects.html) |
| 125 | +that implement each data type in terms of the data structures used by |
| 126 | +the `dotc` compiler, which are then pickled and unpickled in the Tasty |
| 127 | +file format. An interesting alternative would be to write Tasty |
| 128 | +picklers and unpicklers that work directly with reflect trees. |
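
The extractor idea can be sketched as follows. `InternalTree` is a made-up stand-in for a compiler-internal tree, and `Literal`, `Ident`, `Apply`, and `describe` are hypothetical names; none of them are the actual `dotc` or Tasty reflection definitions. Client code sees only the extractors, so the internal representation could change without breaking it.

```scala
object ExtractorSketch {
  // Stand-in for a compiler-internal tree (hypothetical, for illustration only).
  final case class InternalTree(
    tag: String,
    name: String = "",
    value: Int = 0,
    children: List[InternalTree] = Nil
  )

  // Compiler-independent views, implemented on top of the internal trees.
  object Literal {
    def unapply(t: InternalTree): Option[Int] =
      if (t.tag == "LIT") Some(t.value) else None
  }
  object Ident {
    def unapply(t: InternalTree): Option[String] =
      if (t.tag == "IDENT") Some(t.name) else None
  }
  object Apply {
    def unapply(t: InternalTree): Option[(InternalTree, List[InternalTree])] =
      t match {
        case InternalTree("APPLY", _, _, fn :: args) => Some((fn, args))
        case _                                       => None
      }
  }

  // Client code pattern matches through the extractors only.
  def describe(t: InternalTree): String = t match {
    case Literal(v)            => s"literal $v"
    case Ident(n)              => s"ident $n"
    case Apply(Ident(f), args) => s"call to $f with ${args.length} argument(s)"
    case _                     => "other"
  }
}
```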
| 129 | + |
| 130 | +Once this is done, we need to define and implement semantic operations such as |
| 131 | + |
| 132 | + - what are the members that can be selected on this expression? |
| 133 | + - which subclasses are defined for a sealed trait? |
| 134 | + - does this expression conform to some expected type? |
| 135 | + |
| 136 | +Finally, we need to connect the new lower-level reflection layer to the existing |
| 137 | +principled macro system based on quotes and splices. This does not look very difficult. In essence, we |
| 138 | +need to define a pair of mappings between high-level trees of type |
| 139 | +`scala.quoted.Expr[T]` and lower-level Tasty trees of type |
| 140 | +`tasty.Term`. Mapping a high-level tree to a low-level one simply |
| 141 | +means exposing its structure. Mapping a low-level tree to a |
| 142 | +high-level tree of type `scala.quoted.Expr[T]` means checking that the |
| 143 | +low-level tree indeed has the given type `T`. That should be all. |
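
A toy model of this pair of mappings might look as follows. Everything here is an assumption made for illustration: `Expr`, `Term`, `TypeEvidence`, `toTasty`, and `toExpr` are local definitions in the sketch, not the real `scala.quoted.Expr` or `tasty.Term`, and the real type check would be a full conformance test rather than an equality test on two toy types.

```scala
object SealSketch {
  sealed trait Type
  case object IntType extends Type
  case object StringType extends Type

  // Low-level, untyped-at-the-Scala-level trees that record their own type.
  sealed trait Term { def tpe: Type }
  final case class IntLit(value: Int) extends Term { def tpe: Type = IntType }
  final case class StrLit(value: String) extends Term { def tpe: Type = StringType }

  // Evidence linking a Scala type T to its toy tree-level Type (hypothetical helper).
  final case class TypeEvidence[T](tpe: Type)
  object TypeEvidence {
    implicit val forInt: TypeEvidence[Int] = TypeEvidence(IntType)
    implicit val forString: TypeEvidence[String] = TypeEvidence(StringType)
  }

  // High-level typed expression; it can only be created through toExpr.
  final class Expr[T] private[SealSketch] (private[SealSketch] val term: Term)

  // High-level to low-level: simply expose the structure.
  def toTasty[T](e: Expr[T]): Term = e.term

  // Low-level to high-level: succeed only if the tree really has type T.
  def toExpr[T](t: Term)(implicit ev: TypeEvidence[T]): Option[Expr[T]] =
    if (t.tpe == ev.tpe) Some(new Expr[T](t)) else None
}
```

With these definitions, `toExpr[Int](IntLit(42))` yields a typed expression, while `toExpr[String](IntLit(42))` is rejected, which is the check the paragraph above describes.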
| 144 | + |
| 145 | +## Future Macros |
| 146 | + |
| 147 | +If this scheme is adopted, it determines to a large degree what Scala 3 macros will |
| 148 | +look like. Most importantly, they will run after the typechecking phase is |
| 149 | +finished because that is when Tasty trees are generated and |
| 150 | +consumed. Running macro-expansion after typechecking has many advantages: |
| 151 | + |
| 152 | + - it is safer and more robust, since everything is fully typed, |
| 153 | + - it does not affect IDEs, which only run the compiler until typechecking is done, |
| 154 | + - it offers more potential for incremental compilation and parallelization. |
| 155 | + |
| 156 | +But the scheme also restricts the kind of macros that can be expressed: |
| 157 | +macros will be [blackbox](https://docs.scala-lang.org/overviews/macros/blackbox-whitebox.html). |
| 158 | +This means that a macro expansion |
| 159 | +cannot influence the type of the expanded expression as seen from the |
| 160 | +typechecker. As long as that constraint is satisfied, we should be able |
| 161 | +to support both classical def macros and macro annotations. |
| 162 | + |
| 163 | +For instance, one will be able to define a macro annotation `@json` that adds a |
| 164 | +JSON serializer to a type. The difference with respect to |
| 165 | +today's [macro paradise](https://docs.scala-lang.org/overviews/macros/paradise.html) |
| 166 | +[annotation macros](https://docs.scala-lang.org/overviews/macros/annotations.html) |
| 167 | +(which are currently not part of the official Scala distribution) is that in Scala 3 |
| 168 | +the generated serializers can be seen only in downstream projects, because the expansion |
| 169 | +driven by the annotation happens after type checking. |
| 170 | + |
| 171 | +We believe the lack of whitebox macros can be alleviated to some degree by having |
| 172 | +more expressive forms of computed types. A sketch of such a system is outlined |
| 173 | +in [Dotty PR 3844](https://github.com/lampepfl/dotty/pull/3844). |
| 174 | + |
| 175 | +The Scala 3 language will also directly incorporate some constructs |
| 176 | +that so far required advanced macro code to define. In particular: |
| 177 | + |
| 178 | + - We model lazy implicits directly using |
| 179 | +   [by-name parameters](http://dotty.epfl.ch/docs/reference/implicit-by-name-parameters.html) instead of through a macro. |
| 180 | + |
| 181 | + - Native [type lambdas](http://dotty.epfl.ch/docs/reference/type-lambdas.html) reduce the need for [kind projector](https://github.com/non/kind-projector). |
| 182 | + |
| 183 | + - There will be a way to do typeclass derivation a la [Kittens](https://github.com/milessabin/kittens), [Magnolia](https://github.com/propensive/magnolia), or [scalaz-deriving](https://gitlab.com/fommil/scalaz-deriving) that does not need macros. We are currently evaluating the alternatives. The primary goal is to develop a scheme that is easy to use and that performs well at both compile- and run-time. A second goal is generality, as long as it does not conflict with the primary goal. |
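
As an illustration of the type lambda item above, here is a small sketch of a `Functor` instance for `Either` with its left type fixed to `String`, written without kind projector. The names `TypeLambdaSketch`, `Functor`, and `StringOrFunctor` are hypothetical. The sketch uses the `=>>` arrow of released Scala 3; the syntax in the linked reference page may differ from this.

```scala
object TypeLambdaSketch {
  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
  }

  // [X] =>> Either[String, X] is a type lambda: Either with its first
  // argument fixed, usable wherever a one-parameter type constructor is expected.
  object StringOrFunctor extends Functor[[X] =>> Either[String, X]] {
    def map[A, B](fa: Either[String, A])(f: A => B): Either[String, B] =
      fa.map(f)
  }
}
```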
| 184 | + |
| 185 | +## Please Give Us Your Feedback! |
| 186 | + |
| 187 | +What do you think of the macro roadmap? To discuss, there's a [thread](https://contributors.scala-lang.org/t/what-kinds-of-macros-should-scala-3-support/1850) on |
| 188 | +[Scala Contributors](https://contributors.scala-lang.org). Your feedback |
| 189 | +there will be very valuable. There is also lots of scope to shape the |
| 190 | +future by contributing to the development in the [Dotty](https://github.com/lampepfl/dotty) repo. |
| 191 | + |