| 1 | +(The following is work in progress) |
| 2 | + |
| 3 | +## Symbols and SymDenotations |
| 4 | + |
| 5 | + - why symbols are not enough: their contents change all the time |
| 6 | + - they change themselvesSo a `Symbol |
| 7 | + |
| 8 | + - reference: string + sig |
| 9 | + |
| 10 | + |
| 11 | +`dotc` is different from most other compilers in that it is centered around the idea of
| 12 | +maintaining views of various artifacts associated with code. These views are indexed |
| 13 | +by tne |
| 14 | + |
| 15 | +A symbol refers to a definition in a source program. Traditionally, |
| 16 | +compilers store context-dependent data in a _symbol table_. The
| 17 | +symbol then is the central reference to address context-dependent
| 18 | +data. But for `dotc`'s requirements it turns out that symbols are
| 19 | +both too little and too much for this task.
| 20 | + |
| 21 | +Too little: The attributes of a symbol depend on the phase. Examples: |
| 22 | +Types are gradually simplified by several phases. Owners are changed |
| 23 | +in phases `LambdaLift` (when methods are lifted out to an enclosing |
| 24 | +class) and `Flatten` (when all classes are moved to top level). Names
| 25 | +are changed when private members need to be accessed from outside |
| 26 | +their class (for instance from a nested class or a class implementing |
| 27 | +a trait). So in a functional compiler, a `Symbol` by itself does not mean
| 28 | +much. Instead we are more interested in the attributes of a symbol at |
| 29 | +a given phase. |
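
To make "the attributes of a symbol at a given phase" concrete, here is a minimal sketch in Scala. It is a toy model and not `dotc`'s actual data structures: `Phase`, `Attributes`, `Sym`, `AttributeTable`, the phase ids and the example owners are all hypothetical names chosen for illustration. The only idea taken from the text is that a symbol is an identity and its name, owner and type are looked up per phase.

```scala
// Toy model, not dotc's actual classes: every name here is hypothetical.
object PhaseViewSketch {
  // A phase is identified by its position in the compilation pipeline.
  final case class Phase(id: Int, name: String)

  // The attributes a symbol has from some phase onwards.
  final case class Attributes(name: String, owner: String, info: String)

  // A symbol is only an identity; it carries no attributes of its own.
  final class Sym(val id: Int)

  // Per symbol, a history of (first phase id where valid, attributes).
  final class AttributeTable {
    private var history = Map.empty[Int, Vector[(Int, Attributes)]]

    def record(sym: Sym, from: Phase, attrs: Attributes): Unit = {
      val entries = history.getOrElse(sym.id, Vector.empty)
      history = history.updated(sym.id, (entries :+ (from.id -> attrs)).sortBy(_._1))
    }

    // The view at `phase` is the latest entry recorded at or before it.
    def at(sym: Sym, phase: Phase): Option[Attributes] =
      history.getOrElse(sym.id, Vector.empty)
        .takeWhile(_._1 <= phase.id)
        .lastOption
        .map(_._2)
  }

  // A local method whose owner changes when it is lifted out to a class.
  def demo(): (Option[Attributes], Option[Attributes]) = {
    val earlier    = Phase(1, "earlier")     // illustrative phase
    val lambdaLift = Phase(2, "lambdaLift")  // illustrative id for LambdaLift
    val helper = new Sym(0)
    val table  = new AttributeTable
    table.record(helper, earlier, Attributes("helper", "method outer", "Int => Int"))
    table.record(helper, lambdaLift, Attributes("helper", "class C", "Int => Int"))
    (table.at(helper, earlier), table.at(helper, lambdaLift))
  }
}
```

In this sketch, asking for the same `Sym` at the two phases yields two different owners, which is the sense in which the symbol alone does not determine its attributes.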
| 30 | + |
| 31 | +`dotc` has a concept for "attributes of a symbol at |
| 32 | + |
| 33 | +Too much: If a symbol is used to refer to a definition in another |
| 34 | +compilation unit, we get problems for incremental recompilation. The |
| 35 | +unit containing the symbol might be changed and recompiled, which |
| 36 | +might mean that the definition referred to by the symbol is deleted or |
| 37 | +changed. This leads to the problem of stale symbols that refer to |
| 38 | +definitions that no longer exist in this form. `scalac` tried to |
| 39 | +address this problem by _rebinding_ symbols appearing in certain
| 40 | +cross-module references, but it turned out to be too difficult to do this
| 41 | +reliably for all kinds of references. `dotc` attacks the problem at |
| 42 | +the root instead. The fundamental problem is that symbols are too |
| 43 | +specific to serve as a cross-module reference in a system with |
| 44 | +incremental compilation. They refer to a particular definition, but |
| 45 | +that definition may not persist unchanged after an edit. |
| 46 | + |
| 47 | +`dotc` takes a different approach instead: a cross-module reference is
| 48 | +always a type, either a `TermRef` or a `TypeRef`. A reference type contains
| 49 | +a prefix type and a name. The definition the type refers to is established |
| 50 | +dynamically based on these fields. |
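
The idea of a reference that stores only a prefix and a name, and finds its definition on demand, can be sketched as follows. This is a simplified illustration under stated assumptions, not `dotc`'s `TermRef` or `TypeRef`: `NamedRef`, `Prefix`, `Definition` and `Definitions` are hypothetical names, prefix types are reduced to a path of strings, and recompilation is modelled as replacing all definitions under a prefix.

```scala
// Toy model, not dotc's actual TermRef/TypeRef: all names are hypothetical.
object NamedRefSketch {
  // A definition as it exists after one particular compilation run.
  final case class Definition(name: String, signature: String, run: Int)

  // Prefix types are simplified to a path of package/class names.
  final case class Prefix(path: List[String])

  // A reference stores only a prefix and a name, never the definition itself.
  final case class NamedRef(prefix: Prefix, name: String)

  // The definitions currently known to the compiler.
  final class Definitions(private var members: Map[(Prefix, String), Definition]) {
    // Recompiling a unit drops its old definitions and enters the new ones.
    def recompile(prefix: Prefix, defs: List[Definition]): Unit = {
      members = members.filter { case ((p, _), _) => p != prefix } ++
        defs.map(d => (prefix, d.name) -> d)
    }

    // Resolution happens on demand, from the prefix and name alone.
    def resolve(ref: NamedRef): Option[Definition] =
      members.get((ref.prefix, ref.name))
  }

  def demo(): (Option[Definition], Option[Definition], Option[Definition]) = {
    val util = Prefix(List("lib", "Util"))
    val defs = new Definitions(Map.empty)
    defs.recompile(util, List(
      Definition("f", "(Int): Int", run = 1),
      Definition("g", "(): Unit", run = 1)))
    val refF = NamedRef(util, "f")
    val refG = NamedRef(util, "g")
    val first = defs.resolve(refF)
    // The unit is edited: `f` changes its signature and `g` is deleted.
    defs.recompile(util, List(Definition("f", "(Long): Long", run = 2)))
    (first, defs.resolve(refF), defs.resolve(refG))
  }
}
```

Because `refF` and `refG` never hold on to a definition, nothing becomes stale after the edit: `refF` resolves to the new `f`, and `refG` simply fails to resolve.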
| 51 | + |
| 52 | + |
| 53 | +a system where sources can be recompiled at any instant,
| 54 | + |
| 55 | + the concept of a `Denotation`. |
| 56 | + |
| 57 | + Since definitions are transformed by phases, |
| 58 | + |
| 59 | + |
| 60 | +The [Dotty project](https://github.com/lampepfl/dotty) |
| 61 | +is a platform to develop new technology for Scala |
| 62 | +tooling and to try out concepts of future Scala language versions. |
| 63 | +Its compiler is a new design intended to reflect the |
| 64 | +lessons we learned from work with the Scala compiler. A clean redesign |
| 65 | +today will let us iterate faster with new ideas in the future. |
| 66 | + |
| 67 | +Today we reached an important milestone: The Dotty compiler can |
| 68 | +compile itself, and the compiled compiler can act as a drop-in replacement for the
| 69 | +original one. This is what one calls a *bootstrap*.
| 70 | + |
| 71 | +## Why is this important? |
| 72 | + |
| 73 | +The main reason is that this gives us some validation of the
| 74 | +*trustworthiness* of the compiler itself. Compilers are complex beasts, |
| 75 | +and many things can go wrong. By far the worst things that can go |
| 76 | +wrong are bugs where incorrect code is produced. It's not fun debugging code that looks perfectly |
| 77 | +fine, yet gets translated to something subtly wrong by the compiler. |
| 78 | + |
| 79 | +Having the compiler compile itself is a good test to demonstrate that |
| 80 | +the generated code has reached a certain level of quality. Not only is |
| 81 | +a compiler a large program (44k lines in the case of Dotty), it is
| 82 | +also one that exercises a large part of the language in quite |
| 83 | +intricate ways. Moreover, bugs in the code of a compiler don't tend to |
| 84 | +go unnoticed, precisely because every part of a compiler feeds into |
| 85 | +other parts and all together are necessary to produce a correct |
| 86 | +translation. |
| 87 | + |
| 88 | +## Are We Done Yet? |
| 89 | + |
| 90 | +Far from it! The compiler is still very rough. A lot more work is |
| 91 | +needed to |
| 92 | + |
| 93 | + - make it more robust, in particular when analyzing incorrect programs, |
| 94 | + - improve error messages and warnings, |
| 95 | + - improve the efficiency of some of the generated code, |
| 96 | + - embed it in external tools such as sbt, REPL, IDEs, |
| 97 | + - remove restrictions on what Scala code can be compiled, |
| 98 | + - help in migrating Scala code that will have to be changed. |
| 99 | + |
| 100 | +## What Are the Next Steps? |
| 101 | + |
| 102 | +Over the coming weeks and months, we plan to work on the following topics: |
| 103 | + |
| 104 | + - Make snapshot releases. |
| 105 | + - Get the Scala standard library to compile. |
| 106 | + - Work on sbt integration of the compiler.
| 107 | + - Work on IDE support. |
| 108 | + - Investigate the best way to obtain a REPL.
| 109 | + - Work on the build infrastructure. |
| 110 | + |
| 111 | +If you want to get your hands dirty with any of this, now is a good moment to get involved! |
| 112 | +To get started: <https://github.com/lampepfl/dotty>. |
| 113 | + |