Commit b0328fd

Document how mustachio's AOT compiler works (dart-lang#3886)
1 parent 4c82b49 commit b0328fd

2 files changed

+338
-12
lines changed

tool/mustachio/README.md

Lines changed: 320 additions & 4 deletions
@@ -760,8 +760,8 @@ featured post's `title`.
760760

761761
### Rendering a partial
762762

763-
Partials are allowed to reference themselves, so they must be implemented as new
764-
functions which can reference themselves. This template code:
763+
Partials are allowed to reference themselves, so they must be implemented as
764+
separate functions which can call themselves recursively. This template code:
765765

766766
```html
767767
{{ #posts }}{{ >post }}{{ /posts }}
@@ -803,6 +803,322 @@ separate parameter, so that they are easily accessed by name. `context1` is
803803
accessed in order to write the post's `title`, and `context0` is accessed in
804804
order to write the author's `name`.
805805

806-
### High level design for generating renderers
806+
### Compiler for generating renderers
807+
808+
The AOT compiler is a tool that builds render functions from Mustache templates.
809+
In order to understand the types of Mustache keys encountered in the templates,
810+
the compiler must also know the singular static context type that will be
811+
"rendered into" each template.
812+
813+
The AOT compiler only needs to be executed by a Dartdoc developer, when a
814+
template changes, or when any one of the types that may be rendered into a
815+
template changes, or when the compiler changes. The generated renderer functions
816+
are checked in as Dartdoc source code. In other words, the ahead-of-time
817+
compiled renderer functions only need to be compiled when making a change to
818+
Dartdoc. These renderer functions, on the other hand, need to run every single
819+
time Dartdoc runs, generating HTML documentation. Therefore we generally aim to
820+
remove complexity from the renderer functions, even at the cost of added
821+
complexity in the AOT compiler.
822+
823+
#### Basic example
824+
825+
As a basic example of how the compiler chooses what to write into a renderer
826+
function, see the code below. The `User` class is rendered into the `user.html`
827+
template, as specified in this `@Renderer` annotation:
828+
829+
```dart
830+
@Renderer(#renderUser, Context<User>(), 'user')
831+
```
832+
833+
```dart
834+
abstract class User {
835+
String get name;
836+
Post? get featuredPost;
837+
List<Post> get posts;
838+
}
839+
```
840+
841+
```html
842+
<h1>{{ name }}</h1>
843+
{{ #featuredPost }}{{ >post }}{{ /featuredPost }}
844+
```
845+
846+
The AOT compiler takes the parsed Mustache template, which contains a rendered
847+
variable (`{{ name }}`) and a section (`{{ #featuredPost }}...`).
848+
849+
The first step is to write the function name and parameters. The `@Renderer`
850+
annotation specifies that the public name for the renderer function is
851+
`renderUser`. As a top-level, public render function, there is only one context
852+
variable in the context stack, which is `User`. The only parameter therefore is
853+
`User context0`:
854+
855+
```dart
856+
String renderUser(User context0) {
857+
final buffer = StringBuffer();
858+
// ...
859+
return buffer.toString();
860+
}
861+
```
862+
863+
The compiler looks up the `name` property on `User`, finds that it exists, and
864+
returns a `String`, which is valid for a rendered variable. When generating the
865+
renderer, the compiler can just write to the function's `buffer`.
866+
867+
The compiler then looks up the `featuredPost` property on `User`, finds that it
868+
exists, and returns a nullable `Post`. This means the section is a "value"
869+
section; the compiler writes the renderer to only write to `buffer` if
870+
`context0.featuredPost` is non-`null`. If instead the compiler were to see that
871+
`featuredPost` were a `bool`-typed property, it would write the renderer to
872+
write the section content depending on whether the property is `true` or
873+
`false`. And finally if instead the compiler were to see that `featuredPost`
874+
were an `Iterable`-typed property, it would write the renderer to loop over the
875+
value of the property and write the section repeatedly.
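
The three section kinds described above can be sketched as hand-written Dart. This is only an illustration of the shape such a renderer might take, not the compiler's actual output: the `isAdmin` property and the inline section bodies are assumptions added so that each kind of section appears once, and HTML escaping is omitted.

```dart
// Hypothetical stand-ins for the types in the example above.
class Post {
  final String title;
  Post(this.title);
}

class User {
  final String name;
  final Post? featuredPost;
  final bool isAdmin; // Assumed property, only to show a bool section.
  final List<Post> posts;
  User(this.name,
      {this.featuredPost, this.isAdmin = false, this.posts = const []});
}

// Roughly what a renderer could look like for a template such as:
//   <h1>{{ name }}</h1>
//   {{ #featuredPost }}{{ title }}{{ /featuredPost }}
//   {{ #isAdmin }}(admin){{ /isAdmin }}
//   {{ #posts }}<li>{{ title }}</li>{{ /posts }}
String renderUser(User context0) {
  final buffer = StringBuffer();
  // Rendered variable: `name` resolves to a String on `User`.
  buffer.write('<h1>');
  buffer.write(context0.name);
  buffer.write('</h1>');
  // "Value" section: written only if the nullable property is non-null.
  final context1 = context0.featuredPost;
  if (context1 != null) {
    buffer.write(context1.title);
  }
  // bool section: written only if the property is true.
  if (context0.isAdmin) {
    buffer.write('(admin)');
  }
  // Iterable section: written once per element.
  for (final context2 in context0.posts) {
    buffer.write('<li>');
    buffer.write(context2.title);
    buffer.write('</li>');
  }
  return buffer.toString();
}
```

Because the kind of each section is decided at compile time from the static type, the renderer contains no runtime type checks.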
876+
877+
#### Partials
878+
879+
Most of the complexity in the AOT compiler is found in the handling of partials.
880+
The compiler attempts to generate a minimal amount of code for the renderer
881+
functions.
882+
883+
Each partial template is compiled into its own (private) renderer function,
884+
complete with a name, a list of parameters, and a body. They must be very
885+
flexible in order to satisfy a variety of legal situations allowed by the
886+
Mustache template system:
887+
888+
1. Just as with a top-level template, and as with a section, a partial has
889+
access to the entire context stack.
890+
891+
As a quick example, if a reference to a partial is at a point in a template with
892+
3 context variables, then the partial must also have access to those 3
893+
context variables; it will have 3 parameters (modulo the optimizations
894+
below).
895+
896+
2. A partial can reference itself. For this reason, partials are compiled into
897+
their own named functions.
898+
899+
3. A single partial can be referenced by multiple templates, and the context
900+
stacks of these templates may be completely different from each other.
901+
902+
For example, two templates may reference one partial, and one may have as the
903+
top context variable a `String`, while the other may have as the top context
904+
variable a `List<int>`. The partial may then contain a rendered variable for
905+
a property named `length`; this is all legal. Therefore, at the outset, it
906+
looks like each _reference_ to a partial, even the same partial, requires
907+
generating a separate renderer function. In this example, one partial
908+
renderer function will take a `String` parameter, and the other will take a
909+
`List<int>` parameter.
910+
911+
(In practice, while a given partial template may be referenced by multiple
912+
templates with different context stacks, the types of corresponding context
913+
variables will typically have LUB types that are more narrow than `Object`
914+
and that can be legally used as parameter types. This allows for
915+
deduplication, and is described below.)
916+
917+
4. A partial may be referenced multiple times from the same template. Again, the
918+
points at which these references occur may have differing context stacks.
919+
This is just another reason that each reference to a partial _may_ require
920+
generating a separate renderer function.
921+
922+
Because we may need to generate a partial function for each _reference_ to a
923+
partial template, they are uniquely named with their call stack. For example, if
924+
the `renderUser` function references the `_post` partial, then the generated
925+
renderer function for that partial is called `_renderUser_partial_post_0`. If it
926+
references that partial twice, the second renderer function is called
927+
`_renderUser_partial_post_1`. If one of these partials references the `_author`
928+
partial, the generated renderer function for that partial is called
929+
`_renderUser_partial_post_0_partial_author_0`. One can see how this can quickly
930+
get out-of-hand, and how this system can really benefit from some optimizations.
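
The naming scheme above can be reproduced with a small helper. This is a sketch that only mirrors the names quoted in the text; `partialRendererName` is a hypothetical function, not one from the compiler source.

```dart
/// Builds a name for the renderer compiled for the [index]-th reference to
/// the partial [partialName] from the renderer named [parentName].
///
/// Hypothetical helper: it only reproduces the naming pattern described in
/// the text.
String partialRendererName(String parentName, String partialName, int index) {
  // Partial renderers are private, so the name always starts with `_`.
  final prefix = parentName.startsWith('_') ? parentName : '_$parentName';
  // The `_post` partial appears as `post` in the generated names.
  final baseName =
      partialName.startsWith('_') ? partialName.substring(1) : partialName;
  return '${prefix}_partial_${baseName}_$index';
}
```

Calling `partialRendererName('renderUser', '_post', 1)` gives `_renderUser_partial_post_1`, and passing that result back in as the parent name for `_author` shows how each level of nesting lengthens the name.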
931+
932+
#### High level code walkthrough
933+
934+
The AOT compiler is found in `tool/mustachio/codegen_aot_compiler.dart`. The
935+
entrypoint into this code is the top-level `compileTemplatesToRenderers`
936+
function. This function takes a set of `RendererSpec`s (just the info derived
937+
from each `@Renderer` annotation) and returns a single String, the source text
938+
for a Dart library containing all of the compiled renderer functions.
939+
940+
The `compileTemplatesToRenderers` function is fairly simple; it walks over the
941+
`RendererSpec` objects, creating an `_AotCompiler` object for each. The
942+
`_AotCompiler._readAndParse` function takes a context type, a renderer name, a
943+
path to a template, and some extra data, parses the template, and returns an
944+
`_AotCompiler` instance. The `compileTemplatesToRenderers` function then takes
945+
that compiler instance, compiles the template into a renderer function (a String
946+
of Dart source code), and also collects a mapping of partial renderer functions
947+
that were compiled in the process. When the compiler instance compiles its given
948+
template into a renderer, it recursively creates a compiler instance for each
949+
referenced partial and compiles the referenced partial into a renderer function
950+
(see `_BlockCompiler._compilePartial`).
951+
952+
In this way, `compileTemplatesToRenderers` collects all of the compiler
953+
instances and the renderer function source code that has been compiled by each.
954+
Finally, it writes out all of the function source code to one giant
955+
StringBuffer; some import directives are prepended, and everything is ultimately
956+
written to a single file on disk.
957+
958+
We track the mapping of each compiler to the source code it compiled, in order
959+
to perform some optimizations before the final list of renderer functions is
960+
written to the StringBuffer. These are detailed below.
961+
962+
#### Used context stacks
963+
964+
The first optimization in Mustachio's partial renderer function generation is to
965+
strip out unused context stacks.
966+
967+
For example, take the following template and partial:
968+
969+
```html
970+
<!-- home template -->
971+
{{ #loggedInUser }}
972+
{{ #featuredPost }}
973+
{{ #authors }}{{ >author }}{{ /authors }}
974+
{{ /featuredPost }}
975+
{{ /loggedInUser }}
976+
977+
<!-- _author partial -->
978+
{{ name }}
979+
```
807980

808-
TODO(srawlins): Write.
981+
Let's say that some generic `HomePageData` object is rendered into this
982+
template; the `loggedInUser` property has a `User` type; `featuredPost` is a
983+
property on `User`, with a `Post` type; `authors` is a property on `Post` with a
984+
`List<User>` type. The `_author` partial template can legally access any property on
985+
the context stack: `User`, `Post`, `User`, `HomePageData`. As per the rules of
986+
Mustache, a renderer must first search the top context type, `User`, for a
987+
property named `name`, and if that is not found, continue down the context
988+
stack.
989+
990+
Without any further investigation, it looks like the renderer function for the
991+
`_author` partial will have 4 parameters, `User context0`, `Post context1`,
992+
`User context2`, and `HomePageData context3`. However, as we know the entire
993+
parsed contents of the partial, we can simplify the list of parameters down to
994+
the ones which are actually _used_.
995+
996+
(The attentive reader will note that right off the bat, if `name` is not found
997+
on the first context variable, a `User`-typed variable, then it's not going to
998+
be found on the third context variable, also a `User`, so we can immediately
999+
strip out the 3rd parameter; this behavior comes out of the broader optimization
1000+
as well.)
1001+
1002+
In order to reduce the `_author` renderer function's parameters down to the ones
1003+
which are used, we must walk the parsed partial and track the variables on the
1004+
context stack which are used in order to access a variable or a section key. In
1005+
this example where `name` is the only property accessed, and where `name` is a
1006+
property on `User`, we can reduce the number of parameters from 4 down to 1.
1007+
1008+
Note that the `_author` partial template may itself reference other templates.
1009+
If it refers to an `_avatar` partial, and a `_badges` partial, then each of
1010+
those partials _can also legally access_ any variable in the context stack. So
1011+
when walking the parsed `_author` partial, tracking the used variables, we must
1012+
take `_avatar` and `_badges` into account, walking those partials, etc.
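
The walk described above can be sketched with a toy model. Everything here is an assumption made for illustration: types are represented as sets of property names rather than analyzer types, a parsed template is reduced to the keys it renders and the partials it references, and sections that push new context variables are left out.

```dart
/// A context variable, simplified to a name, a type name, and the property
/// names its type declares. Hypothetical model, not the compiler's own.
class ContextVariable {
  final String name;
  final String type;
  final Set<String> properties;
  const ContextVariable(this.name, this.type, this.properties);
}

/// A toy parsed template: the keys it renders and the partials it references.
class ParsedTemplate {
  final List<String> keys;
  final List<String> partials;
  const ParsedTemplate({this.keys = const [], this.partials = const []});
}

/// Collects the variables in [stack] (top of the stack first) that are
/// needed to render [templateName], including any partials it references.
Set<ContextVariable> usedContextVariables(
  String templateName,
  Map<String, ParsedTemplate> templates,
  List<ContextVariable> stack, [
  Set<String>? visiting,
]) {
  visiting ??= <String>{};
  final used = <ContextVariable>{};
  // A partial may reference itself; do not walk the same template twice.
  if (!visiting.add(templateName)) return used;
  final template = templates[templateName]!;
  for (final key in template.keys) {
    // Mustache lookup: the first variable, from the top, declaring the key.
    for (final variable in stack) {
      if (variable.properties.contains(key)) {
        used.add(variable);
        break;
      }
    }
  }
  // Referenced partials see the same context stack, so walk them as well.
  for (final partial in template.partials) {
    used.addAll(usedContextVariables(partial, templates, stack, visiting));
  }
  return used;
}
```

With the four-variable stack from the example, an `_author` partial that only renders `name` yields just `context0`. If it also referenced an `_avatar` partial rendering a key found only on `HomePageData`, then `context3` would be kept too.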
1013+
1014+
In practice this can immensely simplify the generated renderers as the vast
1015+
majority of rendered variables and section keys are properties on the top-most
1016+
context variable. This means reducing the number of parameters that each
1017+
renderer function takes and reducing the number of arguments that each renderer
1018+
function needs to pass to partial calls.
1019+
1020+
In the `codegen_aot_compiler.dart` source, here are the steps that carry out
1021+
this optimization:
1022+
1023+
1. The `_AotCompiler._compileToRenderer` function creates a `_BlockCompiler` (a
1024+
class that compiles a single Mustache block into a String) with the current
1025+
context stack, in order to compile the Mustache block that is the top-level
1026+
unit of a template.
1027+
2. The `_BlockCompiler` compiles the block of Mustache into a series of Dart
1028+
statements (as source code), and tracks the referenced context variables in a
1029+
set, `_BlockCompiler._usedContextTypes`.
1030+
3. At this point we have the body of the renderer that we are creating, and its
1031+
name. We write the return type (`String`) and the name of the render
1032+
function, and then must write the list of parameters. Instead of writing the
1033+
list of _all_ of the context variables as parameters, we only write the
1034+
_used_ ones, collected up by the `_BlockCompiler` (and any nested
1035+
`_AotCompiler`s and `_BlockCompiler`s that were also created).
1036+
4. (Sometimes type parameters must also be added to the render functions, and
1037+
sometimes type arguments must also be added to the parameter types; this is
1038+
omitted here.)
1039+
5. After writing the parameters, we can write the body, and we're done.
1040+
1041+
Note that there is a shortcoming of this implementation in the names of the
1042+
parameters of a partial renderer function. A given `_BlockCompiler` has a
1043+
context stack and a template. The context stack is a list of "variable lookup"
1044+
objects, which each describe a context variable's type and name. So before the
1045+
block compiler knows what the used context variables are, the names of all
1046+
context variables are hard-coded. The block compiler then generates statements
1047+
for the body of the function, using those variable names. Because of this
1048+
implementation, some partial renderer functions are created with a seemingly
1049+
arbitrary list of parameter names. For a given partial, maybe the 1st and 3rd
1050+
parameters (`context0` and `context2`) in the context stack are unused, and so
1051+
the two parameters left that the function is written to accept are called
1052+
`context1` and `context3`.
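
The effect on parameter names can be seen in a small sketch of the signature-writing step. `VariableLookup` and `writeRendererFunction` here are hypothetical simplifications, not the classes in `codegen_aot_compiler.dart`; the point is only that names are fixed before the used set is known, so filtering leaves gaps.

```dart
/// A context variable's type and its pre-assigned name (hypothetical model).
class VariableLookup {
  final String type;
  final String name;
  const VariableLookup(this.type, this.name);
}

/// Writes a renderer function as source text, keeping only the parameters
/// whose names appear in [usedNames].
String writeRendererFunction(
  String name,
  List<VariableLookup> contextStack,
  Set<String> usedNames,
  String body,
) {
  final parameters = [
    for (final variable in contextStack)
      // Names were assigned up front, so unused entries leave gaps.
      if (usedNames.contains(variable.name))
        '${variable.type} ${variable.name}',
  ].join(', ');
  return 'String $name($parameters) {\n$body\n}';
}
```

For a stack of `context0` through `context3` where only `context1` and `context3` are used, the written signature takes `Post context1, HomePageData context3`, matching the situation described above.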
1053+
1054+
#### Deduplicating partials
1055+
1056+
The second optimization the AOT compiler makes is to deduplicate the partial
1057+
renderer functions. Generating an entire set of partial functions for every call
1058+
stack of each reference to each partial yields a lot of code. In most cases of
1059+
real Mustache templates, simplification is possible.
1060+
1061+
The idea is based on the Least Upper Bound (LUB) of Dart types. If we generate 3
1062+
renderer functions for a partial template, that each have a context stack with 2
1063+
context variables, we might be able to replace the 3 functions with a new
1064+
function that uses slightly different context stack types. In particular, it is
1065+
often the case that one template refers to a partial with type `A` as the
1066+
topmost context type, and that another template refers to the same partial with
1067+
type `B` as the topmost context type, and that `A` and `B` are closely related
1068+
(for example they share the same base class, which is not `Object`, or one is a
1069+
supertype of the other). So we can often get away with calculating the Least
1070+
Upper Bound of pairwise items in each context stack, creating a new context
1071+
stack. If the context stacks of our 3 renderer functions have types `T1, U1`,
1072+
`T2, U2`, and `T3, U3`, then we can create a new context stack with types
1073+
`LUB(T1, LUB(T2, T3)), LUB(U1, LUB(U2, U3))`. (Given an LUB function that can
1074+
take arbitrarily many types, this can be written `LUB(T1, ..., Tn)` for each
1075+
of `n` context types in the set of context stacks.)
1076+
1077+
Care must be taken however, as using an LUB type may escape beyond the static
1078+
type on which properties have been previously resolved. If the partial compiled
1079+
into the 3 renderer functions above refers to a property `foo`, and the LUB of
1080+
the individual types does not have any property `foo`, then the LUB type does
1081+
not work, and cannot be used. In practice though, this strategy allows us to
1082+
deduplicate many renderer functions for Dartdoc.
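
The position-wise LUB calculation can be illustrated with a toy type hierarchy. This sketch assumes single inheritance and uses made-up type names; Dart's real LUB algorithm also handles interfaces, mixins, and generics, and `toyLub` and `toyContextStackLub` are illustrative names only.

```dart
/// Toy single-inheritance hierarchy: each type name maps to its superclass.
/// Types missing from the map are treated as direct subtypes of `Object`.
const superclassOf = {
  'Kitten': 'Cat',
  'Cat': 'Animal',
  'Dog': 'Animal',
  'Animal': 'Object',
};

/// The type itself followed by each of its superclasses, nearest first.
List<String> supertypeChain(String type) {
  final chain = [type];
  var current = type;
  while (superclassOf.containsKey(current)) {
    current = superclassOf[current]!;
    chain.add(current);
  }
  return chain;
}

/// The nearest type found in the supertype chains of both [a] and [b].
String toyLub(String a, String b) {
  final supertypesOfB = supertypeChain(b).toSet();
  return supertypeChain(a)
      .firstWhere(supertypesOfB.contains, orElse: () => 'Object');
}

/// Position-wise LUB of several context stacks, or `null` if the stacks do
/// not all have the same length.
List<String>? toyContextStackLub(List<List<String>> stacks) {
  if (stacks.isEmpty) return null;
  final length = stacks.first.length;
  if (stacks.any((stack) => stack.length != length)) return null;
  return [
    for (var i = 0; i < length; i++)
      stacks.map((stack) => stack[i]).reduce(toyLub),
  ];
}
```

Folding the pairwise `toyLub` over each position is the `LUB(T1, ..., Tn)` form mentioned above. The result is only a candidate: the partial still has to compile against it, since a property such as `foo` may exist on each individual type but not on their LUB.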
1083+
1084+
In the `codegen_aot_compiler.dart` source, here are the steps that carry out
1085+
this optimization:
1086+
1087+
1. After gathering the list of all `_AotCompiler` instances that each compiled a
1088+
renderer function (as Dart source code), we enter `_deduplicateRenderers` to
1089+
deduplicate the list.
1090+
2. This function first creates a new mapping that maps each partial's path to
1091+
the list of compilers that each compiled that partial to a renderer function,
1092+
and walks each entry in the map.
1093+
1. For each partial path and relevant list of compilers, we create a list of
1094+
the "used context stacks"; so the first item in this list is the used
1095+
context stack calculated by the first compiler, etc.
1096+
2. We then calculate the LUB of the types in each position in the list, with
1097+
the `contextStackLub` function. For example, if a list of used context
1098+
stacks has 3 context stacks (derived from 3 compilers), and each context
1099+
stack has 2 context variables, then the result is a context stack, again
1100+
with 2 context variables, such that the first context variable is the LUB
1101+
of the first variable in each of the 3 original context stacks, and the
1102+
second context variable is the LUB of the second variable in each of the 3
1103+
original context stacks. (If the context stacks in the list do not all
1104+
have exactly the same length, we say the "LUB context stack" is `null`,
1105+
and we cannot deduplicate the renderer functions.)
1106+
3. If the context stacks have some valid LUB context stack, then we may be
1107+
able to replace each renderer function that was compiled for this partial
1108+
with a single renderer function that uses the LUB context stack. We
1109+
proceed by creating a new `_AotCompiler` and a fresh, "deduplicated"
1110+
renderer name.
1111+
4. We try to compile the partial with the new deduplicated compiler. It is
1112+
possible that this fails: if the partial depended on properties that were
1113+
available on the individual context stacks, but are unavailable on the LUB
1114+
context stack, then compilation will fail. In this case, we can just keep
1115+
the individual renderer functions.
1116+
5. If the new deduplicated compiler successfully compiles a renderer
1117+
function, we move forward with it: for each replaced compiler, we replace
1118+
its renderer function with a "redirecting" renderer function, that simply
1119+
redirects to a call to the deduplicated renderer function.
1120+
6. In order to reduce the amount of generated code, we can also _remove_ any
1121+
partial renderer functions that were only referenced by _replaced_
1122+
partial renderer functions. This is calculated recursively.
1123+
3. Finally, the new mapping of compilers to compiled renderer functions is
1124+
passed back to the `compileTemplatesToRenderers` to be written out.
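
The outcome of a successful deduplication might look like the following hand-written sketch. The types and the function names, including the name of the deduplicated renderer, are assumptions for illustration and are not taken from the generated Dartdoc code.

```dart
// Hypothetical types: `Cat` and `Dog` share the supertype `Animal`, which
// declares the `name` property that the partial renders.
abstract class Animal {
  String get name;
}

class Cat implements Animal {
  @override
  final String name;
  Cat(this.name);
}

class Dog implements Animal {
  @override
  final String name;
  Dog(this.name);
}

// The single deduplicated renderer, compiled against the LUB context stack.
String _deduplicated_partial_name(Animal context0) {
  final buffer = StringBuffer();
  buffer.write(context0.name);
  return buffer.toString();
}

// Each replaced renderer keeps its name and parameters, so existing call
// sites are unchanged, but now redirects to the deduplicated renderer.
String _renderCat_partial_name_0(Cat context0) =>
    _deduplicated_partial_name(context0);

String _renderDog_partial_name_0(Dog context0) =>
    _deduplicated_partial_name(context0);
```

Had `name` been declared separately on `Cat` and `Dog` but not on `Animal`, compiling against the LUB stack would fail and the two original renderers would be kept as they were.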
