Commit 7f2e2dc

Document how mustachio's AOT compiler works
1 parent 526dbd5 commit 7f2e2dc

2 files changed

+338
-12
lines changed

tool/mustachio/README.md

Lines changed: 320 additions & 4 deletions
@@ -761,8 +761,8 @@ featured post's `title`.
761761

762762
### Rendering a partial
763763

764-
Partials are allowed to reference themselves, so they must be implemented as new
765-
functions which can reference themselves. This template code:
764+
Partials are allowed to reference themselves, so they must be implemented as
765+
separate functions which can call themselves recursively. This template code:
766766

767767
```html
768768
{{ #posts }}{{ >post }}{{ /posts }}
@@ -804,6 +804,322 @@ separate parameter, so that they are easily accessed by name. `context1` is
804804
accessed in order to write the post's `title`, and `context0` is accessed in
805805
order to write the author's `name`.
806806

807-
### High level design for generating renderers
807+
### Compiler for generating renderers
808+
809+
The AOT compiler is a tool that builds render functions from Mustache templates.
810+
In order to understand the types of Mustache keys encountered in the templates,
811+
the compiler must also know the singular static context type that will be
812+
"rendered into" each template.
813+
814+
The AOT compiler only needs to be executed by a Dartdoc developer, when a
815+
template changes, or when any one of the types that may be rendered into a
816+
template changes, or when the compiler changes. The generated renderer functions
817+
are checked in as Dartdoc source code. In other words, the ahead-of-time
818+
compiled renderer functions only need to be compiled when making a change to
819+
Dartdoc. These renderer functions, on the other hand, need to run every single
820+
time Dartdoc runs, generating HTML documentation. Therefore we generally aim to
821+
remove complexity from the renderer functions, even at the cost of added
822+
complexity in the AOT compiler.
823+
824+
#### Basic example
825+
826+
As a basic example of how the compiler chooses what to write into a renderer
827+
function, see the code below. The User class is rendered into the `user.html`
828+
template, as specified in this `@Renderer` annotation:
829+
830+
```dart
831+
@Renderer(#renderUser, Context<User>(), 'user')
832+
```
833+
834+
```dart
835+
abstract class User {
836+
String get name;
837+
Post? get featuredPost;
838+
List<Post> get posts;
839+
}
840+
```
841+
842+
```html
843+
<h1>{{ name }}</h1>
844+
{{ #featuredPost }}{{ >post }}{{ /featuredPost }}
845+
```
846+
847+
The AOT compiler takes the parsed Mustache template, which contains a rendered
848+
variable (`{{ name }}`) and a section (`{{ #featuredPost }}...`).
849+
850+
The first step is to write the function name and parameters. The `@Renderer`
851+
annotation specifies that the public name for the renderer function is
852+
`renderUser`. As a top-level, public render function, there is only one context
853+
variable in the context stack, which is `User`. The only parameter therefore is
854+
`User context0`:
855+
856+
```dart
857+
String renderUser(User context0) {
858+
final buffer = StringBuffer();
859+
// ...
860+
return buffer.toString();
861+
}
862+
```
863+
864+
The compiler looks up the `name` property on `User`, finds that it exists, and
865+
returns a `String`, which is valid for a rendered variable. When generating the
866+
renderer, the compiler can just write to the function's `buffer`.
867+
868+
The compiler then looks up the `featuredPost` property on `User`, finds that it
869+
exists, and returns a nullable `Post`. This means the section is a "value"
870+
section; the compiler writes the renderer to only write to `buffer` if
871+
`context0.featuredPost` is non-`null`. If instead the compiler were to see that
872+
`featuredPost` were a `bool`-typed property, it would write the renderer to
873+
write the section content depending on whether the property is `true` or
874+
`false`. And finally if instead the compiler were to see that `featuredPost`
875+
were an `Iterable`-typed property, it would write the renderer to loop over the
876+
value of the property and write the section repeatedly.
877+
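As a rough illustration of the choices described above, the finished renderer might look something like the following. This is a hand-written sketch, not actual compiler output: the concrete `User` and `Post` classes, the body of the `_renderUser_partial_post_0` partial, the `renderUserPosts` variant, and the use of `htmlEscape` for escaping are all assumptions made for the example.

```dart
import 'dart:convert' show htmlEscape;

// Hypothetical concrete stand-ins for the types in the example above.
class Post {
  final String title;
  Post(this.title);
}

class User {
  final String name;
  final Post? featuredPost;
  final List<Post> posts;
  User(this.name, {this.featuredPost, this.posts = const []});
}

// Hypothetical partial renderer; only its call site matters here.
String _renderUser_partial_post_0(Post context1) =>
    '<h2>${htmlEscape.convert(context1.title)}</h2>';

String renderUser(User context0) {
  final buffer = StringBuffer();
  buffer.write('<h1>');
  // `name` is a String, so it is written straight to the buffer
  // (HTML-escaped here; the escaping call is an assumption).
  buffer.write(htmlEscape.convert(context0.name));
  buffer.write('</h1>\n');
  // `featuredPost` is a nullable Post: a "value" section, guarded by a
  // null check.
  final context1 = context0.featuredPost;
  if (context1 != null) {
    buffer.write(_renderUser_partial_post_0(context1));
  }
  return buffer.toString();
}

// What a section looks like when its key is Iterable-typed, as `posts` is:
// the section body is written once per element.
String renderUserPosts(User context0) {
  final buffer = StringBuffer();
  for (final context1 in context0.posts) {
    buffer.write(_renderUser_partial_post_0(context1));
  }
  return buffer.toString();
}

void main() {
  final user =
      User('Ada', featuredPost: Post('Hello'), posts: [Post('A'), Post('B')]);
  print(renderUser(user));
  print(renderUserPosts(user));
}
```

A `bool`-typed key would follow the same shape as the null check, with `if (context0.someFlag)` in place of the null test.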
878+
#### Partials
879+
880+
Most of the complexity in the AOT compiler is found in the handling of partials.
881+
The compiler attempts to generate a minimal amount of code for the renderer
882+
functions.
883+
884+
Each partial template is compiled into its own (private) renderer function,
885+
complete with a name, a list of parameters, and a body. They must be very
886+
flexible in order to satisfy a variety of legal situations allowed by the
887+
Mustache template system:
888+
889+
1. Just as with a top-level template, and as with a section, a partial has
890+
access to the entire context stack.
891+
892+
As a quick example, if a reference to a partial is at a point in a template with
893+
3 context variables, then the partial must also have access to those 3
894+
context variables; it will have 3 parameters (modulo the optimizations
895+
below).
896+
897+
2. A partial can reference itself. For this reason, partials are compiled into
898+
their own named functions.
899+
900+
3. A single partial can be referenced by multiple templates, and the context
901+
stacks of these templates may be completely different from each other.
902+
903+
For example two templates may reference one partial, and one may have as the
904+
top context variable a `String`, while the other may have as the top context
905+
variable a `List<int>`. The partial may then contain a rendered variable for
906+
a property named `length`; this is all legal. Therefore, at the outset, it
907+
looks like each _reference_ to a partial, even the same partial, requires
908+
generating a separate renderer function. In this example, one partial
909+
renderer function will take a `String` parameter, and the other will take a
910+
`List<int>` parameter.
911+
912+
(In practice, while a given partial template may be referenced by multiple
913+
templates with different context stacks, the types of corresponding context
914+
variables will typically have LUB types that are more narrow than `Object`
915+
and that can be legally used as parameter types. This allows for
916+
deduplication, and is described below.)
917+
918+
4. A partial may be referenced multiple times from the same template. Again, the
919+
points at which these references occur may have differing context stacks.
920+
This is just another reason that each reference to a partial _may_ require
921+
generating a separate renderer function.
922+
923+
Because we may need to generate a partial function for each _reference_ to a
924+
partial template, they are uniquely named with their call stack. For example, if
925+
the `renderUser` function references the `_post` partial, then the generated
926+
renderer function for that partial is called `_renderUser_partial_post_0`. If it
927+
references that partial twice, the second renderer function is called
928+
`_renderUser_partial_post_1`. If one of these partials references the `_author`
929+
partial, the generated renderer function for that partial is called
930+
`_renderUser_partial_post_0_partial_author_0`. One can see how this can quickly
931+
get out of hand, and how this system can really benefit from some optimizations.
932+
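The naming scheme can be sketched as a small helper. The function `partialRendererName` and its parameters are hypothetical, written only to reproduce the names quoted above; the compiler's own naming code may be organized differently.

```dart
/// Builds the name of a partial renderer from the name of the function that
/// references it. [referenceIndex] counts references to the same partial
/// from that function, starting at 0. (Hypothetical helper.)
String partialRendererName(
    String callerName, String partialName, int referenceIndex) {
  // A public caller such as `renderUser` gains a leading underscore; a caller
  // that is already a private partial renderer is used as-is.
  final prefix = callerName.startsWith('_') ? callerName : '_$callerName';
  // Partial names such as `_post` lose their leading underscore.
  final bareName =
      partialName.startsWith('_') ? partialName.substring(1) : partialName;
  return '${prefix}_partial_${bareName}_$referenceIndex';
}

void main() {
  final first = partialRendererName('renderUser', '_post', 0);
  print(first); // _renderUser_partial_post_0
  print(partialRendererName('renderUser', '_post', 1));
  // A partial referenced from within another partial extends the chain.
  print(partialRendererName(first, '_author', 0));
}
```

Each level of nesting appends another `_partial_<name>_<index>` segment, which is why deeply nested partials produce long names and many near-identical functions.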
933+
#### High level code walkthrough
934+
935+
The AOT compiler is found in `tool/mustachio/codegen_aot_compiler.dart`. The
936+
entrypoint into this code is the top-level `compileTemplatesToRenderers`
937+
function. This function takes a set of `RendererSpec`s (just the info derived
938+
from each `@Renderer` annotation) and returns a single String, the source text
939+
for a Dart library containing all of the compiled renderer functions.
940+
941+
The `compileTemplatesToRenderers` function is fairly simple; it walks over the
942+
`RendererSpec` objects, creating an `_AotCompiler` object for each. The
943+
`_AotCompiler._readAndParse` function takes a context type, a renderer name, a
944+
path to a template, and some extra data, parses the template, and returns an
945+
`_AotCompiler` instance. The `compileTemplatesToRenderers` function then takes
946+
that compiler instance, compiles the template into a renderer function (a String
947+
of Dart source code), and also collects a mapping of partial renderer functions
948+
that were compiled in the process. When the compiler instance compiles its given
949+
template into a renderer, it recursively creates a compiler instance for each
950+
referenced partial and compiles the referenced partial into a renderer function
951+
(see `_BlockCompiler._compilePartial`).
952+
953+
In this way, `compileTemplatesToRenderers` collects all of the compiler
954+
instances and the renderer function source code that has been compiled by each.
955+
Finally, it writes out all of the function source code to one giant
956+
StringBuffer; some import directives are prepended, and everything is ultimately
957+
written to a single file on disk.
958+
959+
We track the mapping of each compiler to the source code it compiled, in order
960+
to perform some optimizations before the final list of renderer functions is
961+
written to the StringBuffer. These are detailed below.
962+
963+
#### Used context stacks
964+
965+
The first optimization in Mustachio's partial renderer function generation is to
966+
strip out unused context stacks.
967+
968+
For example, take the following template and partial:
969+
970+
```html
971+
<!-- home template -->
972+
{{ #loggedInUser }}
973+
{{ #featuredPost }}
974+
{{ #authors }}{{ >author }}{{ /authors }}
975+
{{ /featuredPost }}
976+
{{ /loggedInUser }}
977+
978+
<!-- _author partial -->
979+
{{ name }}
980+
```
808981

809-
TODO(srawlins): Write.
982+
Let's say that some generic `HomePageData` object is rendered into this
983+
template; the `loggedInUser` property has a `User` type; `featuredPost` is a
984+
property on `User`, with a `Post` type; `authors` is a property on `Post` with a
985+
`List<User>` type. The `_author` partial template can legally access any property on
986+
the context stack: `User`, `Post`, `User`, `HomePageData`. As per the rules of
987+
Mustache, a renderer must first search the top context type, `User`, for a
988+
property named `name`, and if that is not found, continue down the context
989+
stack.
990+
991+
Without any further investigation, it looks like the renderer function for the
992+
`_author` partial will have 4 parameters, `User context0`, `Post context1`,
993+
`User context2`, and `HomePageData context3`. However, as we know the entire
994+
parsed contents of the partial, we can simplify the list of parameters down to
995+
the ones which are actually _used_.
996+
997+
(The attentive reader will note that right off the bat, if `name` is not found
998+
on the first context variable, a `User`-typed variable, then it's not going to
999+
be found on the third context variable, also a `User`, so we can immediately
1000+
strip out the 3rd parameter; this behavior comes out of the broader optimization
1001+
as well.)
1002+
1003+
In order to reduce the `_author` renderer function's parameters down to the ones
1004+
which are used, we must walk the parsed partial and track the variables on the
1005+
context stack which are used in order to access a variable or a section key. In
1006+
this example where `name` is the only property accessed, and where `name` is a
1007+
property on `User`, we can reduce the number of parameters from 4 down to 1.
1008+
1009+
Note that the `_author` partial template may itself reference other templates.
1010+
If it refers to an `_avatar` partial, and a `_badges` partial, then each of
1011+
those partials _can also legally access_ any variable in the context stack. So
1012+
when walking the parsed `_author` partial, tracking the used variables, we must
1013+
take `_avatar` and `_badges` into account, walking those partials, etc.
1014+
1015+
In practice this can immensely simplify the generated renderers as the vast
1016+
majority of rendered variables and section keys are properties on the top-most
1017+
context variable. This means reducing the number of parameters that each
1018+
renderer function takes and reducing the number of arguments that each renderer
1019+
function needs to pass to partial calls.
1020+
1021+
In the `codegen_aot_compiler.dart` source, here are the steps that carry out
1022+
this optimization:
1023+
1024+
1. The `_AotCompiler._compileToRenderer` function creates a `_BlockCompiler` (a
1025+
class that compiles a single Mustache block into a String) with the current
1026+
context stack, in order to compile the Mustache block that is the top-level
1027+
unit of a template.
1028+
2. The `_BlockCompiler` compiles the block of Mustache into a series of Dart
1029+
statements (as source code), and tracks the referenced context variables in a
1030+
set, `_BlockCompiler._usedContextTypes`.
1031+
3. At this point we have the body of the renderer that we are creating, and its
1032+
name. We write the return type (`String`) and the name of the render
1033+
function, and then must write the list of parameters. Instead of writing the
1034+
list of _all_ of the context variables as parameters, we only write the
1035+
_used_ ones, collected up by the `_BlockCompiler` (and any nested
1036+
`_AotCompiler`s and `_BlockCompiler`s that were also created).
1037+
4. (Sometimes type parameters must also be added to the render functions, and
1038+
sometimes type arguments must also be added to the parameter types; this is
1039+
omitted here.)
1040+
5. After writing the parameters, we can write the body, and we're done.
1041+
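The core of these steps, resolving each key against the context stack and keeping only the variables that were hit, can be sketched as follows. The `ContextVariable` class, the two helper functions, and the property sets are simplified, hypothetical stand-ins; the real compiler works with analyzer types and a parsed template rather than strings.

```dart
/// A simplified "variable lookup": a context variable's name and type, plus
/// the property names available on that type. (Hypothetical model.)
class ContextVariable {
  final String name;
  final String type;
  final Set<String> properties;
  const ContextVariable(this.name, this.type, this.properties);
}

/// Returns the variables in [stack] (ordered top-most first) that are needed
/// to resolve [keys]. Keys used by nested partials should be included in
/// [keys] as well.
List<ContextVariable> usedContextVariables(
    List<ContextVariable> stack, Iterable<String> keys) {
  final used = <ContextVariable>{};
  for (final key in keys) {
    // Mustache lookup: the first variable with a matching property wins.
    for (final variable in stack) {
      if (variable.properties.contains(key)) {
        used.add(variable);
        break;
      }
    }
  }
  // Keep the original stack order for the parameter list.
  return [
    for (final variable in stack)
      if (used.contains(variable)) variable
  ];
}

/// Writes the parameter list using only the used variables.
String parameterList(List<ContextVariable> used) =>
    used.map((v) => '${v.type} ${v.name}').join(', ');

void main() {
  final stack = [
    ContextVariable('context0', 'User', {'name', 'featuredPost'}),
    ContextVariable('context1', 'Post', {'title', 'authors'}),
    ContextVariable('context2', 'User', {'name', 'featuredPost'}),
    ContextVariable('context3', 'HomePageData', {'loggedInUser'}),
  ];
  // The `_author` partial only uses `name`.
  print(parameterList(usedContextVariables(stack, ['name'])));
  // prints: User context0
  // A partial using `title` and `loggedInUser` keeps non-adjacent names.
  print(parameterList(usedContextVariables(stack, ['title', 'loggedInUser'])));
  // prints: Post context1, HomePageData context3
}
```

Because the variable names are fixed before usage is known, the second call yields parameters named `context1` and `context3` with nothing in between.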
1042+
Note that there is a shortcoming of this implementation in the names of the
1043+
parameters of a partial renderer function. A given `_BlockCompiler` has a
1044+
context stack and a template. The context stack is a list of "variable lookup"
1045+
objects, which each describe a context variable's type and name. So before the
1046+
block compiler knows what the used context variables are, the names of all
1047+
context variables are hard-coded. The block compiler then generates statements
1048+
for the body of the function, using those variable names. Because of this
1049+
implementation, some partial renderer functions are created with a seemingly
1050+
arbitrary list of parameter names. For a given partial, maybe the 1st and 3rd
1051+
parameters (`context0` and `context2`) in the context stack are unused, and so
1052+
the two parameters left that the function is written to accept are called
1053+
`context1` and `context3`.
1054+
1055+
#### Deduplicating partials
1056+
1057+
The second optimization the AOT compiler makes is to deduplicate the partial
1058+
renderer functions. Generating an entire set of partial functions for every call
1059+
stack of each reference to each partial yields a lot of code. In most cases of
1060+
real Mustache templates, simplification is possible.
1061+
1062+
The idea is based on the Least Upper Bound (LUB) of Dart types. If we generate 3
1063+
renderer functions for a partial template, that each have a context stack with 2
1064+
context variables, we might be able to replace the 3 functions with a new
1065+
function that uses slightly different context stack types. In particular, it is
1066+
often the case that one template refers to a partial with type `A` as the
1067+
topmost context type, and that another template refers to the same partial with
1068+
type `B` as the topmost context type, and that `A` and `B` are closely related
1069+
(for example they share the same base class, which is not `Object`, or one is a
1070+
supertype of the other). So we can often get away with calculating the Least
1071+
Upper Bound of pairwise items in each context stack, creating a new context
1072+
stack. If the context stacks of our 3 renderer functions have types `T1, U1`,
1073+
`T2, U2`, and `T3, U3`, then we can create a new context stack with types
1074+
`LUB(T1, LUB(T2, T3)), LUB(U1, LUB(U2, U3))`. (Given an LUB function that can
1075+
take arbitrarily many types, this can be written `LUB(T1, ..., Tn)` for each
1076+
of `n` context types in the set of context stacks.)
1077+
1078+
Care must be taken however, as using an LUB type may escape beyond the static
1079+
type on which properties have been previously resolved. If the partial compiled
1080+
into the 3 renderer functions above refers to a property `foo`, and the LUB of
1081+
the individual types does not have any property `foo`, then the LUB type does
1082+
not work, and cannot be used. In practice though, this strategy allows us to
1083+
deduplicate many renderer functions for Dartdoc.
1084+
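To make the position-wise LUB and its caveat concrete, here is a toy model. It assumes a single-inheritance hierarchy expressed as strings; the `supertypes` and `declaredProperties` tables and every type name in them are invented for the example. Real Dart LUB computation is performed on analyzer types and is considerably more involved.

```dart
// Hypothetical hierarchy: each type maps to its direct supertype.
const supertypes = {
  'Post': 'Content',
  'Comment': 'Content',
  'Admin': 'User',
  'Content': 'Object',
  'User': 'Object',
};

// Hypothetical properties declared directly on each type.
const declaredProperties = {
  'Content': {'title'},
  'Comment': {'replyCount'},
  'User': {'name'},
};

/// The type itself followed by all of its supertypes, nearest first.
List<String> ancestry(String type) {
  final chain = [type];
  var current = type;
  while (supertypes.containsKey(current)) {
    current = supertypes[current]!;
    chain.add(current);
  }
  if (chain.last != 'Object') chain.add('Object');
  return chain;
}

/// Nearest common ancestor of [a] and [b] in the toy hierarchy.
String lub(String a, String b) {
  final ancestorsOfB = ancestry(b).toSet();
  return ancestry(a).firstWhere(ancestorsOfB.contains);
}

/// Position-wise LUB of several context stacks, or `null` when the stacks
/// do not all have the same length.
List<String>? contextStackLub(List<List<String>> stacks) {
  if (stacks.isEmpty) return null;
  final length = stacks.first.length;
  if (stacks.any((stack) => stack.length != length)) return null;
  return [
    for (var i = 0; i < length; i++) stacks.map((s) => s[i]).reduce(lub),
  ];
}

/// Whether [key] can be resolved on [type] or one of its supertypes.
bool hasProperty(String type, String key) =>
    ancestry(type).any((t) => declaredProperties[t]?.contains(key) ?? false);

void main() {
  final lubStack = contextStackLub([
    ['Post', 'User'],
    ['Comment', 'Admin'],
    ['Post', 'Admin'],
  ])!;
  print(lubStack); // [Content, User]
  // `title` lives on the LUB type, so one shared function would work.
  print(hasProperty(lubStack[0], 'title')); // true
  // `replyCount` only exists on Comment, so the LUB type cannot be used.
  print(hasProperty(lubStack[0], 'replyCount')); // false
}
```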
1085+
In the `codegen_aot_compiler.dart` source, here are the steps that carry out
1086+
this optimization:
1087+
1088+
1. After gathering the list of all `_AotCompiler` instances that each compiled a
1089+
renderer function (as Dart source code), we enter `_deduplicateRenderers` to
1090+
deduplicate the list.
1091+
2. This function first creates a new mapping that maps each partial's path to
1092+
the list of compilers that each compiled that partial to a renderer function,
1093+
and walks each entry in the map.
1094+
1. For each partial path and relevant list of compilers, we create a list of
1095+
the "used context stacks"; so the first item in this list is the used
1096+
context stack calculated by the first compiler, etc.
1097+
2. We then calculate the LUB of the types in each position in the list, with
1098+
the `contextStackLub` function. For example, if a list of used context
1099+
stacks has 3 context stacks (derived from 3 compilers), and each context
1100+
stack has 2 context variables, then the result is a context stack, again
1101+
with 2 context variables, such that the first context variable is the LUB
1102+
of the first variable in each of the 3 original context stacks, and the
1103+
second context variable is the LUB of the second variable in each of the 3
1104+
original context stacks. (If the context stacks in the list do not all
1105+
have exactly the same length, we say the "LUB context stack" is `null`,
1106+
and we cannot deduplicate the renderer functions.)
1107+
3. If the context stacks have some valid LUB context stack, then we may be
1108+
able to replace each renderer function that was compiled for this partial
1109+
with a single renderer function that uses the LUB context stack. We
1110+
proceed by creating a new `_AotCompiler` and a fresh, "deduplicated"
1111+
renderer name.
1112+
4. We try to compile the partial with the new deduplicated compiler. It is
1113+
possible that this fails: if the partial depended on properties that were
1114+
available on the individual context stacks, but are unavailable on the LUB
1115+
context stack, then compilation will fail. In this case, we can just keep
1116+
the individual renderer functions.
1117+
5. If the new deduplicated compiler successfully compiles a renderer
1118+
function, we move forward with it: for each replaced compiler, we replace
1119+
its renderer function with a "redirecting" renderer function, that simply
1120+
redirects to a call to the deduplicated renderer function.
1121+
6. In order to reduce the amount of generated code, we can also _remove_ any
1122+
partial renderer functions that were only referenced by _replaced_
1123+
partial renderer functions. This is calculated recursively.
1124+
3. Finally, the new mapping of compilers to compiled renderer functions is
1125+
passed back to `compileTemplatesToRenderers` to be written out.
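
As an illustration of step 2.5, the generated code after deduplication might resemble the sketch below. The `Content`, `Post`, and `Comment` types, the `renderHome` origin of the second partial, and the `_deduplicated_post` name are all hypothetical; the sketch only shows the shape of a "redirecting" renderer function.

```dart
// Hypothetical types: `Content` is the LUB of `Post` and `Comment`.
abstract class Content {
  String get title;
}

class Post implements Content {
  @override
  final String title;
  Post(this.title);
}

class Comment implements Content {
  @override
  final String title;
  Comment(this.title);
}

// The single renderer compiled against the LUB context stack.
String _deduplicated_post(Content context0) => '<h2>${context0.title}</h2>';

// Redirecting renderers: the original per-reference names are kept, so call
// sites do not change, but each body is now a one-line call.
String _renderUser_partial_post_0(Post context0) =>
    _deduplicated_post(context0);

String _renderHome_partial_post_0(Comment context0) =>
    _deduplicated_post(context0);

void main() {
  print(_renderUser_partial_post_0(Post('Hello')));
  print(_renderHome_partial_post_0(Comment('First!')));
}
```

Keeping the original names as thin redirects means already-compiled call sites stay valid, while the bulk of the template body is emitted only once.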
