Commit f4bcf2b

rewrite bootstrapping stages
1 parent 5be5475 commit f4bcf2b

1 file changed

+68
-48
lines changed

Diff for: src/building/bootstrapping.md

@@ -2,24 +2,39 @@
22

33
<!-- toc -->
44

5+
[Bootstrapping][boot] is the process of using a compiler to compile a later version
6+
of itself.
57

6-
[*Bootstrapping*][boot] is the process of using a compiler to compile itself.
7-
More accurately, it means using an older compiler to compile a newer version
8-
of the same compiler.
8+
This raises a [chicken-or-the-egg
9+
paradox](https://en.wikipedia.org/wiki/Chicken_or_the_egg): what Rust compiler
10+
was used to compile the very first Rust compiler? The answer is that the first
11+
compiler was not written in Rust. It was [written in OCaml][ocaml-compiler]. Of
12+
course, it has long been discarded and since then the only compiler that is
13+
able to compile some version of `rustc` is a slightly earlier version of
14+
`rustc`.
915

10-
This raises a chicken-and-egg paradox: where did the first compiler come from?
11-
It must have been written in a different language. In Rust's case it was
12-
[written in OCaml][ocaml-compiler]. However it was abandoned long ago and the
13-
only way to build a modern version of rustc is a slightly less modern
14-
version.
16+
For this purpose a Python script `x.py` is provided at the root of the
17+
repository. `x.py` downloads a pre-compiled compiler—the stage 0 compiler—and
18+
with it compiles from the current source code a compiler—the stage 1 compiler.
19+
Additionally, it may use the stage 1 compiler to compile from the current source
20+
code another compiler, the stage 2 compiler. The sections below describe this process in
21+
some detail, including the reason for a stage 2 compiler and more.
1522

16-
This is exactly how `x.py` works: it downloads the current beta release of
17-
rustc, then uses it to compile the new compiler.
23+
## The stages of bootstrapping
1824

19-
## Stages of bootstrapping
25+
Each stage involves:
26+
- An existing compiler and its set of dependencies.
27+
- Targets ([objects][objects]): `std` and `rustc`.
2028

21-
Compiling `rustc` is done in stages. Here's a diagram, adapted from Joshua Nelson's
22-
[talk on bootstrapping][rustconf22-talk] at RustConf 2022, with detailed explanations below.
29+
Note: the compiler of a stage—e.g. "the stage 1 compiler"—refers to the
30+
compiler that is compiled at that stage, not the one that already exists.
31+
32+
In the first stage (stage 0) the compiler is usually obtained by downloading a
33+
pre-compiled one. In following stages the compiler is usually the one that was
34+
compiled in the previous stage.
35+
36+
Here's a diagram, adapted from Joshua Nelson's [talk on bootstrapping][rustconf22-talk]
37+
at RustConf 2022, with detailed explanations below.
2338

2439
The `A`, `B`, `C`, and `D` show the ordering of the stages of bootstrapping.
2540
<span style="background-color: lightblue; color: black">Blue</span> nodes are downloaded,
@@ -47,54 +62,59 @@ graph TD
4762
classDef with-s1c fill: lightgreen;
4863
```
4964

50-
### Stage 0
65+
[objects]: https://en.wikipedia.org/wiki/Object_code
66+
67+
### The stages: how each compiler is obtained
68+
69+
#### Stage 0: the pre-compiled compiler
5170

52-
The stage0 compiler is usually the current _beta_ `rustc` compiler
53-
and its associated dynamic libraries,
54-
which `x.py` will download for you.
55-
(You can also configure `x.py` to use something else.)
71+
A pre-compiled compiler and its set of dependencies are downloaded. By default,
72+
it is the current beta release. This is the stage 0 compiler.
5673

57-
The stage0 compiler is then used only to compile `rustbuild`, `std`, and `rustc`.
58-
When compiling `rustc`, the stage0 compiler uses the freshly compiled `std`.
59-
There are two concepts at play here:
60-
a compiler (with its set of dependencies)
61-
and its 'target' or 'object' libraries (`std` and `rustc`).
62-
Both are staged, but in a staggered manner.
74+
#### Stage 1: from current code, by an earlier compiler
6375

64-
### Stage 1
76+
The stage 0 compiler compiles from current code `rustbuild` and `std` and uses
77+
them to compile from current code a compiler. This is the stage 1 compiler.
6578

66-
The rustc source code is then compiled with the stage0 compiler to produce the stage1 compiler.
79+
The stage 1 compiler is the first that is from current code. Yet, it is not
80+
entirely up-to-date, because the compiler that compiled it is of earlier code.
81+
More on this below.
6782

68-
### Stage 2
83+
#### Stage 2: the truly current compiler
6984

70-
We then rebuild our stage1 compiler with itself to produce the stage2 compiler.
85+
By default, the stage 1 libraries are copied into stage 2, because they are
86+
expected to be identical.
7187

72-
In theory, the stage1 compiler is functionally identical to the stage2 compiler,
73-
but in practice there are subtle differences.
74-
In particular, the stage1 compiler itself was built by stage0
75-
and hence not by the source in your working directory.
76-
This means that the symbol names used in the compiler source
77-
may not match the symbol names that would have been made by the stage1 compiler,
78-
which can cause problems for dynamic libraries and tests.
88+
The stage 1 compiler is used to compile from current code a compiler. This is
89+
the stage 2 compiler.
7990

80-
The `stage2` compiler is the one distributed with `rustup` and all other install methods.
81-
However, it takes a very long time to build
82-
because one must first build the new compiler with an older compiler
83-
and then use that to build the new compiler with itself.
84-
For development, you usually only want the `stage1` compiler,
85-
which you can build with `./x.py build library`.
91+
The stage 2 compiler is the first that is both from current code and compiled
92+
by a compiler that is of current code. The compilers and libraries obtained by
93+
`rustup` and other installation methods are all stage 2.
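
The chain of compilers described above can be modelled in a few lines. This is only an illustrative sketch, not how `x.py` is implemented: the `Compiler` class, the `bootstrap` and `is_truly_current` functions, and the `"beta"`/`"current"` labels are hypothetical names introduced here. Each compiler records which source it was built from and which compiler built it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Compiler:
    """Hypothetical model of one compiler in the bootstrap chain."""
    stage: int
    source: str                      # "beta" (pre-compiled) or "current"
    built_by: Optional["Compiler"]   # None for the downloaded stage 0 compiler


def bootstrap(max_stage: int = 2) -> List[Compiler]:
    """Return the compilers for stages 0..max_stage."""
    # Stage 0 is downloaded, so nothing in this model built it.
    chain = [Compiler(stage=0, source="beta", built_by=None)]
    for n in range(1, max_stage + 1):
        # Each later compiler is built from current code by the previous one.
        chain.append(Compiler(stage=n, source="current", built_by=chain[-1]))
    return chain


def is_truly_current(c: Compiler) -> bool:
    """True if c is from current code AND was built by a current-code compiler."""
    return (
        c.source == "current"
        and c.built_by is not None
        and c.built_by.source == "current"
    )


if __name__ == "__main__":
    for c in bootstrap():
        print(c.stage, c.source, is_truly_current(c))
```

In this model only the stage 2 compiler satisfies `is_truly_current`: stage 1 is from current code but was built by the stage 0 compiler, which is of earlier code.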
94+
95+
For most purposes a stage 1 compiler would suffice: `x.py build library`.
8696
See [Building the Compiler](./how-to-build-and-run.html#building-the-compiler).
97+
There are subtle differences between the stage 1 and the stage 2 compiler:
98+
99+
The symbol names used in the compiler source may not match the symbol names
100+
that would have been made by the stage 1 compiler. This matters when using
101+
dynamic linking, because ABI compatibility is not guaranteed between versions. This
102+
primarily manifests when tests try to link with any of the `rustc_*` crates or
103+
use the (now deprecated) plugin infrastructure. These tests are marked with
104+
`ignore-stage1`.
105+
106+
Also, the stage 2 compiler benefits from the compile-time optimizations
107+
generated by a compiler that is of the current code.
87108

88-
### Stage 3
109+
#### Stage 3: the same-result test
89110

90-
Stage 3 is optional. To sanity check our new compiler, we
91-
can build the libraries with the stage2 compiler. The result ought
92-
to be identical to before, unless something has broken.
111+
To verify that the stage 2 libraries that were copied from stage 1 are indeed
112+
identical to those which would otherwise have been compiled in stage 2, the
113+
stage 2 compiler is used to compile them and a comparison is made.
93114

94115
### Building the stages
95116

96-
`x.py` tries to be helpful and pick the stage you most likely meant for each subcommand.
97-
These defaults are as follows:
117+
`x.py` provides a reasonable default stage for each subcommand:
98118

99119
- `check`: `--stage 0`
100120
- `doc`: `--stage 0`
@@ -104,7 +124,7 @@ These defaults are as follows:
104124
- `install`: `--stage 2`
105125
- `bench`: `--stage 2`
106126

107-
You can always override the stage by passing `--stage N` explicitly.
127+
Of course, these can be overridden by passing `--stage <number>`.
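
The default-then-override behaviour can be sketched as a simple lookup. This is a hypothetical illustration, not the logic `x.py` actually uses: `DEFAULT_STAGE` and `resolve_stage` are names invented here, and the table contains only the four defaults visible in this diff (the hunk elides the rest of the list).

```python
from typing import Optional

# Only the defaults shown in this diff; the other subcommands are elided
# by the hunk and are deliberately left out rather than guessed.
DEFAULT_STAGE = {
    "check": 0,
    "doc": 0,
    "install": 2,
    "bench": 2,
}


def resolve_stage(subcommand: str, explicit: Optional[int] = None) -> int:
    """Return the stage to use: an explicit `--stage <number>` wins,
    otherwise fall back to the subcommand's default."""
    if explicit is not None:
        return explicit
    return DEFAULT_STAGE[subcommand]


if __name__ == "__main__":
    print(resolve_stage("check"))               # uses the default
    print(resolve_stage("check", explicit=1))   # explicit stage overrides it
```

An unknown subcommand raises `KeyError` in this sketch, which keeps the table honest about what it does not cover.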
108128

109129
For more information about stages, [see below](#understanding-stages-of-bootstrap).
110130
