Commit c423c56

Tbkhi, SparrowLii, and jieyouxu authored
Update parallel-rustc.md (rust-lang#1926)
Co-authored-by: SparrowLii <[email protected]>
Co-authored-by: Jieyou Xu <[email protected]>
1 parent 5d7107b commit c423c56

1 file changed

+70
-51
lines changed

src/parallel-rustc.md

@@ -1,31 +1,46 @@
11
# Parallel Compilation
22

3-
As of <!-- date-check --> August 2022, the only stage of the compiler that
4-
is already parallel is codegen. Some parts of the compiler already have
5-
parallel implementations, such as query evaluation, type check and
6-
monomorphization, but the general version of the compiler does not include
7-
these parallelization functions. **To try out the current parallel compiler**,
8-
one can install rustc from source code with `parallel-compiler = true` in
9-
the `config.toml`.
3+
<div class="warning">
4+
The parallel front-end is currently (as of November 2024) undergoing significant
5+
changes, so this page contains quite a bit of outdated information.
106

11-
The lack of parallelism at other stages (for example, macro expansion) also
12-
represents an opportunity for improving compiler performance.
7+
Tracking issue: <https://github.com/rust-lang/rust/issues/113349>
8+
</div>
139

14-
These next few sections describe where and how parallelism is currently used,
15-
and the current status of making parallel compilation the default in `rustc`.
10+
As of <!-- date-check --> November 2024, most of the Rust compiler is now
11+
parallelized.
1612

17-
## Codegen
13+
- The codegen part is executed concurrently by default. You can use the `-C
14+
codegen-units=n` option to control the number of concurrent tasks.
15+
- The parts after HIR lowering to codegen such as type checking, borrow
16+
checking, and MIR optimization are parallelized in the nightly version.
17+
Currently, they are executed in serial by default, and parallelization is
18+
manually enabled by the user using the `-Z threads=n` option.
19+
- Other parts, such as lexical parsing, HIR lowering, and macro expansion, are
20+
still executed in serial mode.
1821

19-
During [monomorphization][monomorphization] the compiler splits up all the code to
22+
<div class="warning">
23+
The following sections are kept for now but are quite outdated.
24+
</div>
25+
26+
---
27+
28+
[codegen]: backend/codegen.md
29+
30+
## Code Generation
31+
32+
During monomorphization the compiler splits up all the code to
2033
be generated into smaller chunks called _codegen units_. These are then generated by
2134
independent instances of LLVM running in parallel. At the end, the linker
2235
is run to combine all the codegen units together into one binary. This process
23-
occurs in the `rustc_codegen_ssa::base` module.
36+
occurs in the [`rustc_codegen_ssa::base`] module.
37+
38+
[`rustc_codegen_ssa::base`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/index.html
2439

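To make the codegen-unit idea concrete, here is a minimal sketch that splits a list of items into chunks and processes each chunk on its own thread before combining the results. It is not how `rustc_codegen_ssa` is implemented: `codegen_unit` and `compile_in_units` are hypothetical names, summing integers stands in for LLVM code generation, and the final sum stands in for linking.

```rust
use std::thread;

/// Hypothetical stand-in for generating code for one unit:
/// it only sums the "sizes" of the items in the unit.
fn codegen_unit(items: &[u32]) -> u32 {
    items.iter().sum()
}

/// Split `items` into at most `units` chunks and process each chunk
/// on its own scoped thread. Results come back in chunk order.
fn compile_in_units(items: &[u32], units: usize) -> Vec<u32> {
    let units = units.max(1);
    // Ceiling division, never zero, so `chunks` does not panic.
    let chunk_size = ((items.len() + units - 1) / units).max(1);
    thread::scope(|s| {
        let handles: Vec<_> = items
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || codegen_unit(chunk)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let items: Vec<u32> = (1..=10).collect();
    let objects = compile_in_units(&items, 4);
    // "Linking" here is just combining the per-unit results.
    let linked: u32 = objects.iter().sum();
    println!("{} units, combined result = {}", objects.len(), linked);
}
```

The number of chunks plays the role that `-C codegen-units=n` plays for the real compiler: more units allow more parallelism, at the cost of each unit seeing less of the program.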
2540
## Data Structures
2641

2742
The underlying thread-safe data-structures used in the parallel compiler
28-
can be found in the `rustc_data_structures::sync` module. These data structures
43+
can be found in the [`rustc_data_structures::sync`] module. These data structures
2944
are implemented differently depending on whether `parallel-compiler` is true.
3045

3146
| data structure | parallel | non-parallel |
@@ -45,34 +60,39 @@ are implemented differently depending on whether `parallel-compiler` is true.
4560
| LockGuard | parking_lot::MutexGuard | std::cell::RefMut |
4661
| MappedLockGuard | parking_lot::MappedMutexGuard | std::cell::RefMut |
4762

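The table can be read as "same API, two backing types". The sketch below illustrates that idea with two toy `Lock` types, one per flavour. It is an illustration only: the module names are made up, `std::sync::Mutex` is used in place of `parking_lot::Mutex` to keep the example dependency-free, and the real compiler selects one implementation at build time rather than shipping both side by side.

```rust
/// Non-parallel flavour: no synchronisation, borrows checked at run time.
mod serial {
    use std::cell::{RefCell, RefMut};

    pub struct Lock<T>(RefCell<T>);

    impl<T> Lock<T> {
        pub fn new(value: T) -> Self {
            Lock(RefCell::new(value))
        }
        pub fn lock(&self) -> RefMut<'_, T> {
            self.0.borrow_mut()
        }
    }
}

/// Parallel flavour: a real mutex (std's, standing in for parking_lot's).
mod parallel {
    use std::sync::{Mutex, MutexGuard};

    pub struct Lock<T>(Mutex<T>);

    impl<T> Lock<T> {
        pub fn new(value: T) -> Self {
            Lock(Mutex::new(value))
        }
        pub fn lock(&self) -> MutexGuard<'_, T> {
            self.0.lock().unwrap()
        }
    }
}

/// Single-threaded use of the serial lock.
fn bump_serial() -> u32 {
    let lock = serial::Lock::new(0u32);
    *lock.lock() += 1;
    let value = *lock.lock();
    value
}

/// The same call pattern, but shared between several threads.
fn bump_parallel(threads: usize) -> usize {
    let lock = parallel::Lock::new(0usize);
    std::thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                *lock.lock() += 1;
            });
        }
    });
    let total = *lock.lock();
    total
}

fn main() {
    println!("serial: {}", bump_serial());
    println!("parallel: {}", bump_parallel(8));
}
```

Callers write `lock.lock()` in both cases. Only the parallel flavour is `Sync`, so only it can be shared across the scoped threads.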
48-
- These thread-safe data structures interspersed during compilation can
49-
cause a lot of lock contention, which actually degrades performance as the
50-
number of threads increases beyond 4. This inspires us to audit the use
51-
of these data structures, leading to either refactoring to reduce use of
52-
shared state, or persistent documentation covering invariants, atomicity,
53-
and lock orderings.
63+
- These thread-safe data structures are interspersed during compilation which
64+
can cause lock contention resulting in degraded performance as the number of
65+
threads increases beyond 4. So we audit the use of these data structures
66+
which leads to either a refactoring so as to reduce the use of shared state,
67+
or the authoring of persistent documentation covering the specifics of the
68+
invariants, the atomicity, and the lock orderings.
5469

5570
- On the other hand, we still need to figure out what other invariants
5671
during compilation might not hold in parallel compilation.
5772

58-
### WorkLocal
73+
[`rustc_data_structures::sync`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/sync/index.html
5974

60-
`WorkLocal` is a special data structure implemented for parallel compiler.
61-
It holds worker-locals values for each thread in a thread pool. You can only
62-
access the worker local value through the Deref impl on the thread pool it
63-
was constructed on. It will panic otherwise.
75+
### WorkerLocal
6476

65-
`WorkLocal` is used to implement the `Arena` allocator in the parallel
66-
environment, which is critical in parallel queries. Its implementation
67-
is located in the `rustc-rayon-core::worker_local` module. However, in the
68-
non-parallel compiler, it is implemented as `(OneThread<T>)`, whose `T`
77+
[`WorkerLocal`] is a special data structure implemented for the parallel compiler. It
78+
holds worker-local values for each thread in a thread pool. You can only
79+
access the worker local value through the `Deref` `impl` on the thread pool it
80+
was constructed on. It panics otherwise.
81+
82+
`WorkerLocal` is used to implement the `Arena` allocator in the parallel
83+
environment, which is critical in parallel queries. Its implementation is
84+
located in the [`rustc_data_structures::sync::worker_local`] module. However,
85+
in the non-parallel compiler, it is implemented as `(OneThread<T>)`, whose `T`
6986
can be accessed directly through `Deref::deref`.
7087

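The behaviour described above (one value per worker, reached through `Deref`, with a panic when accessed from the wrong place) can be sketched with a toy type. This is not the `rustc_data_structures` implementation: the thread-local `WORKER_INDEX`, the constructor signature, and the hand-rolled "pool" of scoped threads are all assumptions made for illustration, and the slots hold atomics so that no `unsafe` code is needed.

```rust
use std::cell::Cell;
use std::ops::Deref;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

thread_local! {
    // Index of the current thread inside the toy pool; `None` outside of it.
    static WORKER_INDEX: Cell<Option<usize>> = Cell::new(None);
}

/// Toy worker-local storage: one slot per worker thread.
struct WorkerLocal<T> {
    locals: Vec<T>,
}

impl<T> WorkerLocal<T> {
    fn new(workers: usize, mut init: impl FnMut() -> T) -> Self {
        WorkerLocal { locals: (0..workers).map(|_| init()).collect() }
    }
}

impl<T> Deref for WorkerLocal<T> {
    type Target = T;

    /// Returns the slot of the calling worker; panics outside the pool.
    fn deref(&self) -> &T {
        let idx = WORKER_INDEX
            .with(|i| i.get())
            .expect("WorkerLocal accessed outside its thread pool");
        &self.locals[idx]
    }
}

/// Each worker bumps only its own counter, so there is no contention.
fn count_per_worker(workers: usize, jobs_per_worker: usize) -> Vec<usize> {
    let counters = WorkerLocal::new(workers, || AtomicUsize::new(0));
    thread::scope(|s| {
        for idx in 0..workers {
            let counters = &counters;
            s.spawn(move || {
                WORKER_INDEX.with(|i| i.set(Some(idx)));
                for _ in 0..jobs_per_worker {
                    // `fetch_add` resolves through `Deref` to this worker's slot.
                    counters.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    counters.locals.iter().map(|c| c.load(Ordering::Relaxed)).collect()
}

fn main() {
    println!("{:?}", count_per_worker(3, 5));
}
```

Because every worker touches only its own slot, per-thread state such as an arena can be kept without a shared lock.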
88+
[`rustc_data_structures::sync::worker_local`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/sync/worker_local/index.html
89+
[`WorkerLocal`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_data_structures/sync/worker_local/struct.WorkerLocal.html
90+
7191
## Parallel Iterator
7292

73-
The parallel iterators provided by the [`rayon`] crate are easy ways
74-
to implement parallelism. In the current implementation of the parallel
75-
compiler we use a custom [fork][rustc-rayon] of [`rayon`] to run tasks in parallel.
93+
The parallel iterators provided by the [`rayon`] crate are easy ways to
94+
implement parallelism. In the current implementation of the parallel compiler
95+
we use a custom [fork][rustc-rayon] of `rayon` to run tasks in parallel.
7696

7797
Some iterator functions are implemented to run loops in parallel
7898
when `parallel-compiler` is true.
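
As a rough picture of what such a helper does, the sketch below runs a closure over every element of a slice using several threads. It uses plain `std::thread::scope` instead of `rayon`, and `par_for_each` and `sum_squares` are hypothetical names rather than functions from the compiler.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Run `f` on every item, splitting the slice across up to `threads` threads.
/// `f` must be `Sync` because all threads call it through a shared reference.
fn par_for_each<T: Sync>(items: &[T], threads: usize, f: impl Fn(&T) + Sync) {
    let threads = threads.max(1);
    // Ceiling division, never zero, so `chunks` does not panic.
    let chunk = ((items.len() + threads - 1) / threads).max(1);
    let f = &f;
    thread::scope(|s| {
        for part in items.chunks(chunk) {
            s.spawn(move || {
                for item in part {
                    f(item);
                }
            });
        }
    });
}

/// Example caller: accumulate into shared state through an atomic.
fn sum_squares(items: &[u64], threads: usize) -> u64 {
    let total = AtomicU64::new(0);
    par_for_each(items, threads, |x| {
        total.fetch_add(x * x, Ordering::Relaxed);
    });
    total.load(Ordering::Relaxed)
}

fn main() {
    println!("{}", sum_squares(&[1, 2, 3, 4], 3));
}
```

The `Fn + Sync` bound is the important part: the closure can only reach shared state through thread-safe types, which is why the data structures of the previous section matter for these loops.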
@@ -88,10 +108,9 @@ when `parallel-compiler` is true.
88108
| **ModuleItems::par_impl_items**(&self, f: impl Fn(ImplItemId)) | run `f` on all impl items in the module | rustc_middle::hir |
89109
| **ModuleItems::par_foreign_items**(&self, f: impl Fn(ForeignItemId)) | run `f` on all foreign items in the module | rustc_middle::hir |
90110

91-
There are a lot of loops in the compiler which can possibly be
92-
parallelized using these functions. As of <!-- date-check--> August
93-
2022, scenarios where the parallel iterator function has been used
94-
are as follows:
111+
There are a lot of loops in the compiler which can possibly be parallelized
112+
using these functions. As of <!-- date-check--> August 2022, scenarios where
113+
the parallel iterator function has been used are as follows:
95114

96115
| caller | scenario | callee |
97116
| ------------------------------------------------------- | ------------------------------------------------------------ | ------------------------ |
@@ -113,9 +132,9 @@ There are still many loops that have the potential to use parallel iterators.
113132
## Query System
114133

115134
The query model has some properties that make it actually feasible to evaluate
116-
multiple queries in parallel without too much of an effort:
135+
multiple queries in parallel without too much effort:
117136

118-
- All data a query provider can access is accessed via the query context, so
137+
- All data a query provider can access is via the query context, so
119138
the query context can take care of synchronizing access.
120139
- Query results are required to be immutable so they can safely be used by
121140
different threads concurrently.
@@ -135,31 +154,31 @@ When a query `foo` is evaluated, the cache table for `foo` is locked.
135154
the compiler uses an extra thread *(named deadlock handler)* to detect, remove and
136155
report the cycle error.
137156

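The two properties listed above (access goes through a shared context, and results are immutable) can be sketched with a toy query cache. This is a simplification under stated assumptions: `QueryCache` and `get_or_compute` are hypothetical names, the table stays locked for the whole computation, and there is no cycle detection or deadlock handler.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Toy cache for a single query kind, keyed by `u64`.
struct QueryCache {
    table: Mutex<HashMap<u64, Arc<u64>>>,
    /// Counts how often the provider actually ran.
    computed: AtomicUsize,
}

impl QueryCache {
    fn new() -> Self {
        QueryCache { table: Mutex::new(HashMap::new()), computed: AtomicUsize::new(0) }
    }

    /// Look up `key`; run `provider` only on a cache miss.
    /// The result is shared as an immutable `Arc`.
    fn get_or_compute(&self, key: u64, provider: impl FnOnce(u64) -> u64) -> Arc<u64> {
        // Simplification: the lock is held while the provider runs.
        let mut table = self.table.lock().unwrap();
        if let Some(hit) = table.get(&key) {
            return Arc::clone(hit);
        }
        let value = Arc::new(provider(key));
        self.computed.fetch_add(1, Ordering::Relaxed);
        table.insert(key, Arc::clone(&value));
        value
    }
}

/// Several threads ask for the same query; it should be computed once.
fn demo(threads: usize) -> (Vec<u64>, usize) {
    let cache = QueryCache::new();
    let values = thread::scope(|s| {
        let mut handles = Vec::new();
        for _ in 0..threads {
            handles.push(s.spawn(|| *cache.get_or_compute(7, |k| k * k)));
        }
        handles.into_iter().map(|h| h.join().unwrap()).collect::<Vec<u64>>()
    });
    (values, cache.computed.load(Ordering::Relaxed))
}

fn main() {
    let (values, computed) = demo(4);
    println!("values = {:?}, provider ran {} time(s)", values, computed);
}
```

Holding the lock during the computation is what would turn a query cycle into a deadlock in this toy version, which hints at why a separate deadlock-handling thread is needed in the design described above.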
138-
Parallel query still has a lot of work to do, most of which is related to
139-
the previous `Data Structures` and `Parallel Iterators`. See [this tracking issue][tracking].
157+
The parallel query feature still has implementation work to do, most of which is
158+
related to the previous `Data Structures` and `Parallel Iterators`. See [this
159+
open feature tracking issue][tracking].
140160

141161
## Rustdoc
142162

143-
As of <!-- date-check--> November 2022, there are still a number of steps
144-
to complete before rustdoc rendering can be made parallel. More details on
145-
this issue can be found [here][parallel-rustdoc].
163+
As of <!-- date-check--> November 2022, there are still a number of steps to
164+
complete before `rustdoc` rendering can be made parallel (see an open discussion
165+
of [parallel `rustdoc`][parallel-rustdoc]).
146166

147167
## Resources
148168

149-
Here are some resources that can be used to learn more (note that some of them
150-
are a bit out of date):
169+
Here are some resources that can be used to learn more:
151170

171+
- [This IRLO thread by alexcrichton about performance][irlo1]
152172
- [This IRLO thread by Zoxc, one of the pioneers of the effort][irlo0]
153173
- [This list of interior mutability in the compiler by nikomatsakis][imlist]
154-
- [This IRLO thread by alexchricton about performance][irlo1]
155174

156175
[`rayon`]: https://crates.io/crates/rayon
157-
[rustc-rayon]: https://github.com/rust-lang/rustc-rayon
158-
[irlo0]: https://internals.rust-lang.org/t/parallelizing-rustc-using-rayon/6606
176+
[Arc]: https://doc.rust-lang.org/std/sync/struct.Arc.html
159177
[imlist]: https://github.com/nikomatsakis/rustc-parallelization/blob/master/interior-mutability-list.md
178+
[irlo0]: https://internals.rust-lang.org/t/parallelizing-rustc-using-rayon/6606
160179
[irlo1]: https://internals.rust-lang.org/t/help-test-parallel-rustc/11503
161-
[tracking]: https://github.com/rust-lang/rust/issues/48685
162180
[monomorphization]: backend/monomorph.md
163181
[parallel-rustdoc]: https://github.com/rust-lang/rust/issues/82741
164-
[Arc]: https://doc.rust-lang.org/std/sync/struct.Arc.html
165182
[Rc]: https://doc.rust-lang.org/std/rc/struct.Rc.html
183+
[rustc-rayon]: https://github.com/rust-lang/rustc-rayon
184+
[tracking]: https://github.com/rust-lang/rust/issues/48685