# Parallel Compilation

As of <!-- date-check --> August 2022, the only stage of the compiler that
- is already parallel is codegen. Some other parts of the nightly compiler
- have parallel implementations, such as query evaluation, type check and
- monomorphization, but there is still a lot of work to be done. The lack of
- parallelism at other stages (for example, macro expansion) also represents
- an opportunity for improving compiler performance.
+ is already parallel is codegen. Some parts of the compiler already have
+ parallel implementations, such as query evaluation, type check and
+ monomorphization, but the general version of the compiler does not include
+ these parallelization functions. **To try out the current parallel compiler**,
+ one can install rustc from source code with `parallel-compiler = true` in
+ the `config.toml`.

- **To try out the current parallel compiler**, one can install rustc from
- source code with `parallel-compiler = true` in the `config.toml`.
+ The lack of parallelism at other stages (for example, macro expansion) also
+ represents an opportunity for improving compiler performance.

These next few sections describe where and how parallelism is currently used,
and the current status of making parallel compilation the default in `rustc`.
@@ -45,9 +46,15 @@ are implemented differently depending on whether `parallel-compiler` is true.
| MappedLockGuard | parking_lot::MappedMutexGuard | std::cell::RefMut |
| MetadataRef | [`OwningRef<Box<dyn Erased + Send + Sync>, [u8]>`][OwningRef] | [`OwningRef<Box<dyn Erased>, [u8]>`][OwningRef] |

- - There are currently a lot of global data structures that need to be made
- thread-safe. A key strategy here has been converting interior-mutable
- data-structures (e.g. `Cell`) into their thread-safe siblings (e.g. `Mutex`).
+ - These thread-safe data structures interspersed during compilation can
+ cause a lot of lock contention, which actually degrades performance as the
+ number of threads increases beyond 4. This inspires us to audit the use
+ of these data structures, leading to either refactoring to reduce use of
+ shared state, or persistent documentation covering invariants, atomicity,
+ and lock orderings.
+
+ - On the other hand, we still need to figure out what other invariants
+ during compilation might not hold in parallel compilation.
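As an illustration of the table rows and bullets in this hunk, here is a minimal sketch of one `Lock`-style wrapper with two interchangeable implementations. It is not the compiler's actual code: the modules `serial` and `parallel`, the method `with_lock`, and the helpers `count_serial` and `count_parallel` are hypothetical names, and `std::sync::Mutex` stands in for the `parking_lot` types listed in the table. The compiler picks one flavour at build time; both are shown side by side here so the sketch stays self-contained.

```rust
use std::thread;

/// Hypothetical single-threaded flavour: interior mutability without locking.
pub mod serial {
    use std::cell::RefCell;

    pub struct Lock<T>(RefCell<T>);

    impl<T> Lock<T> {
        pub fn new(value: T) -> Self {
            Lock(RefCell::new(value))
        }

        /// Runs `f` with mutable access to the wrapped value.
        pub fn with_lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
            f(&mut *self.0.borrow_mut())
        }
    }
}

/// Hypothetical thread-safe flavour with the same interface.
/// `std::sync::Mutex` is an assumption standing in for `parking_lot`.
pub mod parallel {
    use std::sync::Mutex;

    pub struct Lock<T>(Mutex<T>);

    impl<T> Lock<T> {
        pub fn new(value: T) -> Self {
            Lock(Mutex::new(value))
        }

        /// Runs `f` while holding the mutex, so callers are serialized.
        pub fn with_lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
            f(&mut *self.0.lock().unwrap())
        }
    }
}

/// Increments a counter `steps` times on the current thread.
pub fn count_serial(steps: u64) -> u64 {
    let counter = serial::Lock::new(0u64);
    for _ in 0..steps {
        counter.with_lock(|n| *n += 1);
    }
    counter.with_lock(|n| *n)
}

/// Increments a shared counter from several threads. Every increment takes
/// the same lock, which is the kind of contention the bullets above describe.
pub fn count_parallel(threads: u64, steps: u64) -> u64 {
    let counter = parallel::Lock::new(0u64);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..steps {
                    counter.with_lock(|n| *n += 1);
                }
            });
        }
    });
    counter.with_lock(|n| *n)
}
```

Calling `count_parallel(4, 1000)` gives the same total as four sequential runs of `count_serial(1000)`, but every increment waits on one mutex. Reducing shared state, as the added text proposes, removes that serialization point rather than making the lock faster.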

### WorkLocal
@@ -64,10 +71,10 @@ can be accessed directly through `Deref::deref`.
## Parallel Iterator

- The parallel iterators provided by the [`rayon`] crate are efficient
- ways to achieve parallelization. The current nightly rustc uses (a custom
- fork of) [`rayon`] to run tasks in parallel. The custom fork allows the
- execution of DAGs of tasks, not just trees.
+ The parallel iterators provided by the [`rayon`] crate are easy ways
+ to implement parallelism. In the current implementation of the parallel
+ compiler we use a custom fork of [`rayon`] to run tasks in parallel.
+ *(more information wanted here)*

Some iterator functions are implemented in the current nightly compiler to
run loops in parallel when `parallel-compiler` is true.
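The idea of a loop helper that runs in parallel only when the parallel mode is enabled can be sketched as follows. This is an assumption-laden illustration, not the compiler's helpers: `for_each_maybe_parallel` and `sum_of_squares` are hypothetical names, a runtime `parallel` flag stands in for the build-time `parallel-compiler` setting, and scoped threads from the standard library replace the `rayon` fork.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical helper: applies `f` to every item, splitting the slice across
/// `threads` scoped threads when `parallel` is true and looping otherwise.
pub fn for_each_maybe_parallel<T, F>(items: &[T], parallel: bool, threads: usize, f: F)
where
    T: Sync,
    F: Fn(&T) + Sync,
{
    if !parallel || threads <= 1 || items.is_empty() {
        // Sequential fallback: same observable effect, no threads involved.
        items.iter().for_each(|item| f(item));
        return;
    }

    // Round up so that every item lands in exactly one chunk.
    let chunk_len = (items.len() + threads - 1) / threads;
    thread::scope(|s| {
        for chunk in items.chunks(chunk_len) {
            let f = &f;
            s.spawn(move || chunk.iter().for_each(|item| f(item)));
        }
    });
}

/// Example caller: the closure must be safe to call from several threads,
/// so the accumulator is an atomic rather than a plain integer.
pub fn sum_of_squares(values: &[u64], parallel: bool) -> u64 {
    let total = AtomicU64::new(0);
    for_each_maybe_parallel(values, parallel, 4, |v| {
        total.fetch_add(v * v, Ordering::Relaxed);
    });
    total.load(Ordering::Relaxed)
}
```

Both modes produce the same result for order-independent work such as `sum_of_squares`. The `Sync` bounds on `T` and `F` are what force callers to use thread-safe state, which ties this helper back to the data structures discussed earlier.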
@@ -124,9 +131,11 @@ When a query `foo` is evaluated, the cache table for `foo` is locked.
start evaluating.
- If there *is* another query invocation for the same key in progress, we
release the lock, and just block the thread until the other invocation has
- computed the result we are waiting for. **Deadlocks are possible**, in which
- case `rustc_query_system::query::job::deadlock()` will be called to detect
- and remove the deadlock and then return cycle error as the query result.
+ computed the result we are waiting for. **Cycle error detection** in the parallel
+ compiler requires more complex logic than in single-threaded mode. When
+ worker threads in parallel queries stop making progress due to interdependence,
+ the compiler uses an extra thread *(named deadlock handler)* to detect, remove and
+ report the cycle error.

Parallel query still has a lot of work to do, most of which is related to
the previous `Data Structures` and `Parallel Iterators`. See [this tracking issue][tracking].
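The behaviour described in this hunk for a key that is already being computed (release the lock, block, then reuse the other invocation's result) can be sketched with a mutex-protected table and a condition variable. Everything here is hypothetical and greatly simplified: `QueryCache`, `Slot`, `get_or_compute` and `demo_two_callers` are illustrative names, results are plain `u64` values, and the sketch has no cycle detection or deadlock handler, so a query that depended on itself would block forever.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Condvar, Mutex};
use std::thread;

/// State of one key in the (hypothetical) cache table.
#[derive(Clone, Copy)]
enum Slot {
    InProgress,
    Done(u64),
}

pub struct QueryCache {
    slots: Mutex<HashMap<u32, Slot>>,
    finished: Condvar,
}

impl QueryCache {
    pub fn new() -> Self {
        QueryCache {
            slots: Mutex::new(HashMap::new()),
            finished: Condvar::new(),
        }
    }

    /// Returns the cached result, waits for an in-progress computation of the
    /// same key, or computes the result itself if nobody has started yet.
    pub fn get_or_compute(&self, key: u32, compute: impl FnOnce(u32) -> u64) -> u64 {
        let mut slots = self.slots.lock().unwrap();
        loop {
            match slots.get(&key).copied() {
                Some(Slot::Done(value)) => return value,
                // `wait` releases the table lock while this thread is blocked.
                Some(Slot::InProgress) => slots = self.finished.wait(slots).unwrap(),
                None => break,
            }
        }
        // Mark the key as in progress, then release the lock before computing.
        slots.insert(key, Slot::InProgress);
        drop(slots);

        let value = compute(key);
        self.slots.lock().unwrap().insert(key, Slot::Done(value));
        self.finished.notify_all();
        value
    }
}

/// Two threads ask for the same key; returns both results and how many times
/// the computation actually ran.
pub fn demo_two_callers() -> (u64, u64, usize) {
    let cache = QueryCache::new();
    let runs = AtomicUsize::new(0);
    let compute = |key: u32| {
        runs.fetch_add(1, Ordering::SeqCst);
        u64::from(key) * 2
    };
    let (a, b) = thread::scope(|s| {
        let first = s.spawn(|| cache.get_or_compute(7, compute));
        let second = s.spawn(|| cache.get_or_compute(7, compute));
        (first.join().unwrap(), second.join().unwrap())
    });
    (a, b, runs.load(Ordering::SeqCst))
}
```

In `demo_two_callers` the computation runs once regardless of which thread wins the race, because the loser either finds the finished result or waits for it. The missing piece relative to the text is the case where waiting threads depend on each other, which is what the deadlock handler thread exists to resolve.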
@@ -137,22 +146,7 @@ As of <!-- date-check--> May 2022, there are still a number of steps
to complete before rustdoc rendering can be made parallel. More details on
this issue can be found [here][parallel-rustdoc].

- ## Current Status
-
- As of <!-- date-check --> May 2022, work on explicitly parallelizing the
- compiler has stalled. There is a lot of design and correctness work that needs
- to be done.
-
- As of <!-- date-check --> May 2022, much of this effort is on hold due
- to lack of manpower. We have a working prototype with promising performance
- gains in many cases. However, there are two blockers:
-
- - It's not clear what invariants need to be upheld that might not hold in the
- face of concurrency. An auditing effort was underway, but seems to have
- stalled at some point.
-
- - There is a lot of lock contention, which actually degrades performance as the
- number of threads increases beyond 4.
+ ## Resources

Here are some resources that can be used to learn more (note that some of them
are a bit out of date):