| 1 | +- Feature Name: compiler_fence_intrinsics |
| 2 | +- Start Date: 2015-02-19 |
| 3 | +- RFC PR: (leave this empty) |
| 4 | +- Rust Issue: (leave this empty) |
| 5 | + |
| 6 | +# Summary |
| 7 | + |
| 8 | +Add intrinsics for single-threaded memory fences. |
| 9 | + |
| 10 | +# Motivation |
| 11 | + |
| 12 | +Rust currently supports memory barriers through a set of intrinsics, |
| 13 | +`atomic_fence` and its variants, which generate machine instructions and are |
| 14 | +suitable as cross-processor fences. However, there is currently no compiler |
| 15 | +support for single-threaded fences, which emit no machine instructions. |
| 16 | + |
| 17 | +Certain use cases require that the compiler not reorder loads or stores across a |
| 18 | +given barrier but do not require a corresponding hardware guarantee, such as |
| 19 | +when a thread interacts with a signal handler that runs on the same thread. |
| 20 | +Omitting the fence instruction in such cases avoids a relatively costly |
| 21 | +machine operation. |
| 22 | + |
| 23 | +The C++ equivalent of this feature is `std::atomic_signal_fence`. |
| 24 | + |
| 25 | +# Detailed design |
| 26 | + |
| 27 | +Add four language intrinsics for single-threaded fences: |
| 28 | + |
| 29 | + * `atomic_compilerfence` |
| 30 | + * `atomic_compilerfence_acq` |
| 31 | + * `atomic_compilerfence_rel` |
| 32 | + * `atomic_compilerfence_acqrel` |
| 33 | + |
| 34 | +These have the same semantics as the existing `atomic_fence` intrinsics but only |
| 35 | +constrain memory reordering by the compiler, not by hardware. |
| 36 | + |
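As a rough illustration of the intended semantics, the sketch below orders accesses between a thread and a handler that would run on that same thread. It is an assumption-laden sketch, not the proposed API: it uses `std::sync::atomic::compiler_fence`, the function that later Rust releases expose in the standard library, rather than the `atomic_compilerfence*` intrinsics named above. `DATA`, `READY`, `publish`, and `observe` are hypothetical names chosen for the example.

```rust
use std::sync::atomic::{compiler_fence, AtomicBool, AtomicUsize, Ordering};

// Hypothetical state shared between a thread and a signal handler that
// runs on that same thread.
static DATA: AtomicUsize = AtomicUsize::new(0);
static READY: AtomicBool = AtomicBool::new(false);

/// Thread side: write the value, then raise the flag.
fn publish(value: usize) {
    DATA.store(value, Ordering::Relaxed);
    // Release-ordered compiler fence: the compiler may not sink the store
    // to DATA below the store to READY. No fence instruction is emitted.
    compiler_fence(Ordering::Release);
    READY.store(true, Ordering::Relaxed);
}

/// Handler side: in real use this would be called from a signal handler
/// on the same thread; here it is an ordinary function.
fn observe() -> Option<usize> {
    if READY.load(Ordering::Relaxed) {
        // Acquire-ordered compiler fence: the compiler may not hoist the
        // load of DATA above the load of READY.
        compiler_fence(Ordering::Acquire);
        Some(DATA.load(Ordering::Relaxed))
    } else {
        None
    }
}

fn main() {
    assert_eq!(observe(), None);
    publish(42);
    assert_eq!(observe(), Some(42));
}
```

Because both sides run on one thread, only compiler reordering needs to be prevented. If `observe` could run on a different thread, a cross-processor fence would be needed instead.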
| 37 | +The existing fence intrinsics are exported in libstd with safe wrappers, but |
| 38 | +this design does not export safe wrappers for the new intrinsics. The existing |
| 39 | +fence functions will still perform correctly if used where a single-threaded |
| 40 | +fence is called for, but with a slight reduction in efficiency. Not exposing |
| 41 | +these new intrinsics through a safe wrapper reduces the potential for |
| 42 | +confusion about which fence is appropriate in a given situation, while still |
| 43 | +providing the capability for users to opt in to a single-threaded fence when |
| 44 | +appropriate. |
| 45 | + |
| 46 | +# Alternatives |
| 47 | + |
| 48 | + * Do nothing. The existing fence intrinsics support all use cases, but with a |
| 49 | + negative impact on performance in some situations where a compiler-only fence |
| 50 | + is appropriate. |
| 51 | + |
| 52 | + * Recommend inline assembly to get a similar effect, such as `asm!("" ::: |
| 53 | + "memory" : "volatile")`. However, LLVM provides an IR instruction specifically |
| 54 | + for this case (`fence singlethread`), and I believe using it is more |
| 55 | + appropriate, since its semantics are more rigorously defined and less |
| 56 | + likely to yield unexpected (but not necessarily wrong) behavior. |
| 57 | + |
| 58 | +# Unresolved questions |
| 59 | + |
| 60 | +These intrinsics may be better represented with a different name, such as |
| 61 | +`atomic_signal_fence` or `atomic_singlethread_fence`. The existing |
| 62 | +implementation of atomic intrinsics in the compiler precludes the use of |
| 63 | +underscores in their names. I also believe it is clearer to call this |
| 64 | +construct a "compiler fence" rather than a "signal fence", because not all use |
| 65 | +cases involve signal handlers; hence the current choice of name. |