@@ -255,7 +255,75 @@ but [you can read about those below](#promoted)).
## Representing constants
- *to be written*
+ When code has reached the MIR stage, constants can generally come in two forms:
+ *MIR constants* ([`mir::Constant`]) and *type system constants* ([`ty::Const`]).
+ MIR constants are used as operands: in `x + CONST`, `CONST` is a MIR constant;
+ similarly, in `x + 2`, `2` is a MIR constant. Type system constants are used in
+ the type system, in particular for array lengths but also for const generics.
+
+ Generally, both kinds of constants can be "unevaluated" or "already evaluated".
+ An unevaluated constant simply stores the `DefId` of what needs to be evaluated
+ to compute this result. An evaluated constant (a "value") has already been
+ computed; its representation differs between type system constants and MIR
+ constants: MIR constants evaluate to a `mir::ConstValue`; type system constants
+ evaluate to a `ty::ValTree`.
+
+ Type system constants have some more variants to support const generics: they
+ can refer to local const generic parameters, and they are subject to inference.
+ Furthermore, the `mir::Constant::Ty` variant lets us use an arbitrary type
+ system constant as a MIR constant; this happens whenever a const generic
+ parameter is used as an operand.
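To make the distinction concrete, here is a small user-level sketch of where each kind of constant shows up. The names `LIMIT`, `add_consts`, and `len_of` are made up for illustration; the comments describe how the text above classifies each use, not the exact MIR that the compiler emits.

```rust
// A named constant. Used as an operand below, so it appears as a MIR constant.
const LIMIT: u32 = 10;

// Both `LIMIT` and the literal `2` are operands here, i.e. MIR constants.
pub fn add_consts(x: u32) -> u32 {
    x + LIMIT + 2
}

// `N` appears in the array type `[u8; N]`, so it is a type system constant.
// Returning `N` uses a const generic parameter as an operand, which is the
// case where a type system constant is used as a MIR constant.
pub fn len_of<const N: usize>(_arr: [u8; N]) -> usize {
    N
}
```

Calling `len_of([0u8; 4])` instantiates `N` with the evaluated constant `4`, while `add_consts` never involves the type system for its constants.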
+
+ ### MIR constant values
+
+ In general, a MIR constant value (`mir::ConstValue`) was computed by evaluating
+ some constant the user wrote. This [const evaluation](../const-eval.md) produces
+ a very low-level representation of the result in terms of individual bytes. We
+ call this an "indirect" constant (`mir::ConstValue::Indirect`) since the value
+ is stored in-memory.
+
+ However, storing everything in-memory would be awfully inefficient. Hence there
+ are some other variants in `mir::ConstValue` that can represent certain simple
+ and common values more efficiently. In particular, everything that can be
+ directly written as a literal in Rust (integers, floats, chars, bools, but also
+ `"string literals"` and `b"byte string literals"`) has an optimized variant that
+ avoids the full overhead of the in-memory representation.
+
+ ### ValTrees
+
+ An evaluated type system constant is a "valtree". The `ty::ValTree` data structure
+ allows us to represent
+
+ * arrays,
+ * many structs,
+ * tuples,
+ * enums, and
+ * most primitives.
+
+ The most important rule for this representation is that every value must be
+ uniquely represented. In other words: a specific value must only be representable
+ in one specific way. For example, there is only one way to represent an array of
+ two integers as a `ValTree`:
+ `ValTree::Branch(&[ValTree::Leaf(first_int), ValTree::Leaf(second_int)])`.
+ Even though theoretically a `[u32; 2]` could be encoded in a `u64` and thus just be a
+ `ValTree::Leaf(bits_of_two_u32)`, that is not a legal construction of `ValTree`
+ (and is very complex to do, so it is unlikely anyone is tempted to do so).
+
+ These rules also mean that some values are not representable. There can be no `union`s in
+ type-level constants, as it is not clear how they should be represented, because their active
+ variant is unknown. Similarly, there is no way to represent raw pointers, as addresses are unknown
+ at compile time and thus we cannot make any assumptions about them. References, on the other hand,
+ *can* be represented, as equality for references is defined as equality on their value, so we
+ ignore their address and just look at the backing value. We must make sure that the pointer values
+ of the references are not observable at compile time. We thus encode `&42` exactly like `42`.
+ Any conversion from a valtree back to a MIR constant value must reintroduce an actual indirection.
+ At codegen time the addresses may be deduplicated between multiple uses or not, entirely depending
+ on arbitrary optimization choices.
+
+ As a consequence, all decoding of `ValTree` must happen by matching on the type first and making
+ decisions depending on that. The value itself gives no useful information without the type that
+ belongs to it.
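The rules above can be illustrated with a toy model. The sketch below is not the compiler's `ty::ValTree`: the enums `ValTree`, `Ty`, and `Value` and the functions `encode` and `decode` are simplified, hypothetical stand-ins (for instance, leaves hold a plain `u128` and branches own a `Vec`). It only aims to show that a reference is encoded like its pointee, and that decoding has to be driven by the type.

```rust
// Hypothetical, simplified stand-in for `ty::ValTree`.
#[derive(Debug, Clone, PartialEq)]
pub enum ValTree {
    Leaf(u128),
    Branch(Vec<ValTree>),
}

// Hypothetical miniature type language, just enough for the example.
#[derive(Debug, Clone, PartialEq)]
pub enum Ty {
    U32,
    Array(Box<Ty>, usize),
    Ref(Box<Ty>),
}

// Hypothetical typed values that we encode into and decode from valtrees.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    U32(u32),
    Array(Vec<Value>),
    Ref(Box<Value>),
}

// Encoding: each value has exactly one tree shape.
pub fn encode(value: &Value) -> ValTree {
    match value {
        Value::U32(n) => ValTree::Leaf(u128::from(*n)),
        Value::Array(elems) => ValTree::Branch(elems.iter().map(encode).collect()),
        // A reference is encoded exactly like its pointee; no address is stored.
        Value::Ref(inner) => encode(inner),
    }
}

// Decoding: match on the type first, then check that the tree fits it.
pub fn decode(ty: &Ty, tree: &ValTree) -> Option<Value> {
    match (ty, tree) {
        (Ty::U32, ValTree::Leaf(bits)) if *bits <= u128::from(u32::MAX) => {
            Some(Value::U32(*bits as u32))
        }
        (Ty::Array(elem_ty, len), ValTree::Branch(children)) if children.len() == *len => {
            let elems: Option<Vec<Value>> =
                children.iter().map(|child| decode(elem_ty, child)).collect();
            elems.map(Value::Array)
        }
        // Decoding at a reference type reintroduces the indirection.
        (Ty::Ref(inner_ty), inner_tree) => {
            decode(inner_ty, inner_tree).map(|v| Value::Ref(Box::new(v)))
        }
        // Any other combination of type and tree is ill-formed.
        _ => None,
    }
}
```

In this model the same tree `ValTree::Leaf(42)` decodes to `42` at type `U32` and to `&42` at type `Ref(U32)`, which is why the tree alone carries no useful information.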
<a name="promoted"></a>
@@ -283,3 +351,5 @@ See the const-eval WG's [docs on promotion](https://github.com/rust-lang/const-e
[`ProjectionElem::Deref`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.ProjectionElem.html#variant.Deref
[`Rvalue`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.Rvalue.html
[`Operand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.Operand.html
+ [`mir::Constant`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.Constant.html
+ [`ty::Const`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Const.html