Commit 02a966c

RalfJung authored and tshepang committed
explain the MIR const vs TY const situation
1 parent 6b347d2 commit 02a966c

2 files changed

+75
-46
lines changed

src/const-eval.md

+4-45
Original file line number | Diff line number | Diff line change
@@ -40,52 +40,11 @@ in which the constant is evaluated (e.g. the function within which the constant
4040
and a [`GlobalId`]. The `GlobalId` is made up of an `Instance` referring to a constant
4141
or static or of an `Instance` of a function and an index into the function's `Promoted` table.
4242

43-
Constant evaluation returns an [`EvalToValTreeResult`] for type system constants or
44-
[`EvalToConstValueResult`] with either the error, or a representation of the constant.
45-
46-
Constants for the type system are encoded in "valtree representation". The `ValTree` datastructure
47-
allows us to represent
48-
49-
* arrays,
50-
* many structs,
51-
* tuples,
52-
* enums and,
53-
* most primitives.
54-
55-
The basic rule for
56-
being permitted in the type system is that every value must be uniquely represented. In other
57-
words: a specific value must only be representable in one specific way. For example: there is only
58-
one way to represent an array of two integers as a `ValTree`:
59-
`ValTree::Branch(&[ValTree::Leaf(first_int), ValTree::Leaf(second_int)])`.
60-
Even though theoretically a `[u32; 2]` could be encoded in a `u64` and thus just be a
61-
`ValTree::Leaf(bits_of_two_u32)`, that is not a legal construction of `ValTree`
62-
(and is very complex to do, so it is unlikely anyone is tempted to do so).
63-
64-
These rules also mean that some values are not representable. There can be no `union`s in type
65-
level constants, as it is not clear how they should be represented, because their active variant
66-
is unknown. Similarly there is no way to represent raw pointers, as addresses are unknown at
67-
compile-time and thus we cannot make any assumptions about them. References on the other hand
68-
*can* be represented, as equality for references is defined as equality on their value, so we
69-
ignore their address and just look at the backing value. We must make sure that the pointer values
70-
of the references are not observable at compile time. We thus encode `&42` exactly like `42`.
71-
Any conversion from
72-
valtree back to codegen constants must reintroduce an actual indirection. At codegen time the
73-
addresses may be deduplicated between multiple uses or not, entirely depending on arbitrary
74-
optimization choices.
75-
76-
As a consequence, all decoding of `ValTree` must happen by matching on the type first and making
77-
decisions depending on that. The value itself gives no useful information without the type that
78-
belongs to it.
79-
80-
Other constants get represented as [`ConstValue::Scalar`] or
81-
[`ConstValue::Slice`] if possible. These values are only useful outside the
82-
compile-time interpreter. If you need the value of a constant during
83-
interpretation, you need to directly work with [`const_to_op`].
43+
Constant evaluation returns an [`EvalToValTreeResult`] for type system constants
44+
or [`EvalToConstValueResult`] with either the error, or a representation of the
45+
evaluated constant: a [valtree](mir/index.md#valtrees) or a [MIR constant
46+
value](mir/index.md#mir-constant-values), respectively.
8447

8548
[`GlobalId`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/struct.GlobalId.html
86-
[`ConstValue::Scalar`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/value/enum.ConstValue.html#variant.Scalar
87-
[`ConstValue::Slice`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/value/enum.ConstValue.html#variant.Slice
88-
[`ConstValue::ByRef`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/value/enum.ConstValue.html#variant.ByRef
8949
[`EvalToConstValueResult`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/error/type.EvalToConstValueResult.html
9050
[`EvalToValTreeResult`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/interpret/error/type.EvalToValTreeResult.html
91-
[`const_to_op`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_const_eval/interpret/struct.InterpCx.html#method.const_to_op

src/mir/index.md

+71-1
Original file line number | Diff line number | Diff line change
@@ -255,7 +255,75 @@ but [you can read about those below](#promoted)).
255255

256256
## Representing constants
257257

258-
*to be written*
258+
When code has reached the MIR stage, constants can generally come in two forms:
259+
*MIR constants* ([`mir::Constant`]) and *type system constants* ([`ty::Const`]).
260+
MIR constants are used as operands: in `x + CONST`, `CONST` is a MIR constant;
261+
similarly, in `x + 2`, `2` is a MIR constant. Type system constants are used in
262+
the type system, in particular for array lengths but also for const generics.
263+
264+
Generally, both kinds of constants can be "unevaluated" or "already evaluated".
265+
An unevaluated constant simply stores the `DefId` of what needs to be evaluated
266+
to compute this result. An evaluated constant (a "value") has already been
267+
computed; its representation differs between type system constants and MIR
268+
constants: MIR constants evaluate to a `mir::ConstValue`; type system constants
269+
evaluate to a `ty::ValTree`.
270+
271+
Type system constants have some more variants to support const generics: they
272+
can refer to local const generic parameters, and they are subject to inference.
273+
Furthermore, the `mir::Constant::Ty` variant lets us use an arbitrary type
274+
system constant as a MIR constant; this happens whenever a const generic
275+
parameter is used as an operand.
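
The relationship between the two kinds of constants can be sketched with a toy model. This is only an illustration under simplifying assumptions: the types `DefId`, `TyConst`, and `MirConst`, their payloads, and the helper `is_unevaluated` are hypothetical stand-ins written for this example and do not match the actual rustc definitions.

```rust
// Toy model only: simplified stand-ins, not the rustc definitions.

/// Hypothetical stand-in for rustc's `DefId`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DefId(pub u32);

/// Plays the role of a type system constant (`ty::Const`).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TyConst {
    /// Not yet computed: only remembers what has to be evaluated.
    Unevaluated(DefId),
    /// Already computed (placeholder payload standing in for a valtree).
    Value(u128),
    /// Refers to a local const generic parameter (hypothetical index).
    Param(u32),
    /// Still subject to inference (hypothetical variable id).
    Infer(u32),
}

/// Plays the role of a MIR constant (`mir::Constant`).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum MirConst {
    /// Not yet computed: only remembers what has to be evaluated.
    Unevaluated(DefId),
    /// Already computed (placeholder payload standing in for `mir::ConstValue`).
    Value(Vec<u8>),
    /// An arbitrary type system constant used as an operand.
    Ty(TyConst),
}

/// Returns true if the constant still has to be run through const evaluation.
pub fn is_unevaluated(c: &MirConst) -> bool {
    matches!(
        c,
        MirConst::Unevaluated(_) | MirConst::Ty(TyConst::Unevaluated(_))
    )
}

fn main() {
    let named = MirConst::Unevaluated(DefId(7)); // like `CONST` in `x + CONST`
    let literal = MirConst::Value(vec![2, 0, 0, 0]); // like `2` in `x + 2`
    let generic = MirConst::Ty(TyConst::Param(0)); // a const generic used as an operand
    for c in [&named, &literal, &generic] {
        println!("{:?}: unevaluated = {}", c, is_unevaluated(c));
    }
}
```

Note that the const generic parameter is neither unevaluated nor a value in this model, which mirrors why type system constants need the extra variants.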
276+
277+
### MIR constant values
278+
279+
In general, a MIR constant value (`mir::ConstValue`) was computed by evaluating
280+
some constant the user wrote. This [const evaluation](../const-eval.md) produces
281+
a very low-level representation of the result in terms of individual bytes. We
282+
call this an "indirect" constant (`mir::ConstValue::Indirect`) since the value
283+
is stored in-memory.
284+
285+
However, storing everything in-memory would be awfully inefficient. Hence there
286+
are some other variants in `mir::ConstValue` that can represent certain simple
287+
and common values more efficiently. In particular, everything that can be
288+
directly written as a literal in Rust (integers, floats, chars, bools, but also
289+
`"string literals"` and `b"byte string literals"`) has an optimized variant that
290+
avoids the full overhead of the in-memory representation.
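
As a rough sketch of the trade-off described above, the toy enum below has one general in-memory variant and two compact variants for literal-like values. Everything here is an assumption made for illustration: the variant payloads, the helper functions, and the little-endian byte layout are hypothetical and are not the real `mir::ConstValue` API.

```rust
// Toy model only: not the real `mir::ConstValue`.

/// Simplified stand-in for an evaluated MIR constant.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ConstValue {
    /// Compact form for a small primitive: raw bits plus size in bytes.
    Scalar { bits: u128, size: u8 },
    /// Compact form for string and byte string literals.
    Slice(Vec<u8>),
    /// General form: the raw bytes of the value as stored in memory.
    Indirect(Vec<u8>),
}

/// An integer literal fits the compact scalar form.
pub fn const_for_u32(x: u32) -> ConstValue {
    ConstValue::Scalar { bits: x as u128, size: 4 }
}

/// A string literal fits the compact slice form.
pub fn const_for_str(s: &str) -> ConstValue {
    ConstValue::Slice(s.as_bytes().to_vec())
}

/// A `[u32; 2]` has no compact form in this model, so it falls back to bytes.
/// Little-endian layout is an assumption of this sketch.
pub fn const_for_u32_pair(pair: [u32; 2]) -> ConstValue {
    let mut bytes = Vec::with_capacity(8);
    for x in pair {
        bytes.extend_from_slice(&x.to_le_bytes());
    }
    ConstValue::Indirect(bytes)
}

fn main() {
    println!("{:?}", const_for_u32(2));
    println!("{:?}", const_for_str("hi"));
    println!("{:?}", const_for_u32_pair([1, 2]));
}
```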
291+
292+
### ValTrees
293+
294+
An evaluated type system constant is a "valtree". The `ty::ValTree` data structure
295+
allows us to represent
296+
297+
* arrays,
298+
* many structs,
299+
* tuples,
300+
* enums, and
301+
* most primitives.
302+
303+
The most important rule for
304+
this representation is that every value must be uniquely represented. In other
305+
words: a specific value must only be representable in one specific way. For example: there is only
306+
one way to represent an array of two integers as a `ValTree`:
307+
`ValTree::Branch(&[ValTree::Leaf(first_int), ValTree::Leaf(second_int)])`.
308+
Even though theoretically a `[u32; 2]` could be encoded in a `u64` and thus just be a
309+
`ValTree::Leaf(bits_of_two_u32)`, that is not a legal construction of `ValTree`
310+
(and is very complex to do, so it is unlikely anyone is tempted to do so).
311+
312+
These rules also mean that some values are not representable. There can be no `union`s in type
313+
level constants, as it is not clear how they should be represented, because their active variant
314+
is unknown. Similarly there is no way to represent raw pointers, as addresses are unknown at
315+
compile-time and thus we cannot make any assumptions about them. References on the other hand
316+
*can* be represented, as equality for references is defined as equality on their value, so we
317+
ignore their address and just look at the backing value. We must make sure that the pointer values
318+
of the references are not observable at compile time. We thus encode `&42` exactly like `42`.
319+
Any conversion from
320+
valtree back to a MIR constant value must reintroduce an actual indirection. At codegen time the
321+
addresses may be deduplicated between multiple uses or not, entirely depending on arbitrary
322+
optimization choices.
323+
324+
As a consequence, all decoding of `ValTree` must happen by matching on the type first and making
325+
decisions depending on that. The value itself gives no useful information without the type that
326+
belongs to it.
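
The need for type-directed decoding can be illustrated with a small self-contained model. This is a sketch under stated assumptions, not the compiler's implementation: the `ValTree` here uses an owned `Vec` and a plain `u128` leaf, and the `Ty` enum and the `describe` function are hypothetical helpers invented for this example.

```rust
// Toy model only: not the real `ty::ValTree` or rustc's type representation.

/// Simplified valtree: leaves hold raw bits, branches hold children.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ValTree {
    Leaf(u128),
    Branch(Vec<ValTree>),
}

/// A tiny hypothetical type language, just enough to drive decoding.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Ty {
    U32,
    Ref(Box<Ty>),
    Array(Box<Ty>, usize),
}

/// Decodes a valtree by matching on the type first.
/// Returns `None` if the tree does not have the shape the type requires.
pub fn describe(ty: &Ty, tree: &ValTree) -> Option<String> {
    match (ty, tree) {
        (Ty::U32, ValTree::Leaf(bits)) => Some(format!("{}", bits)),
        // A reference is encoded exactly like its pointee, so only the
        // type tells us that an indirection has to be reintroduced.
        (Ty::Ref(inner), _) => Some(format!("&{}", describe(inner, tree)?)),
        (Ty::Array(elem, len), ValTree::Branch(items)) if items.len() == *len => {
            let parts: Option<Vec<String>> =
                items.iter().map(|item| describe(elem, item)).collect();
            Some(format!("[{}]", parts?.join(", ")))
        }
        _ => None,
    }
}

fn main() {
    let forty_two = ValTree::Leaf(42);
    // The same tree decodes differently depending on the type.
    println!("{:?}", describe(&Ty::U32, &forty_two));
    println!("{:?}", describe(&Ty::Ref(Box::new(Ty::U32)), &forty_two));

    let pair = ValTree::Branch(vec![ValTree::Leaf(1), ValTree::Leaf(2)]);
    println!("{:?}", describe(&Ty::Array(Box::new(Ty::U32), 2), &pair));
}
```

In this model `42` and `&42` share one tree, `ValTree::Leaf(42)`, and a single leaf is rejected when the type asks for an array, since the only accepted encoding of a two-element array is a branch with two leaves.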
259327

260328
<a name="promoted"></a>
261329

@@ -283,3 +351,5 @@ See the const-eval WG's [docs on promotion](https://github.com/rust-lang/const-e
283351
[`ProjectionElem::Deref`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.ProjectionElem.html#variant.Deref
284352
[`Rvalue`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.Rvalue.html
285353
[`Operand`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/enum.Operand.html
354+
[`mir::Constant`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/struct.Constant.html
355+
[`ty::Const`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Const.html
