Skip to content

Commit 23569a8

Browse files
authored
Merge pull request #195 from RalfJung/padding
define padding with hypothetical Pad type
2 parents 308c474 + 7ab59d4 commit 23569a8

File tree

2 files changed

+27
-2
lines changed

2 files changed

+27
-2
lines changed

Diff for: reference/src/glossary.md

+20-1
Original file line numberDiff line numberDiff line change
@@ -177,6 +177,24 @@ zero-sized type", to refer to zero-sized types with an alignment requirement of
177177
For example, `()` is a "1-ZST" but `[u16; 0]` is not because it has an alignment
178178
requirement of 2.
179179

180+
#### Padding
181+
[padding]: #padding
182+
183+
*Padding* (of a type `T`) refers to the space that the compiler leaves between fields of a struct or enum variant to satisfy alignment requirements, and before/after variants of a union or enum to make all variants equally sized.
184+
185+
Padding can be though of as `[Pad; N]` for some hypothetical type `Pad` (of size 1) with the following properties:
186+
* `Pad` is valid for any byte, i.e., it has the same validity invariant as `MaybeUninit<u8>`.
187+
* Copying `Pad` ignores the source byte, and writes *any* value to the target byte. Or, equivalently (in terms of Abstract Machine behavior), copying `Pad` marks the target byte as uninitialized.
188+
189+
We can also define padding in terms of the [representation relation]:
190+
A byte at index `i` is a padding byte for type `T` if,
191+
for all values `v` and lists of bytes `b` such that `v` and `b` are related at `T` (let's write this `Vrel_T(v, b)`),
192+
changing `b` at index `i` to any other byte yields a `b'` such `v` and `b'` are related (`Vrel_T(v, b')`).
193+
In other words, the byte at index `i` is entirely ignored by `Vrel_T` (the value relation for `T`), and two lists of bytes that only differ in padding bytes relate to the same value(s), if any.
194+
195+
This definition works fine for product types (structs, tuples, arrays, ...).
196+
The desired notion of "padding byte" for enums and unions is still unclear.
197+
180198
#### Place
181199

182200
A *place* (called "lvalue" in C and "glvalue" in C++) is the result of computing a [*place expression*][place-value-expr].
@@ -200,6 +218,7 @@ Values are ephemeral; they arise during the computation of an instruction but ar
200218
(This is comparable to how run-time data in a program is ephemeral and is only ever persisted in serialized form.)
201219

202220
#### Representation (relation)
221+
[representation relation]: #representation-relation
203222

204223
A *representation* of a [value](#value) is a list of bytes that is used to store or "represent" that value in memory.
205224

@@ -208,7 +227,7 @@ The term "relation" here is used in the mathematical sense: the representation r
208227

209228
The relation should be functional for a fixed list of bytes (i.e., every list of bytes has at most one associated representation).
210229
It is partial in both directions: not all values have a representation (e.g. the mathematical integer `300` has no representation at type `u8`), and not all lists of bytes correspond to a value of a specific type (e.g. lists of the wrong size correspond to no value, and the list consisting of the single byte `0x10` corresponds to no value of type `bool`).
211-
For a fixed value, there can be many representations (e.g., when considering type `#[repr(C)] Pair(u8, u16)`, the second byte is a padding byte so changing it does not affect the value represented by a list of bytes).
230+
For a fixed value, there can be many representations (e.g., when considering type `#[repr(C)] Pair(u8, u16)`, the second byte is a [padding byte][padding] so changing it does not affect the value represented by a list of bytes).
212231

213232
See the [value domain][value-domain] for an example how values and representation relations can be made more precise.
214233

Diff for: wip/value-domain.md

+7-1
Original file line numberDiff line numberDiff line change
@@ -64,6 +64,12 @@ But this means there are likely many false negatives, and the final Rust spec wi
6464

6565
The value relation for the `()` type relates the empty tuple `Tuple([])` (assuming we can use array notation to "match" on `Vec`) with the empty byte list `[]`, and that's it.
6666

67+
### `Pad`
68+
69+
The [hypothetical `Pad` type][pad] relates the empty tuple `Tuple([])` with any byte list of length 1.
70+
71+
[pad]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/reference/src/glossary.md#padding
72+
6773
### `!`
6874

6975
The value relation for the `!` type is empty: nothing is related to anything at this type.
@@ -130,7 +136,7 @@ We can implement `TypedMemory` for any `Memory` as follows:
130136
In particular, this defines the semantics of a "typed copy": when an assignment is executed in Rust (at some type `T`), or when an argument of type `T` is passed to a function or a return value of type `T` returned from a function, this is called a "typed copy".
131137
Its semantics is basically to do a `typed_read` at `T` followed by a `typed_write` at `T`.
132138
In particular, this means doing a "typed copy" at `T` of some memory that has no valid representation at `T` is undefined behavior!
133-
This also means that for types that have padding, the "typed copy" does not preserve the padding bytes.
139+
This also means that for `Pad` and more generally for types that have padding, the "typed copy" does not preserve the padding bytes.
134140

135141
## Relation to validity invariant
136142

0 commit comments

Comments
 (0)