Commit d0c9729

add basic memory interface
1 parent d86d39c commit d0c9729

1 file changed

+110
-0
lines changed

wip/memory-interface.md

@@ -0,0 +1,110 @@
# Rust Memory Interface

**Note:** This document is neither normative nor endorsed by the UCG WG. Its purpose is to be the basis for discussion and to set down some key terminology.

The purpose of this document is to describe the interface between a Rust program and memory.
This interface is a key part of the Rust Abstract Machine: it lets us separate concerns by splitting the Machine (i.e., its specification) into two pieces, connected by this well-defined interface:
* The *expression/statement semantics* of Rust boils down to explaining which "memory events" (calls to the memory interface) happen in which order.
* The Rust *memory model* explains which interactions with the memory are legal (the others are UB), and which values can be returned by reads.

The interface is also opinionated in several ways: it is not intended to support *any imaginable* memory model, but rather to start narrowing down the design space of what we consider a "reasonable" memory model for Rust.
For example, it explicitly acknowledges that pointers are not just integers and that uninitialized memory is special (both are true for C and C++ as well, but you have to read the standard very carefully, and consult non-normative defect report responses, to see this).
Another key property of the interface presented below is that it is *untyped*.
This encodes the fact that in Rust, *operations are typed, but memory is not*, which is a key difference from C and C++ with their type-based strict aliasing rules.

## Pointers

One key question a memory model has to answer is *what is a pointer*.
It might seem like the answer is just "an integer of appropriate size", but [that is not the case][pointers-complicated].
So we will leave this question open, and treat `Pointer` as an "associated type" of the memory interface.

## Bytes

The unit of communication between the memory model and the rest of the program is a *byte*.
Again, the question of "what is a byte" is not as trivial as it might seem; beyond `u8` values we have to represent `Pointer`s and [uninitialized memory][uninit].
We define the `Byte` type (in terms of an arbitrary `Pointer` type) as follows:

```rust
enum Byte<Pointer> {
    /// The "normal" case: a (frozen, initialized) integer in `0..256`.
    Raw(u8),
    /// An uninitialized byte.
    Uninit,
    /// One byte of a pointer.
    Pointer {
        /// The pointer of which this is a byte.
        ptr: Pointer,
        /// Which byte of the pointer this is.
        /// `idx` will always be in `0..size_of::<usize>()`.
        idx: u8,
    }
}
```
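
To make the role of the `Pointer` variant more concrete, here is a minimal sketch of how typed operations could encode a pointer into bytes and decode it again. The helper names (`encode_ptr`, `decode_ptr`, `decode_ints`), the `PTR_SIZE` constant, and the decoding policy (a pointer is recovered only from the complete, in-order bytes of a single pointer) are assumptions made for illustration; this document does not prescribe them.

```rust
/// Copy of the `Byte` type from above, with derives added for this example.
#[derive(Clone, Debug, PartialEq)]
enum Byte<Pointer> {
    Raw(u8),
    Uninit,
    Pointer { ptr: Pointer, idx: u8 },
}

/// Assumed pointer size in bytes (hypothetical stand-in for `size_of::<usize>()`).
const PTR_SIZE: u8 = 8;

/// Hypothetical helper: encode a pointer as `PTR_SIZE` pointer bytes.
fn encode_ptr<P: Clone>(ptr: &P) -> Vec<Byte<P>> {
    (0..PTR_SIZE)
        .map(|idx| Byte::Pointer { ptr: ptr.clone(), idx })
        .collect()
}

/// Hypothetical helper: decode a pointer. Under the policy assumed here, this
/// succeeds only if `bytes` are all bytes of one pointer, in order.
fn decode_ptr<P: Clone + PartialEq>(bytes: &[Byte<P>]) -> Option<P> {
    if bytes.len() != PTR_SIZE as usize {
        return None;
    }
    let first = match &bytes[0] {
        Byte::Pointer { ptr, idx: 0 } => ptr.clone(),
        _ => return None,
    };
    for (i, byte) in bytes.iter().enumerate() {
        match byte {
            Byte::Pointer { ptr, idx } if *ptr == first && *idx as usize == i => {}
            _ => return None,
        }
    }
    Some(first)
}

/// Hypothetical helper: decode plain integers. Uninitialized bytes and pointer
/// bytes are rejected (`None`), since neither is "just a `u8`".
fn decode_ints<P>(bytes: &[Byte<P>]) -> Option<Vec<u8>> {
    bytes
        .iter()
        .map(|b| match b {
            Byte::Raw(v) => Some(*v),
            _ => None,
        })
        .collect()
}
```

In this sketch, shuffling the bytes of an encoded pointer or overwriting one of them with `Byte::Uninit` makes `decode_ptr` fail. Whether such cases are UB or handled differently is a decision for the expression/statement semantics, not something this sketch settles.
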

## Memory interface

The Rust memory interface is described by the following (not-yet-complete) trait definition:

```rust
/// *Note*: All memory operations can be non-deterministic, which means that
/// executing the same operation on the same memory can have different results.
/// We also let all operations potentially mutate memory. For example, reads
/// actually do change the current state when considering concurrency or
/// Stacked Borrows.
trait Memory {
    /// The type of pointer values.
    type Pointer;

    /// The type of memory errors (i.e., ways in which the program can cause UB
    /// by interacting with memory).
    type Error;

    /// Create a new allocation.
    fn allocate(&mut self, size: u64, align: u64) -> Result<Self::Pointer, Self::Error>;

    /// Remove an allocation.
    fn deallocate(&mut self, ptr: Self::Pointer, size: u64, align: u64) -> Result<(), Self::Error>;

    /// Write some bytes to memory.
    fn write(&mut self, ptr: Self::Pointer, bytes: Vec<Byte<Self::Pointer>>) -> Result<(), Self::Error>;

    /// Read some bytes from memory.
    fn read(&mut self, ptr: Self::Pointer, len: u64) -> Result<Vec<Byte<Self::Pointer>>, Self::Error>;

    /// Offset the given pointer.
    fn offset(&mut self, ptr: Self::Pointer, offset: u64, mode: OffsetMode) -> Result<Self::Pointer, Self::Error>;

    /// Cast the given pointer to an integer.
    fn ptr_to_int(&mut self, ptr: Self::Pointer) -> Result<u64, Self::Error>;

    /// Cast the given integer to a pointer.
    fn int_to_ptr(&mut self, int: u64) -> Result<Self::Pointer, Self::Error>;
}

/// The rules applying to this pointer offset operation.
enum OffsetMode {
    /// Wrapping offset; never UB.
    Wrapping,
    /// Non-wrapping offset; UB if it wraps.
    NonWrapping,
    /// In-bounds offset; UB if it wraps or if old and new pointer are not both
    /// in-bounds of the same allocation (details are specified by the memory
    /// model).
    Inbounds,
}
```
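
As an illustration of what implementing this trait could look like, here is a deliberately simplistic toy memory model. Everything prefixed with `Toy` is hypothetical: pointers are an allocation index plus an offset, offsets are relative to the allocation, alignment is ignored, and pointer-integer casts are simply rejected. It is a sketch for discussion, not a proposal for the actual Rust memory model.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Byte<Pointer> { Raw(u8), Uninit, Pointer { ptr: Pointer, idx: u8 } }

enum OffsetMode { Wrapping, NonWrapping, Inbounds }

trait Memory {
    type Pointer;
    type Error;
    fn allocate(&mut self, size: u64, align: u64) -> Result<Self::Pointer, Self::Error>;
    fn deallocate(&mut self, ptr: Self::Pointer, size: u64, align: u64) -> Result<(), Self::Error>;
    fn write(&mut self, ptr: Self::Pointer, bytes: Vec<Byte<Self::Pointer>>) -> Result<(), Self::Error>;
    fn read(&mut self, ptr: Self::Pointer, len: u64) -> Result<Vec<Byte<Self::Pointer>>, Self::Error>;
    fn offset(&mut self, ptr: Self::Pointer, offset: u64, mode: OffsetMode) -> Result<Self::Pointer, Self::Error>;
    fn ptr_to_int(&mut self, ptr: Self::Pointer) -> Result<u64, Self::Error>;
    fn int_to_ptr(&mut self, int: u64) -> Result<Self::Pointer, Self::Error>;
}

/// Hypothetical pointer representation: allocation index plus offset into it.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ToyPtr { alloc: usize, offset: u64 }

/// Hypothetical error type; `Unsupported` is a placeholder, not a kind of UB.
#[derive(Debug, PartialEq)]
enum ToyError { Dangling, OutOfBounds, BadDeallocation, Overflow, Unsupported }

/// Each allocation is a byte vector; `None` marks a deallocated slot.
#[derive(Default)]
struct ToyMemory { allocs: Vec<Option<Vec<Byte<ToyPtr>>>> }

impl ToyMemory {
    /// Bounds-check `len` bytes starting at `ptr` and return that range.
    fn access(&mut self, ptr: ToyPtr, len: u64) -> Result<&mut [Byte<ToyPtr>], ToyError> {
        let data = self.allocs.get_mut(ptr.alloc)
            .and_then(|a| a.as_mut())
            .ok_or(ToyError::Dangling)?;
        let end = ptr.offset.checked_add(len).ok_or(ToyError::OutOfBounds)?;
        if end > data.len() as u64 {
            return Err(ToyError::OutOfBounds);
        }
        Ok(&mut data[ptr.offset as usize..end as usize])
    }
}

impl Memory for ToyMemory {
    type Pointer = ToyPtr;
    type Error = ToyError;

    fn allocate(&mut self, size: u64, _align: u64) -> Result<ToyPtr, ToyError> {
        // Fresh memory starts out uninitialized.
        self.allocs.push(Some(vec![Byte::Uninit; size as usize]));
        Ok(ToyPtr { alloc: self.allocs.len() - 1, offset: 0 })
    }

    fn deallocate(&mut self, ptr: ToyPtr, size: u64, _align: u64) -> Result<(), ToyError> {
        let slot = self.allocs.get_mut(ptr.alloc).ok_or(ToyError::Dangling)?;
        let len = slot.as_ref().ok_or(ToyError::Dangling)?.len() as u64;
        // Only the base pointer with the original size may deallocate.
        if ptr.offset != 0 || len != size {
            return Err(ToyError::BadDeallocation);
        }
        *slot = None;
        Ok(())
    }

    fn write(&mut self, ptr: ToyPtr, bytes: Vec<Byte<ToyPtr>>) -> Result<(), ToyError> {
        self.access(ptr, bytes.len() as u64)?.clone_from_slice(&bytes);
        Ok(())
    }

    fn read(&mut self, ptr: ToyPtr, len: u64) -> Result<Vec<Byte<ToyPtr>>, ToyError> {
        Ok(self.access(ptr, len)?.to_vec())
    }

    fn offset(&mut self, ptr: ToyPtr, offset: u64, mode: OffsetMode) -> Result<ToyPtr, ToyError> {
        let new = match mode {
            OffsetMode::Wrapping => ptr.offset.wrapping_add(offset),
            OffsetMode::NonWrapping | OffsetMode::Inbounds =>
                ptr.offset.checked_add(offset).ok_or(ToyError::Overflow)?,
        };
        let moved = ToyPtr { offset: new, ..ptr };
        if let OffsetMode::Inbounds = mode {
            // Old and new pointer must both be in bounds (one-past-the-end is allowed).
            self.access(ptr, 0)?;
            self.access(moved, 0)?;
        }
        Ok(moved)
    }

    // This toy model has no integer addresses, so casts are rejected.
    fn ptr_to_int(&mut self, _ptr: ToyPtr) -> Result<u64, ToyError> { Err(ToyError::Unsupported) }
    fn int_to_ptr(&mut self, _int: u64) -> Result<ToyPtr, ToyError> { Err(ToyError::Unsupported) }
}
```

With this sketch, a fresh allocation reads back as `Byte::Uninit`, an out-of-bounds read or `Inbounds` offset yields `ToyError::OutOfBounds`, and any access after `deallocate` yields `ToyError::Dangling`. A realistic model would also have to make choices about alignment, integer addresses, and non-determinism that the toy model sidesteps.
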

This is a very basic memory interface that is incomplete in at least the following ways:

* To implement rules like "dereferencing a null, unaligned, or dangling raw pointer is UB" (even if no memory access happens), there needs to be a way to do an "alignment, bounds and null-check".
* There needs to be some way to do alignment checks: either using the above operation, or by adding `align` parameters to `read` and `write`.
* To represent concurrency, many operations need to take a "thread ID", and `read` and `write` need to take an [`Ordering`].
* To represent [Stacked Borrows], there needs to be a "retag" operation, and that one will in fact be "lightly typed" (it cares about `UnsafeCell`).
* Maybe we want operations that can compare pointers without casting them to integers.
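
To give the first item some shape, here is a rough sketch of what an "alignment, bounds and null-check" might do. The names `Ptr`, `Allocation`, `CheckError` and `check_ptr`, the representation of pointers as an address plus an optional allocation index, and the order in which the checks run are all assumptions made for this example; the document leaves the design of such an operation open.

```rust
/// Hypothetical pointer: an integer address plus the allocation it belongs to
/// (`None` if it does not point into any allocation).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Ptr { addr: u64, alloc: Option<usize> }

/// Hypothetical allocation record.
struct Allocation { base: u64, size: u64, live: bool }

/// Hypothetical error type: the ways in which the check can fail.
#[derive(Debug, PartialEq)]
enum CheckError { Null, Misaligned, Dangling, OutOfBounds }

/// Check that `ptr` is non-null, aligned to `align`, and that the `size` bytes
/// starting at it lie within a live allocation. No memory is accessed.
/// Assumes `align` is nonzero and that `base + size` does not overflow.
fn check_ptr(allocs: &[Allocation], ptr: Ptr, size: u64, align: u64) -> Result<(), CheckError> {
    if ptr.addr == 0 {
        return Err(CheckError::Null);
    }
    if ptr.addr % align != 0 {
        return Err(CheckError::Misaligned);
    }
    let alloc = ptr.alloc
        .and_then(|i| allocs.get(i))
        .filter(|a| a.live)
        .ok_or(CheckError::Dangling)?;
    let end = ptr.addr.checked_add(size).ok_or(CheckError::OutOfBounds)?;
    if ptr.addr < alloc.base || end > alloc.base + alloc.size {
        return Err(CheckError::OutOfBounds);
    }
    Ok(())
}
```

In an actual interface this would presumably be a method on `Memory` that returns `Self::Error`. It is written as a free function over a slice of allocations here only to keep the example short.
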

But I think it can still be useful for providing some basic terminology and a basis for further discussion.

[pointers-complicated]: https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html
[uninit]: https://www.ralfj.de/blog/2019/07/14/uninit.html
[`Ordering`]: https://doc.rust-lang.org/nightly/core/sync/atomic/enum.Ordering.html
[Stacked Borrows]: stacked-borrows.md
