Commit 648ec02

ptr-to-byte
1 parent 499fa9e commit 648ec02

2 files changed

+33
-10
lines changed

wip/memory-interface.md

+28-6
@@ -11,6 +11,7 @@ The interface is also opinionated in several ways; this is not intended to be ab
 For example, it explicitly acknowledges that pointers are not just integers and that uninitialized memory is special (both are true for C and C++ as well but you have to read the standard very carefully, and consult non-normative defect report responses, to see this).
 Another key property of the interface presented below is that it is *untyped*.
 This encodes the fact that in Rust, *operations are typed, but memory is not*---a key difference to C and C++ with their type-based strict aliasing rules.
+At the same time, the memory model provides a *side-effect free* way to turn pointers into "raw bytes", which is *not* [the direction C is moving towards](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2364.pdf), so we might have to revisit this choice later.

 ## Pointers

@@ -36,7 +37,7 @@ enum Byte<Pointer> {
         /// The pointer of which this is a byte.
         ptr: Pointer,
         /// Which byte of the pointer this is.
-        /// `idx` will always be in `0..size_of::<usize>()`.
+        /// `idx` will always be in `0..PTR_SIZE`.
         idx: u8,
     }
 }
@@ -48,21 +49,39 @@ On a 32-bit system, the sequence of 4 bytes representing `ptr: Pointer` is:
 [PtrFragment { ptr, idx: 0 }, PtrFragment { ptr, idx: 1 }, PtrFragment { ptr, idx: 2 }, PtrFragment { ptr, idx: 3 }]
 ```

+Based on the `PtrToInt` trait (see below), we can turn every initialized `Byte` into an integer in `0..256`:
+
+```rust
+impl<Pointer: PtrToInt> Byte<Pointer> {
+    fn as_int(self) -> Option<u8> {
+        match self {
+            Byte::Raw(int) => Some(int),
+            Byte::Uninit => None,
+            Byte::PtrFragment { ptr, idx } =>
+                Some(ptr.get_byte(idx)),
+        }
+    }
+}
+```
+
 ## Memory interface

 The Rust memory interface is described by the following (not-yet-complete) trait definition:

 ```rust
 /// *Note*: All memory operations can be non-deterministic, which means that
 /// executing the same operation on the same memory can have different results.
-/// We also let all operations potentially mutated memory. For example, reads
+/// We also let all operations potentially mutate memory. For example, reads
 /// actually do change the current state when considering concurrency or
 /// Stacked Borrows.
 /// And finally, all operations are fallible (they return `Result`); if they
 /// fail, that means the program caused UB.
 trait Memory {
     /// The type of pointer values.
-    type Pointer;
+    type Pointer: Copy + PtrToInt;
+
+    /// The size of pointer values.
+    const PTR_SIZE: u64;

     /// Create a new allocation.
     fn allocate(&mut self, size: u64, align: u64) -> Result<Self::Pointer, Error>;
@@ -79,13 +98,16 @@ trait Memory {
     /// Offset the given pointer.
     fn offset(&mut self, ptr: Self::Pointer, offset: u64, mode: OffsetMode) -> Result<Self::Pointer, Error>;

-    /// Cast the given pointer to an integer.
-    fn ptr_to_int(&mut self, ptr: Self::Pointer) -> Result<u64, Error>;
-
     /// Cast the given integer to a pointer.
     fn int_to_ptr(&mut self, int: u64) -> Result<Self::Pointer, Error>;
 }

+/// The `Pointer` type must know how to extract its bytes, *without any access to the `Memory`*.
+trait PtrToInt {
+    /// Get the `idx`-th byte of the pointer. `idx` must be in `0..PTR_SIZE`.
+    fn get_byte(self, idx: u8) -> u8;
+}
+
 /// The rules applying to this pointer offset operation.
 enum OffsetMode {
     /// Wrapping offset; never UB.
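
As a reading aid for the `as_int` and `PtrToInt` additions in this file, the following is a minimal, self-contained sketch of how the pieces could fit together. `ToyPtr`, its `addr` field, the little-endian byte order, and the `example_bytes` helper are assumptions made purely for illustration; the document itself leaves the concrete `Pointer` type abstract. The variant names of `Byte` are taken from the `as_int` match arms.

```rust
/// Mirrors the `PtrToInt` trait from the diff: a pointer yields its bytes
/// without any access to the memory.
trait PtrToInt {
    fn get_byte(self, idx: u8) -> u8;
}

/// Hypothetical pointer type that is just an address.
/// (A realistic model would carry more than an address.)
#[derive(Clone, Copy)]
struct ToyPtr {
    addr: u64,
}

impl PtrToInt for ToyPtr {
    fn get_byte(self, idx: u8) -> u8 {
        // Assumption: 8-byte pointers in little-endian order.
        // Panics if `idx` is outside `0..8`.
        self.addr.to_le_bytes()[idx as usize]
    }
}

#[derive(Clone, Copy)]
enum Byte<Pointer> {
    Uninit,
    Raw(u8),
    PtrFragment { ptr: Pointer, idx: u8 },
}

impl<Pointer: PtrToInt> Byte<Pointer> {
    /// `None` for uninitialized bytes, otherwise the byte's integer value.
    fn as_int(self) -> Option<u8> {
        match self {
            Byte::Raw(int) => Some(int),
            Byte::Uninit => None,
            // `get_byte` returns a plain `u8`, so it is wrapped in `Some`.
            Byte::PtrFragment { ptr, idx } => Some(ptr.get_byte(idx)),
        }
    }
}

/// Hypothetical helper: convert a few sample bytes to integers.
fn example_bytes() -> Vec<Option<u8>> {
    let ptr = ToyPtr { addr: 0x1234 };
    vec![
        Byte::PtrFragment { ptr, idx: 0 }.as_int(),
        Byte::PtrFragment { ptr, idx: 1 }.as_int(),
        Byte::<ToyPtr>::Raw(7).as_int(),
        Byte::<ToyPtr>::Uninit.as_int(),
    ]
}
```

Note that `as_int` never consults a `Memory`: this is the "side-effect free" conversion the commit introduces in place of the removed `Memory::ptr_to_int`.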

wip/value-domain.md

+5-4
@@ -4,7 +4,8 @@

 The purpose of this document is to describe what the set of *all possible values* is in Rust.
 This is an important definition: one key element of a Rust specification will be to define the [representation relation][representation] of every type.
-This relation relates values with lists of bytes, so before we can even start specifying the relation we have to specify the involved domains.
+This relation relates values with lists of bytes: it says, for a given value and list of bytes, if that value is represented by that list.
+However, before we can even start specifying the relation, we have to specify the involved domains.
 `Byte` is defined as part of [the memory interface][memory-interface]; this document is about defining `Value`.

 [representation]: https://github.com/rust-lang/unsafe-code-guidelines/blob/master/reference/src/glossary.md#representation
@@ -45,7 +46,7 @@ We show some examples for how one might want to use this `Value` domain to defin

 ### `bool`

-The value relation for `bool` relates `Bool(true)` with `[Raw(0)]` and `Bool(false)` with [`Raw(1)`], and that's it.
+The value relation for `bool` relates `Bool(b)` with `[bb]` if and only if `bb.as_int() == Some(if b { 1 } else { 0 })`.

 ### `()`

@@ -72,12 +73,12 @@ It also shows that the actual content of the padding bytes is entirely irrelevan

 Reference types are tricky.
 But a possible value relation for sized `T` is:
-A value `Ptr(ptr)` is related to `[PtrFragment { ptr, idx: 0 }, ..., PtrFragment { ptr, idx: N-1 }]` where `N == size_of::<usize>()` if `ptr` is non-NULL and aligned to `align_of::<T>()`.
+A value `Ptr(ptr)` is related to `[PtrFragment { ptr, idx: 0 }, ..., PtrFragment { ptr, idx: PTR_SIZE-1 }]` if `ptr` is non-NULL and appropriately aligned (defining alignment is left open for now).

 ### `u8`

 For the value representation of integer types, there are two different reasonable choices.
-Certainly, a value `Int(i)` where `i` in `0..256` is related to `[Raw(i as u8)]`.
+Certainly, a value `Int(i)` where `i` in `0..256` is related to `[b]` if `b.as_int() == Some(i)`.

 And then, maybe, we also want to additionally say that value `Uninit` is related to `[Uninit]`.
 This essentially corresponds to saying that uninitialized memory is a valid representation of a `u8` value (namely, the uninitialized value).
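
The revised `bool` and `u8` relations can be read as predicates over a value and a list of bytes. The sketch below is one possible rendering under stated assumptions: the `Value` enum is cut down to three variants with `i128` standing in for an unbounded integer, `ToyPtr` is a hypothetical pointer type, and the `allow_uninit` flag is an invented switch between the two choices described for `u8`.

```rust
trait PtrToInt {
    fn get_byte(self, idx: u8) -> u8;
}

/// Hypothetical pointer type: an address with little-endian bytes.
#[derive(Clone, Copy)]
struct ToyPtr(u64);

impl PtrToInt for ToyPtr {
    fn get_byte(self, idx: u8) -> u8 {
        self.0.to_le_bytes()[idx as usize]
    }
}

#[derive(Clone, Copy)]
enum Byte<Pointer> {
    Uninit,
    Raw(u8),
    PtrFragment { ptr: Pointer, idx: u8 },
}

impl<Pointer: PtrToInt> Byte<Pointer> {
    fn as_int(self) -> Option<u8> {
        match self {
            Byte::Raw(int) => Some(int),
            Byte::Uninit => None,
            Byte::PtrFragment { ptr, idx } => Some(ptr.get_byte(idx)),
        }
    }
}

/// Simplified stand-in for the document's `Value` domain (assumption).
enum Value {
    Bool(bool),
    Int(i128),
    Uninit,
}

/// `Bool(b)` is related to `[bb]` iff `bb.as_int() == Some(if b { 1 } else { 0 })`.
fn bool_related<P: PtrToInt + Copy>(v: &Value, bytes: &[Byte<P>]) -> bool {
    match (v, bytes) {
        (Value::Bool(b), [bb]) => (*bb).as_int() == Some(if *b { 1 } else { 0 }),
        _ => false,
    }
}

/// `Int(i)` with `i` in `0..256` is related to `[b]` if `b.as_int() == Some(i)`.
/// With `allow_uninit`, the value `Uninit` is additionally related to `[Uninit]`.
fn u8_related<P: PtrToInt + Copy>(v: &Value, bytes: &[Byte<P>], allow_uninit: bool) -> bool {
    match (v, bytes) {
        (Value::Int(i), [b]) if (0..256).contains(i) => {
            (*b).as_int().map(i128::from) == Some(*i)
        }
        (Value::Uninit, [Byte::Uninit]) => allow_uninit,
        _ => false,
    }
}
```

One consequence of going through `as_int` rather than matching on `Raw` directly is that a `PtrFragment` whose extracted byte happens to be `1` also represents `Bool(true)` in this sketch.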
