Added representation of pointer types. #51
Changes from 2 commits
13da89f
049dc62
ab527c7
872a405
be86083
db5f98d
e2e6f0a
45d6eee
7fb3b93
fc39320
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
# Representation of reference and pointer types | ||
|
||
### Terminology | ||
|
||
Reference types are types of the form `&T` or `&mut T`. | ||
|
||
Raw pointer types are types of the form `*const T` or `*mut T`. | ||
|
||
### Representation | ||
|
||
The alignment of reference and raw pointer types is the word size. | ||
|
||
The sizes of `&T`, `&mut T`, `*const T` and `*mut T` are the same, | ||
and are at least one word. | ||
|
||
* If `T` is a trait, then the size of `&T` is two words. | ||
|
||
* If `T` is a sized type then the size of `&T` is one word. | ||
* The size of `&[T]` is two words. | ||
|
||
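
The size rules above can be checked directly with `std::mem::size_of`. The following is a minimal sketch, not part of the proposed text: it assumes that "word" means `size_of::<usize>()`, and the trait `Shape` and the helper names are hypothetical, introduced only for illustration.

```rust
use std::mem::size_of;

/// Hypothetical trait, used only to form a trait-object type for this sketch.
pub trait Shape {
    fn area(&self) -> f64;
}

/// Assumption of this sketch: one "word" is the size of `usize`.
pub const WORD: usize = size_of::<usize>();

/// Sizes, in bytes, of a reference to a sized type, a slice reference,
/// and a trait-object reference, in that order.
pub fn reference_sizes() -> (usize, usize, usize) {
    (
        size_of::<&u32>(),
        size_of::<&[u32]>(),
        size_of::<&dyn Shape>(),
    )
}

/// True when `&T`, `&mut T`, `*const T` and `*mut T` all have the same size.
pub fn pointer_kinds_agree<T: ?Sized>() -> bool {
    let s = size_of::<&T>();
    s == size_of::<&mut T>() && s == size_of::<*const T>() && s == size_of::<*mut T>()
}
```

On current compilers `reference_sizes()` is expected to return one word for the sized case and two words for the slice and trait-object cases.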
### Notes | ||
|
||
The representations of `&T` and `&mut T` are the same. | ||
|
||
The representation of `&T` when `T` is a trait is the same as that of: | ||
```rust | ||
#[repr(C)] | ||
struct DynObject { | ||
data: &u8, | ||
vtable: &usize, | ||
Why is this

To emphasize the fact that it's non-zero and word-aligned. Not part of the representation, but I believe we're going to be requiring this when we come to validity.

This seems like a very confusing way to make that point, and since it's not even really part of what this write-up is supposed to be about at this point, nor particularly useful, I'd really prefer not doing that. Just

I made them both |
||
} | ||
``` | ||
|
||
The representation of `&[T]` is the same as that of: | ||
```rust | ||
#[repr(C)] | ||
struct Slice<T> { | ||
ptr: &T, | ||
Question: if the length is zero, then what is
It seems worth linking to

While I don't like the use of references here much [0] the guarantee here is that "the representation of
Ideally, we would use the same examples to show validity and representation, but we haven't settled the validity of these yet (it might well be that we need different "examples" for the
[0] Personally, I don't like the use of
The representation of

    struct Slice {
        ptr: *const (),
        len: usize,
    }

AFAICT there is no need to make

Like I said above, we could also use

    struct Slice<T> {
        ptr: &[T;N],
        len: usize,
    }

and in practice,
All these defns are equivalent, it's just a case of which is the more human-readable.

I added a comment that |
||
len: usize, | ||
} | ||
``` | ||
|
||
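
The pointer-plus-length view of `&[T]` described above can be observed through the standard library without relying on field order. This is a hedged sketch: the helper names are hypothetical, and it uses only `as_ptr`, `len` and `std::slice::from_raw_parts`. It also touches the zero-length question raised in review: the data pointer of an empty slice is still non-null and aligned for `T`.

```rust
use std::mem::align_of;
use std::slice;

/// Splits a slice reference into its data pointer and its length.
pub fn slice_parts<T>(s: &[T]) -> (*const T, usize) {
    (s.as_ptr(), s.len())
}

/// Rebuilds a slice from parts previously obtained from a live slice.
///
/// # Safety
/// `ptr` and `len` must describe a slice that stays valid for `'a`.
pub unsafe fn slice_from_parts<'a, T>(ptr: *const T, len: usize) -> &'a [T] {
    // SAFETY: the caller guarantees that `ptr` and `len` describe a valid slice.
    unsafe { slice::from_raw_parts(ptr, len) }
}

/// True when the data pointer of `s` is non-null and aligned for `T`.
pub fn data_ptr_is_well_formed<T>(s: &[T]) -> bool {
    let p = s.as_ptr();
    !p.is_null() && (p as usize) % align_of::<T>() == 0
}
```

Calling `data_ptr_is_well_formed` on an empty `&[u32]` is a quick way to see that the pointer half of a zero-length slice is not null.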
The validity requirements of `&T` include that all values are non-null, which | ||
|
||
impacts niche optimizations and hence representation of types which include `&T`. | ||
In particular, `Option<&T>` is one word. |
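
The niche claim can be illustrated by comparing sizes. This is a sketch under stated assumptions: the trait `Named` and the helper `option_ref_is_free` are hypothetical, and the results for unsized pointees reflect what current compilers do rather than anything this text guarantees.

```rust
use std::mem::size_of;

/// Hypothetical trait, present only so the sketch can name a trait object.
pub trait Named {
    fn name(&self) -> String;
}

/// True when wrapping `&T` in `Option` does not change its size,
/// i.e. when `None` is stored in the null niche of the reference.
pub fn option_ref_is_free<T: ?Sized>() -> bool {
    size_of::<Option<&T>>() == size_of::<&T>()
}
```

For a sized `T`, `Option<&T>` then occupies exactly one word, matching the statement above.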

I just realized this might not actually be true for custom DSTs (e.g., consider a custom DST with `u64` metadata on a 32 bit target). Since the rest of this text is already careful to be forward compatible with custom DSTs, it would be a shame if this part wasn't.

Hmm, worth thinking about. On 32-bit targets, does `u64` have 64-bit alignment? Are there architectures with alignment larger than the word size?

Oh, `u64` is not a good example, because on most (all?) 32 bit targets it has alignment 4. But there are definitely tons of types with super-word alignment -- anyone can define them with `#[repr(align(N))]`, and vector types such as `__m128` usually require natural alignment (i.e., size = align).

So the question is do we want to tie our hands and say that metadata of custom DSTs is at most word-aligned? The trade-offs don't seem obvious to me, I'm trying to come up with an example and the best I can do is something like:

The question is can this be insta-UB if `U`'s metadata is super-word aligned?

This is the simplest example I can think of.

Allowing metadata with higher alignment comes naturally, as pointers to custom DSTs are just `(<untyped data pointer>, T::Metadata)`. Is there any reason to artificially prohibit it? All of the code I know of that can be generic over all DSTs doesn't have to care about alignment or can trivially be made compatible with higher alignments by strategically using `mem::align_of::<&T>` (and `T` has to be known anyway if you're manipulating pointers, because it dictates the size of the pointer too).

My two cents is that we shouldn't say anything about the size and alignment of references here.

We should just say that the layout of references (`&T` and `&mut T`) and pointers (`*const T` and `*mut T`) is the same as the layout of the following type:

We can then say that:

* `T: Sized`, then `Metadata === ()`,
* `T` is a slice, then `Metadata === usize`,
* `T` is a trait, then `Metadata === *const ()`,
* `T` is a multi-trait object, then `Metadata === <implementation-defined>`,
* `T` is a custom DST, then `Metadata === T::Metadata`,

and the layout of pointers and references just follows from the `struct` representation rules, e.g., that they are at least one "word" wide follows from `Layout` always being at least one word wide.

Since I've used raw pointers, we have to say something about the size and alignment of raw pointers here, but we already said that their size is the same as `usize`, and for alignment we can just say that ~~the raw pointer alignment is an implementation-defined multiple of~~ it's the same as their size.

When we have custom DSTs in the language, then all pointers are `... { ptr: *mut (), metadata: T::Metadata }`, as custom DSTs generalize slices, trait objects, and sized types (we can still spell this out for clarity but there's no need to have two things called `Metadata` or to introduce this `===` relation). But we don't have custom DSTs yet, so it feels very weird to define layout in terms of a concept the language doesn't have yet, even if we can technically avoid any reference to a specific non-existent `DynamicallySized` trait.

Wait, what? If the alignment is not described yet, it should be described if we can at all get consensus for it. As far as I know, that alignment should be exactly the size. But even if that is controversial, it's the other way around, size has to be a multiple of the alignment (because element i of an array is at byte offset `i * sizeof(T)`).
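
The array-stride argument can be made concrete. In this sketch the type `Elem` and both helpers are hypothetical. It measures where each array element actually lives and checks that the size is a multiple of the alignment, which is what keeps every element aligned.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical element type: 5 bytes of fields, padded to 8 bytes so that
/// the size stays a multiple of the 4-byte alignment of `u32`.
#[repr(C)]
pub struct Elem {
    pub a: u32,
    pub b: u8,
}

/// Byte offset of element `i` from the start of the array.
pub fn element_offset(arr: &[Elem; 4], i: usize) -> usize {
    let base = arr.as_ptr() as usize;
    let elem = &arr[i] as *const Elem as usize;
    elem - base
}

/// True when the size of `T` is a multiple of its alignment.
pub fn size_is_multiple_of_align<T>() -> bool {
    size_of::<T>() % align_of::<T>() == 0
}
```

If the size were not a multiple of the alignment, element 1 would start at a misaligned address, which is why the relation runs in this direction.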

Even without the concept of custom DSTs, we do have the concept of generic structs, and their layout, and those can be used to specify the layout of slices and trait objects.

Sure, but unifying these two layouts into one generic struct, and the naming of the type parameter, gestures heavily at the concept of custom DSTs (would anyone make this suggestion if they didn't know of the generalization that custom DSTs bring?), without really buying us anything right now. If anything, it makes the presentation slightly more complex.

OK, I reworded this to say that reference types have at least word alignment, and have exactly word alignment in the case of references to sized types, `&[T]`, `&str` or `&dyn Trait`.
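
The reworded alignment rule can be spot-checked as follows. This is a sketch only: it assumes word alignment means `align_of::<usize>()`, and the trait `Marker` and the helper `ref_align` are hypothetical names. The checks for unsized pointees are written as "at least" to match the wording above.

```rust
use std::mem::align_of;

/// Hypothetical trait, used only to name a trait-object type.
pub trait Marker {}

/// Assumption of this sketch: word alignment is the alignment of `usize`.
pub const WORD_ALIGN: usize = align_of::<usize>();

/// Alignment, in bytes, of a reference to `T`.
pub fn ref_align<T: ?Sized>() -> usize {
    align_of::<&T>()
}
```

For example, `ref_align::<u64>()` is expected to equal `WORD_ALIGN`, while `ref_align::<[u8]>()`, `ref_align::<str>()` and `ref_align::<dyn Marker>()` are expected to be no smaller than it.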