Representation of enums #10
Comments
I'm surprised to see so little activity here. I'm going to stake out a position that:
We also probably want to document (but perhaps not guarantee forever?) our layout rules around |
+1
+1, follow-up question: should we document this as propagating through newtypes? (and what's the exact definition of newtype for this purpose -- e.g., do PhantomData fields make a difference) late edit: perhaps we'd like to go even further and document it for any aggregate that directly contains one of these types?
I propose guaranteeing one additional fact about repr(Rust) enums: a univariant enum should be represented the same as a struct containing the same data as the sole variant. So for example:
This would be particularly useful if we also guaranteed that fully-uninhabited variants are ignored for layout purposes, because it allows type-punning between |
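As a hypothetical illustration of the univariant-enum proposal above (the original example is not preserved here, and the type names below are invented), one can compare a univariant enum with a struct holding the same fields. Current compilers happen to report the same size and alignment for both, but that is an observation, not a guarantee:

```rust
use std::mem::{align_of, size_of};

// Hypothetical types, invented for illustration only.
#[allow(dead_code)]
enum Wrapper {
    Only(u32, u8),
}

#[allow(dead_code)]
struct Plain(u32, u8);

/// Returns (size, alignment) of the univariant enum.
fn enum_layout() -> (usize, usize) {
    (size_of::<Wrapper>(), align_of::<Wrapper>())
}

/// Returns (size, alignment) of the struct with the same fields.
fn struct_layout() -> (usize, usize) {
    (size_of::<Plain>(), align_of::<Plain>())
}

fn main() {
    println!("enum:   {:?}", enum_layout());
    println!("struct: {:?}", struct_layout());
}
```

Matching size and alignment is necessary but not sufficient for type-punning; field order and offsets would also have to agree, which is exactly what the proposed guarantee would pin down.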
What about EDIT: Is there a general way to phrase which things allow us to do enum layout optimizations? E.g. invalid representations, memory that's "invalid to use" (padding bytes), etc. |
As I wrote earlier in #13, I don't see any way to repurpose padding for these or other purposes (if it's settled to be padding in the contained type's padding -- while determining the layout of |
I would like to see Newtypes with I agree with @eddyb's distinction in rust-lang/rfcs#2363 that the tag and discriminant are different concepts, and only the RFC 2195 reprs necessarily make the discriminant directly affect layout. I think that In particular, if RFC 2363 is adopted, I am inclined to say that assigning discriminants does not affect the |
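To make the tag/discriminant distinction concrete, here is a small sketch with a hypothetical enum. The discriminant is the logical value observed through a cast; the tag is whatever bits the compiler stores in memory, and the cast says nothing about those:

```rust
use std::mem::{discriminant, size_of};

// Hypothetical fieldless enum with explicit discriminants.
#[derive(Clone, Copy)]
enum Level {
    Low = 10,
    High = 20,
}

/// The discriminant is the logical value visible through a cast.
fn discriminant_of(level: Level) -> isize {
    level as isize
}

fn main() {
    println!("Low  = {}", discriminant_of(Level::Low));
    println!("High = {}", discriminant_of(Level::High));
    // mem::discriminant gives an opaque, comparable handle instead of a number.
    let same = discriminant(&Level::Low) == discriminant(&Level::High);
    println!("Low and High are the same variant: {}", same);
    // How many bytes the value occupies is a layout question, not a discriminant one.
    println!("size_of::<Level>() = {}", size_of::<Level>());
}
```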
Explicit discriminants won't affect a two-variant enum, because only one invalid value is being used, but the choice of assignments does matter when more variants are involved. In particular, if there are any gaps in between the discriminants of variants that need invalid values, that will use up invalid values as well. |
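A rough way to observe how discriminant assignments interact with the available invalid values is to wrap two hypothetical enums in Option: one numbered densely, one whose discriminants span the whole byte range. The printed sizes reflect whatever the compiler in use decides; nothing here is a guaranteed layout:

```rust
use std::mem::size_of;

// Hypothetical enums, invented for this sketch.
#[allow(dead_code)]
enum Dense {
    A = 0,
    B = 1,
}

#[allow(dead_code)]
enum Sparse {
    A = 0,
    B = 255,
}

/// Size of Option around the densely numbered enum.
fn option_size_dense() -> usize {
    size_of::<Option<Dense>>()
}

/// Size of Option around the enum whose discriminants span 0..=255.
fn option_size_sparse() -> usize {
    size_of::<Option<Sparse>>()
}

fn main() {
    println!("Option<Dense>:  {} byte(s)", option_size_dense());
    println!("Option<Sparse>: {} byte(s)", option_size_sparse());
}
```

If the second number is larger, the compiler had no spare value left in the tag to encode None and had to add storage for it.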
Can you elaborate a bit more on why this is the case? On a |
@alercah Yes, it's what the MIR |
Ah, I see. Why do we support specifying discriminants on |
@nox's use case is matching the tags between two enums for optimization purposes (LLVM generating better code if they match up, without writing any unsafe code). |
That use case isn't If I write I think this means that we should just deprecate explicitly-specified discriminants on |
@alercah By "without writing any unsafe code" I meant that nothing actually depends on any specific layout decisions for correctness, it's just that if they do match, the code will run slightly faster. |
Ahh, right, I understand. Some testing shows that we don't seem to do these optimizations for Honestly, I had completely forgotten that we expose the discriminants of fieldless Still, even with them being observable, I think that we should opt against making any kind of layout guarantees based on discriminants. I frequently write code like that in this comment, except that I specify (almost always by omission, so they start from 0) overlapping discriminants. A very clever compiler could ignore the values of the discriminants as specified, and realize that it has an opportunity to optimize multiple types together. Consider this:

```rust
enum First {
    A,
    B,
}
enum Second {
    C,
    D,
}
enum FrontTwo {
    First(First),
    Second(Second),
}
```

Currently, the compiler lays out Now, imagine we have This tells me that there are potential optimizations that even a guarantee of "A fieldless #[repr(Rust)] enum is laid out as if it is just an unspecified integer of size at most |
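To see what the compiler in use actually does with these types, one can print the size of FrontTwo next to a hand-flattened equivalent (Flat is a hypothetical name, not from the discussion). The numbers are observations about one compiler, not guarantees:

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum First {
    A,
    B,
}

#[allow(dead_code)]
enum Second {
    C,
    D,
}

#[allow(dead_code)]
enum FrontTwo {
    First(First),
    Second(Second),
}

// Hypothetical hand-flattened version carrying the same four states.
#[allow(dead_code)]
enum Flat {
    A,
    B,
    C,
    D,
}

/// Returns (size of the nested enum, size of the flattened enum).
fn sizes() -> (usize, usize) {
    (size_of::<FrontTwo>(), size_of::<Flat>())
}

fn main() {
    let (nested, flat) = sizes();
    println!("FrontTwo: {} byte(s)", nested);
    println!("Flat:     {} byte(s)", flat);
}
```

Any gap between the two numbers is the room a cleverer layout algorithm could reclaim, provided no guarantee ties the inner enums' tags to their discriminants.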
On the subject of If an enum has exactly a single inhabited variant with a single nonempty field, it is the same as a newtype struct. If it has exactly two inhabited variants, with exactly one nonempty field between them, and that field is of suitable type, then it is niche-optimized. This definition would cover "suitable type" is a bit fuzzy. At the very least, Niko's types are definitely suitable, because of the C ABI, but if we accept nonzero values being guaranteed for niche-filling, |
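The two-variant case can be checked directly for a few payload types whose Option layout the standard library documents as matching the payload's size. This is a minimal sketch; the helper name is made up:

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical helper: true if wrapping T in Option costs no extra space.
fn same_size_as_option<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // References, boxes and non-zero integers have a forbidden value (null or 0)
    // that can stand in for None.
    println!("&u8:        {}", same_size_as_option::<&u8>());
    println!("Box<u8>:    {}", same_size_as_option::<Box<u8>>());
    println!("NonZeroU32: {}", same_size_as_option::<NonZeroU32>());
    // Every bit pattern of u32 is valid, so None needs storage of its own.
    println!("u32:        {}", same_size_as_option::<u32>());
}
```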
|
Please document the ABI considerations. In particular, document how Rust C-like enums work w.r.t. |
I think we should give these optimizations a proper name instead of calling them They are optimizations that apply to I think that once we specify exactly what we guarantee about these enum optimizations, we only have to specify how |
Using @eddyb's terminology, these are "niche optimizations" because they use a "niche" (a bunch of values that are never allowed in elements of a type) to store the enum discriminant. |
@alercah currently, Many C APIs return an integer error code, where a value of

```rust
#[repr(transparent)]
struct Error(NonZero<libc::c_int>);

extern "C" {
    fn foo() -> Result<(), Error>;
}
```

(edit: the same applies for |
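The error-code pattern can also be sketched without FFI. The snippet below swaps in std's NonZeroI32 for NonZero<libc::c_int> to stay dependency-free, and from_raw is a hypothetical helper; it converts safely instead of relying on any layout guarantee for Result:

```rust
use std::mem::size_of;
use std::num::NonZeroI32;

#[repr(transparent)]
#[derive(Debug, PartialEq)]
struct Error(NonZeroI32);

/// Hypothetical helper: 0 means success, anything else is an error code.
fn from_raw(code: i32) -> Result<(), Error> {
    match NonZeroI32::new(code) {
        None => Ok(()),
        Some(nz) => Err(Error(nz)),
    }
}

fn main() {
    println!("{:?}", from_raw(0));
    println!("{:?}", from_raw(-2));
    // Compare the two sizes to see whether the compiler folded the Ok case into 0.
    println!("size_of::<Result<(), Error>>() = {}", size_of::<Result<(), Error>>());
    println!("size_of::<i32>()               = {}", size_of::<i32>());
}
```

Declaring the foreign function as returning a plain integer and converting like this keeps the code correct whether or not the Result layout is ever guaranteed; using Result directly in the signature is what would need the guarantee.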
@RalfJung I actually like "sentinel" too. It'd be great to come up with a name to replace "niche" because it has gotten people confused and it's kind of a... niche term. |
While I also like sentinel I don't mind much about using "niche". After going through https://en.wikipedia.org/wiki/Sentinel_value I don't know if the "sentinel" term is also commonly used for multiple values.
The main reason is probably that we don't define this term anywhere yet and that the docs / nomicon / ... do not use it consistently. Whatever term we pick, we should just properly define it in the "enum layout" document and update the book, nomicon, etc. to use the term appropriately. What terms do other languages use? |
Discussion topic about how enums are represented in Rust. Some things to work out:
- #[repr] options available for enums?
- #[repr(C)] enums with payloads.
- Option<T>-like layout optimizations?
- Option<T>-like layout optimizations?
- Result<T, ()>?
- !: defined to be 0
- #[repr(C)] and friends here