Skip to content

Detail layout of repr(C) unions #160

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Aug 15, 2019
Merged
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
90 changes: 80 additions & 10 deletions reference/src/layout/unions.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,16 +18,6 @@ is already entirely determined by their types, and since we intend to allow
creating references to fields (`&u.f1`), unions do not have any wiggle-room
there.

### C-compatible layout ("repr C")

For unions tagged `#[repr(C)]`, the compiler will apply the C layout scheme. Per
sections [6.5.8.5] and [6.7.2.1.16] of the C11 specification, this means that
the offset of every field is 0. Unsafe code can cast a pointer to the union to
a field type to obtain a pointer to any field, and vice versa.

[6.5.8.5]: http://port70.net/~nsz/c/c11/n1570.html#6.5.8p5
[6.7.2.1.16]: http://port70.net/~nsz/c/c11/n1570.html#6.7.2.1p16

### Default layout ("repr rust")

**The default layout of unions is not specified.** As of this writing, we want
Expand All @@ -38,3 +28,83 @@ contents are.
Even if the offsets happen to be all 0, there might still be differences in the
function call ABI. If you need to pass unions by-value across an FFI boundary,
you have to use `#[repr(C)]`.

### Layout of "repr C" unions

The layout of `repr(C)` unions follows the C layout scheme. Per sections
[6.5.8.5] and [6.7.2.1.16] of the C11 specification, this means that the offset
of every field is 0. Unsafe code can cast a pointer to the union to a field type
to obtain a pointer to any field, and vice versa.

[6.5.8.5]: http://port70.net/~nsz/c/c11/n1570.html#6.5.8p5
[6.7.2.1.16]: http://port70.net/~nsz/c/c11/n1570.html#6.7.2.1p16

#### Padding

Since all fields are at offset 0, `repr(C)` unions do not have padding before
their fields. They can, however, have trailing padding, to make sure the size is
a multiple of the alignment:

```rust
# use std::mem::{size_of, align_of};
#[repr(C, align(2))]
union U { x: u8 }
# fn main() {
// The repr(align) attribute raises the alignment requirement of U to 2
assert_eq!(align_of::<U>(), 2);
// This introduces trailing padding, raising the union size to 2
assert_eq!(size_of::<U>(), 2);
# }
``**

**Note**: there is no room between fields for padding, so `repr(C)` unions can
only have trailing padding.

The bit `i` of a `repr(C)` union is a padding bit if the bit `i` of each of its
fields is a padding bit or trailing padding. That is:

```rust
#[repr(C)]
union U { x: (u8, u16) }
```

The byte at offset 1 of `U` is a padding byte.

#### Zero-sized fields

If a `#[repr(C)]` union contains a field of zero-size, that field does not
occupy space in the union. For example:

```rust
# use std::mem::{size_of, align_of};
#[repr(C)]
union U {
x: u8,
y: (),
}
# fn main() {
assert_eq!(size_of::<U>(), 1);
# }
```

The field does, however, participate in the layout computation of the union. For
example:

```rust
# use std::mem::{size_of, align_of};
#[repr(C)]
union U {
x: u8,
y: [u16; 0],
}
# fn main() {
// The zero-sized type [u16; 0] raises the alignment requirement to 2
assert_eq!(align_of::<U>(), 2);
// This introduces trailing padding, raising the union size to 2
assert_eq!(size_of::<U>(), 2);
# }
```

This handling of zero-sized types is equivalent to the handling of zero-sized
types in struct fields, and matches the behavior of GCC and Clang for unions in
C when zero-sized types are allowed via their language extensions.