Skip to content

Allow explicit padding #2060

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 16, 2021
Merged

Conversation

ericseppanen
Copy link
Contributor

If a struct needs to be serialized in its native format (padding bytes and all), for example writing it to a file or sending it on the network, then explicit padding fields are necessary, as anything reading the padding bytes of a struct may lead to Undefined Behavior.

This allows structs with padding to be compatible with zerocopy::AsBytes, as well as the more dangerous practice of transmute to a byte array.

Who would want such a crazy thing? Well, it turns out that there are a bunch of data structures in the Postgres code base that are written to files in their native memory format, non-portable padding and all. To be able to read and write those files we can use serde or zerocopy, but in either case we need the padding fields to be explicit.

This change is a bit more complicated than just "always turn on the padding bindgen could already compute" because previously, there was no code to generate padding fields at the end of a struct.

That leaves the question of what should happen in the cases where bindgen was emitting padding already-- is the missing tail padding a bug? I'm not sure how to trigger the existing "bindgen decided on its own to emit padding fields" mode, so I left that alone so far.

@highfive
Copy link

highfive commented Jun 3, 2021

warning Warning warning

  • These commits modify unsafe code. Please review it carefully!

Copy link
Contributor

@emilio emilio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks sensible. Quite an obscure use-case but... :)

Let me know if you don't have cycles to address the nits below and I can address them myself.

Thanks!

@@ -236,11 +239,14 @@ impl<'a> StructLayoutTracker<'a> {
field_layout
);

let padding_align = if force_padding {
0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using an alignment of zero seems wrong, shouldn't it be one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry. Will fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, thinking a bit more about it, shouldn't the padding alignment be the default if it's mandatory padding? Otherwise we might not fulfill the alignment requirements.

Copy link
Contributor Author

@ericseppanen ericseppanen Jun 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid I don't understand. The padding I see emitted (with my change enabled) is an array of u8s, e.g.

struct pad_me {
        uint8_t first;
        uint32_t second;
        uint16_t third;
};
pub struct pad_me {
    pub first: u8,
    pub __bindgen_padding_0: [u8; 3usize],
    pub second: u32,
    pub third: u16,
    pub __bindgen_padding_1: [u8; 2usize],
}

I don't understand why any __bindgen_padding_X field would ever have an alignment requirement-- that sounds like something that would re-introduce gaps between fields, which is exactly what I wanted to eliminate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be confused by the fact that I don't actually know what the existing padding code is for!

My desire is to get a struct that has no gaps-- every byte is accounted for by a named field. The previous code is kind of a mystery to me; I don't know what problem it's solving.

If a struct needs to be serialized in its native format (padding bytes
and all), for example writing it to a file or sending it on the network,
then explicit padding fields are necessary, as anything reading the
padding bytes of a struct may lead to Undefined Behavior.
Padding fields are an array of bytes; specify 1-byte alignment.
@ericseppanen
Copy link
Contributor Author

Test failures seem to be due to spurious network error crate download issues. Perhaps someone could nudge it to re-run the tests? I don't think I have the permissions to do so myself.

Copy link
Contributor

@emilio emilio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thank you, and sorry for the lag merging this. I plan to do a release with this change today/tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants