Skip to content

Wrapped statics should generate more portable code #2445

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
davidben opened this issue Mar 16, 2023 · 6 comments · Fixed by #2448
Closed

Wrapped statics should generate more portable code #2445

davidben opened this issue Mar 16, 2023 · 6 comments · Fixed by #2448

Comments

@davidben
Copy link

davidben commented Mar 16, 2023

Input C/C++ Header

static inline void inline_func(void) {}
void not_inline_func(void);

Bindgen Invocation

$ bindgen header.h --experimental --wrap-static-fns --wrap-static-fns-path=wrapper.c

Actual Results

/* automatically generated by rust-bindgen 0.64.0 */

extern "C" {
    #[link_name = "\u{1}inline_func__extern"]
    pub fn inline_func();
}
extern "C" {
    pub fn not_inline_func();
}
void inline_func__extern(void) asm("inline_func__extern");
void inline_func__extern() { return inline_func(); }

Expected Results

The output uses GCC asm labels and is missing an include of header.h. This seems to make it needlessly difficult to integrate into a project:

  • Lacking the include, projects need to use a compiler flag like -include to forcibly include the header. This isn't a standard flag and won't work if, say, building with MSVC.
  • Using the GCC-specific asm label syntax to change the symbol bypasses the platform's underlying name mangling mechanism (e.g. C symbols are underscore-prefixed on darwin), and also breaks if building with a non-GCC-compatible compiler.

Is there a reason the feature was designed this way?

Adding the include seems a straightforward way to stick to standard C mechanisms.

The non-inline function doesn't require any decoration, so clearly bindgen and rustc are perfectly capable of handling the normal C mangling. If the aim was just so the Rust symbol appears as inline_func instead of inline_func__extern, #1063 suggests that, by default, link_name still does the default C mangling.

@pvdrz
Copy link
Contributor

pvdrz commented Mar 16, 2023

Lacking the include, projects need to use a compiler flag like -include to forcibly include the header. This isn't a standard flag and won't work if, say, building with MSVC.

I don't see why can't we use #include instead other than avoiding relative path issues when compiling the project as a whole. E.G the original headers are in one directory and the generated c code is in other so just #include "path_to_original_headers" won't work. Maybe we could canonicalize the path but I'm not sure if that's a good idea.

Using the GCC-specific asm label syntax to change the symbol bypasses the platform's underlying name mangling mechanism (e.g. C symbols are underscore-prefixed on darwin), and also breaks if building with a non-GCC-compatible compiler.

So yes, this was done because mangling on macos was messing up the linking because a _ was being prefixed to every single symbol.

I don't have anything against removing the labeling syntax if we can make this work in linux and macos. Maybe we can just ask clang to mangle the name before generating the code or something. This issue doesn't happen on linux and I don't have any apple hardware so testing will be slow.

About portability, well I mean, bindgen just uses libclang for this so in principle even using GCC could cause you trouble down the road. However, I agree we should be as portable as reasonably possible.

The good news is that the feature is experimental so we don't need to be strict about breaking changes, What do you think?

@pvdrz
Copy link
Contributor

pvdrz commented Mar 16, 2023

Regarding #include vs -include the other issue is that you can have more than a single header input so we would have to add them all. Additionally the user is allowed to pass raw C code as input too so that would be something else to consider and document.

@alex
Copy link
Member

alex commented Mar 16, 2023

I'm not sure how most folks are using bindgen, but at least the way I've used it in the linux kernel and also boringssl, we have a single .h file that includes all the other .h files we want to bind, so simply #including that file would be the way to go.

@davidben
Copy link
Author

davidben commented Mar 16, 2023

Additionally the user is allowed to pass raw C code as input too so that would be something else to consider and document.

I would assume the generated C file, in that mode, just don't work at all because the generated file would have no way to get at the inline functions. Though I suppose bindgen could just paste the raw C code in front? But I don't know how people who do it that way use it, so I'm not sure.

I don't have anything against removing the labeling syntax if we can make this work in linux and macos. Maybe we can just ask clang to mangle the name before generating the code or something. This issue doesn't happen on linux and I don't have any apple hardware so testing will be slow.

I was actually thinking something simpler. Any reason not to just do it like this?

extern "C" {
    #[link_name = "inline_func__extern"]
    pub fn inline_func();
}
extern "C" {
    pub fn not_inline_func();
}
void inline_func__extern(void);
void inline_func__extern(void) { return inline_func(); }

That is, based on #1063, it sounds like removing the \u{1} causes Rust to let the name get mangled in the platform way.

@pvdrz
Copy link
Contributor

pvdrz commented Mar 17, 2023

I'm not sure how most folks are using bindgen, but at least the way I've used it in the linux kernel and also boringssl, we have a single .h file that includes all the other .h files we want to bind, so simply #including that file would be the way to go.

Yeah I think that for 99.9% of users that's the case. Happy to see you in the discussion as you're the only person I know of using this feature in a real life scenario :P

I would assume the generated C file, in that mode, just don't work at all because the generated file would have no way to get at the inline functions. Though I suppose bindgen could just paste the raw C code in front? But I don't know how people who do it that way use it, so I'm not sure.

Injecting that code is definitely something we could do without issue. I think that little issue is solved. I'll start working on #includeing the headers then.

I was actually thinking something simpler. Any reason not to just do it like this?

Yeah we could "enable" mangling for static function wrappers then.

It seems we have a plan! Thanks you two for your opinions. See you on review/testing/release :P

Let me know if anything else comes up as this is code I'm very familiar so we can iterate over it quickly.

@pvdrz
Copy link
Contributor

pvdrz commented Mar 17, 2023

Well... integration tests are passing 🎉

Let me know if you can compile your projects with these changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants