Representation of fn pointers #14

Closed
nikomatsakis opened this issue Aug 30, 2018 · 12 comments
Assignees
Labels
A-layout Topic: Related to data structure layout (`#[repr]`) S-writeup-assigned Status: Ready for a writeup and someone is assigned to it

Comments

@nikomatsakis
Contributor

Discussing the representation of extern "abi" fn(..) types:

  • What hazards exist if you try to transmute these to e.g. usize?
    • the C standard, for example, is conservative about the size of a data vs fn pointer
    • is this a concern on any modern architecture?
  • Related, is Option<extern "C" fn()> guaranteed to be equivalent to a "C fn pointer" representation?
    • (I think yes, and projects rely on this)
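The questions above can be probed on any given target by comparing sizes. The following is a minimal sketch, not a statement of what Rust guarantees: the function names are hypothetical, and the size equality with `usize` is only what holds on mainstream targets today.

```rust
use std::mem::size_of;

/// A trivial C-ABI function to take the address of (hypothetical name).
pub extern "C" fn noop() {}

/// True if a C-ABI function pointer has the same size as `usize` on this target.
pub fn fn_ptr_is_usize_sized() -> bool {
    size_of::<extern "C" fn()>() == size_of::<usize>()
}

/// True if wrapping the pointer in `Option` costs no extra space,
/// i.e. `None` is stored in the otherwise-unused null value.
pub fn option_fn_ptr_is_free() -> bool {
    size_of::<Option<extern "C" fn()>>() == size_of::<extern "C" fn()>()
}

/// Converts a function pointer to its address with an `as` cast,
/// which avoids `transmute` for the fn-pointer-to-integer direction.
pub fn address_of(f: extern "C" fn()) -> usize {
    f as usize
}
```

Calling `address_of(noop)` yields the function's address as an integer; on a target where code pointers were wider than `usize`, the cast would truncate rather than fail to compile.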
@nikomatsakis nikomatsakis added the A-layout Topic: Related to data structure layout (`#[repr]`) label Aug 30, 2018
@avadacatavra avadacatavra added A-layout Topic: Related to data structure layout (`#[repr]`) active discussion topic and removed A-layout Topic: Related to data structure layout (`#[repr]`) labels Aug 31, 2018
@alercah

alercah commented Sep 8, 2018

C++ member function pointers (which allow for dynamic dispatch) are typically, but not necessarily, larger than data pointers. If Rust wished to support calling them directly by way of an extern "cpp-member" fn (say), then it would need larger pointers. I think this possibility (or the possibility of some other language doing similarly) is enough to say that the size of fn is therefore ABI-specific.

I don't know enough about the background of the C rule that function pointers may be larger than data pointers to say whether the conversion is safe for the C ABI. However, presumably transmute would throw a compiler error if such a platform were ever introduced and erroneous code were used; it would only be transmute_copy that could present a problem.
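To illustrate the difference between the two: `transmute` rejects mismatched sizes at compile time, while `transmute_copy` does not check. A minimal sketch, with hypothetical function names, assuming a target where `extern "C" fn()` and `usize` have the same size:

```rust
use std::mem::{size_of, transmute, transmute_copy};

/// A trivial C-ABI function to point at (hypothetical name).
pub extern "C" fn callback() {}

/// `transmute` is size-checked at compile time: this function only compiles
/// on targets where `extern "C" fn()` and `usize` have the same size.
pub fn to_usize_checked(f: extern "C" fn()) -> usize {
    unsafe { transmute::<extern "C" fn(), usize>(f) }
}

/// `transmute_copy` has no compile-time size check, so the size assumption
/// is spelled out here as a runtime assertion instead.
pub fn to_usize_unchecked(f: extern "C" fn()) -> usize {
    assert_eq!(size_of::<extern "C" fn()>(), size_of::<usize>());
    // Reads `size_of::<usize>()` bytes from `f`; fine only because the sizes match.
    unsafe { transmute_copy::<extern "C" fn(), usize>(&f) }
}
```

Without the assertion, `to_usize_unchecked` would silently read past or short of the pointer on a platform with differently sized function pointers, which is the hazard described above.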

I agree that Option<extern "abi" fn()> should always optimize for the null pointer as None.

@hanna-kruppe

Pointers to member functions are separate from regular function pointers in the C++ type system and I see no reason why we should pretend they're the same in Rust, especially if it requires weakening otherwise-plausible guarantees we could give.

A better motivation for making no guarantees about the size would be if some architectures had differently-sized address spaces for code and data, or if a platform ABI added extra metadata to function pointers that isn't there for other pointers, e.g., a tag to support enforcement of control flow integrity.

@alercah

alercah commented Sep 9, 2018

They're definitely different, and they definitely do not use any of the existing ABIs. It's not the case that we support ABI polymorphism, though (except via Fn traits, but at that point we've already mostly stopped caring, since an Fn object could have arbitrarily large size), which, when I think about it, seems to make it a bit silly to insist that all fn pointers are necessarily convertible to usize: their use is likely going to be ABI-specific.

@hanna-kruppe

hanna-kruppe commented Sep 9, 2018

Today, each function pointer refers to a specific free function that is declared with the same ABI string as the function pointer carries -- extern "foo" fn bar() {} can be referred to with an extern "foo" fn(). The ABI string, on functions as on function pointers, indicates how parameters and return values are passed, which registers get saved by whom, and other details of how calls and the function prologue and epilogue are codegen'd, but not how the function pointer is represented or where the function is placed in memory.
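A small sketch of that pairing, using hypothetical function names: the ABI string on the pointer type must match the one on the function, and it changes how calls are generated rather than how the pointer is stored. The size comparison reflects current behavior, not a documented guarantee.

```rust
use std::mem::size_of;

pub extern "C" fn add_c(a: i32, b: i32) -> i32 {
    a + b
}

pub fn add_rust(a: i32, b: i32) -> i32 {
    a + b
}

/// Calls each function through a pointer whose ABI string matches its declaration.
pub fn call_both() -> (i32, i32) {
    let c_ptr: extern "C" fn(i32, i32) -> i32 = add_c;
    let rust_ptr: fn(i32, i32) -> i32 = add_rust;
    // Mismatched ABI strings are a type error, so this line would not compile:
    // let wrong: extern "C" fn(i32, i32) -> i32 = add_rust;
    (c_ptr(1, 2), rust_ptr(3, 4))
}

/// True if pointers with different ABI strings have the same size on this target.
pub fn sizes_match_across_abis() -> bool {
    size_of::<extern "C" fn()>() == size_of::<fn()>()
}
```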

This means that, while we can't call any function with any ABI we like, it's not a total wild west either. In the past we have considered generating shims that adapt from one ABI to another (and have in fact done so in the past to codegen Rust functions with C ABI). Even exotic ABIs like ptx_kernel or msp430_interrupt are just selecting different codegen for functions and calls to them, not fundamentally changing what a function pointer means. This status quo does not necessarily have to prevail, and as I said I could see uses for extra data attached even to pointers to free functions (so I am not really arguing that fn pointers should be guaranteed to be laid out like usize), but today ABI strings cause only quite limited and well-understood variation.

A C++ pointer to member function, on the other hand, is conceptually quite different from free functions and pointers to them. It's arguably even orthogonal to calling convention, since various compilers allow declaring member functions with different calling conventions (so, e.g., you might have a member function that uses __fastcall).

@gnzlbg
Contributor

gnzlbg commented Sep 9, 2018

Do extern "C" fn() and fn() have the same type?

@hanna-kruppe

hanna-kruppe commented Sep 9, 2018

They are separate types. fn() is short for extern "Rust" fn() and fn pointers with different ABI strings are different types.
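This can be checked with `TypeId`, since fn pointer types are `'static`. A minimal sketch with hypothetical function names:

```rust
use std::any::TypeId;

/// True if `fn()` and `extern "Rust" fn()` name the same type.
pub fn default_abi_is_rust() -> bool {
    TypeId::of::<fn()>() == TypeId::of::<extern "Rust" fn()>()
}

/// True if `fn()` and `extern "C" fn()` are distinct types.
pub fn c_abi_is_distinct() -> bool {
    TypeId::of::<fn()>() != TypeId::of::<extern "C" fn()>()
}
```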

@nikomatsakis
Contributor Author

I definitely think C++ member function pointers are out of scope for this discussion. Rust's function pointers are analogous to a C function pointer (e.g., void (*)()) -- they don't carry any "extra data" (and they kind of can't, since they don't have a lifetime bound, for better or worse).

I believe that we should declare, at minimum, that an extern "C" fn() is represented in the same way as the corresponding C function pointer type (void (*)()), except that it cannot be NULL and must be valid to call (because safe code can call it).

This implies also that Option<extern "C" fn()> is fully representation compatible with void (*)().
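As a rough sketch of what that compatibility means in practice: `None` maps to the null pointer and `Some(f)` to the address of `f`. The `Callback` alias and the function names are hypothetical, and the code assumes function pointers and data pointers have the same size on the target (otherwise `transmute` refuses to compile).

```rust
use std::mem::transmute;

/// Hypothetical nullable callback type, in the style generated bindings
/// use for a C parameter of type `void (*)(void)`.
pub type Callback = Option<extern "C" fn()>;

/// A trivial C-ABI function to point at (hypothetical name).
pub extern "C" fn hello() {}

/// Exposes the stored pointer value: null for `None`, the function's address otherwise.
pub fn as_raw(cb: Callback) -> *const () {
    // Compiles only where `Callback` and `*const ()` have the same size.
    unsafe { transmute::<Callback, *const ()>(cb) }
}

/// Rebuilds the callback from a raw pointer value.
///
/// # Safety
/// `raw` must be null or the address of a real `extern "C" fn()`.
pub unsafe fn from_raw(raw: *const ()) -> Callback {
    unsafe { transmute::<*const (), Callback>(raw) }
}
```

A C caller passing `NULL` for the callback therefore shows up as `None` on the Rust side, with no separate tag or wrapper needed.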

I think plenty of unsafe code in the wild relies on this (as @wycats and @sgrif can probably attest; they happen to be two people who I've talked to about this in the past).

@sgrif

sgrif commented Sep 10, 2018

> I think plenty of unsafe code in the wild relies on this

It's what bindgen generates for anything that takes a function pointer as an argument, so I think that's reasonable. :) (I can say for sure that Diesel relies on Option<extern "C" fn(...) -> ...>'s representation)

@gnzlbg
Contributor

gnzlbg commented Sep 10, 2018

> (and they kind of can't, since they don't have a lifetime bound, for better or worse)

Why would they need a lifetime bound? IIRC they carry at most an offset into a vtable which does not depend on any object lifetimes.

> (because safe code can call it).

How can I construct an extern "C" fn() that I can call in safe code? AFAIK extern "C" fn() only accepts functions with extern "C" ABI. These functions are always unsafe, so one can't make a safe extern "C" fn() point to them (only extern "C" unsafe fn()).

@sfackler
Member

> How can I construct an extern "C" fn() that I can call in safe code?

You just define it: https://play.rust-lang.org/?gist=23fca1fa4d23cb71489a1733d7e6de8b&version=stable&mode=debug&edition=2015

Bindings to external symbols are always unsafe functions since you're asserting you got the signature right.
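For readers who don't follow the playground link, here is a minimal sketch along the same lines (not a copy of the linked gist; the names are hypothetical). A function defined in Rust with the C ABI is a safe function, so a safe `extern "C" fn` pointer can refer to it and be called without `unsafe`.

```rust
/// A function defined in Rust with the C calling convention; it is safe to call.
pub extern "C" fn answer() -> i32 {
    42
}

/// Calls `answer` through a safe C-ABI function pointer, with no `unsafe` needed.
pub fn call_through_pointer() -> i32 {
    let f: extern "C" fn() -> i32 = answer;
    f()
}

/// The same pointer coerces to the `unsafe` pointer type, which is the type
/// used for declarations of external symbols; calling it then requires `unsafe`.
pub fn call_as_unsafe() -> i32 {
    let f: extern "C" fn() -> i32 = answer;
    let g: unsafe extern "C" fn() -> i32 = f;
    unsafe { g() }
}
```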

@avadacatavra avadacatavra added the S-writeup-assigned Status: Ready for a writeup and someone is assigned to it label Sep 13, 2018
@nikomatsakis
Contributor Author

I want to call out a comment by @rkruppe from the discussion about integer types:

> A more general point regarding extremely niche implementation choices such as non-octet-bytes or NULL-at-nonzero-address: people are going to write code that relies on assumptions that are true on every platform they have ever heard of, and for good reason, as it simplifies their code at effectively no loss of portability. We can't prevent that, nor should we IMO, at most we could tell these people they are relying on implementation-defined behavior, which just makes it a de facto standard rather than a de jure one.

I find this very well put, and I think it definitely applies here, in terms of e.g. whether we commit to an extern "C" fn being compatible with a usize and so forth.

It seems like we ought to settle, perhaps more generally, on a policy for such cases. I feel like it's worth identifying a "default compatibility" profile that guarantees portability across all "major architectures", while also identifying concerns that may apply to more esoteric architectures.

@nikomatsakis nikomatsakis added S-writeup-needed Status: Ready for a writeup and no one is assigned S-writeup-assigned Status: Ready for a writeup and someone is assigned to it and removed S-writeup-assigned Status: Ready for a writeup and someone is assigned to it S-writeup-needed Status: Ready for a writeup and no one is assigned labels Oct 11, 2018
@gnzlbg
Contributor

gnzlbg commented Mar 14, 2019

9 participants