rustc_session: be more precise about -Z plt=yes on x86-64? #141720
Comments
It shouldn't be. We enable RELRO, so we always do eager binding of all symbols, so using the PLT adds overhead for dynamically linked functions.
Yeah,
i'd rephrased that along the way and effectively swapped yes/no so it was just backwards. sorry! what i meant to say is that on x86 i really cannot imagine a way in which the status quo is worse for dynamically linked functions, but it is always worse for statically linked functions. i've swapped the
i don't follow, wouldn't this only apply if the statically linked symbols were produced via also, #105518 looks like the default would be
True, you are free to use protected visibility in the asm code (please don't use hidden visibility; that breaks with dylibs, as there is no guarantee that the Rust caller ends up in the same dylib), but
Yes, though the same logic applies. Either hidden or protected visibility is enough for PLT relaxation to work.
Prompted by this being mentioned elsewhere, I did some of my own investigation. The visibility stuff seems reasonable, but why is relaxation not kicking in? It seems that rustc is not properly emitting relaxable ( Another thing already mentioned is that LLVM trying to help by caching the address is unhelpful in the relaxable case. This also affects Clang. I don't know what we can do.
@dramforever Because of broken linkers, see #115267. Possibly enough time has passed that enabling ELF relaxations would have less fallout now. Edit: Nope, two years later there is still no new cross release, so we're going to see exactly the same issues.
in #109982 rustc switched to -Z plt=yes on non-x86-64 platforms for a bunch of good reasons, and stuck with -Z plt=no by default on x86-64 for also good reasons! unfortunately, defaulting to -Z plt=no is a slight pessimization in programs heavily dependent on calls into statically linked libraries.

PLT calls on x86 end up compiled to e8 <addr> calls, which at link time can be rewritten to direct calls to the callee, presumably with the GOT entry deleted. when we skip the PLT on x86-64, it seems that linkers are unwilling to do a link-time optimization of ff 15 <GOT addr> into 90 e8 <fn addr> when the callee is local to the object, so an indirect call to the object-local function persists*.

i expect -Z plt=no to be better than -Z plt=yes on x86-64 for all cases where the called functions are dynamically linked. i also expect -Z plt=no to be worse than -Z plt=yes on x86-64 for all cases where the called functions are statically linked and <4 GiB from their call sites. it'd be nice if we could skip non_lazy_bind if we know the called function is to be statically linked. if your compiled artifact is >4 GiB .. i've heard of such things but have no idea what's best :)

"if we know the called function is to be statically linked" is the more annoying problem, though, because rustc-link-lib tells rustc only what libraries get what kind of linkage. especially on Unix-y platforms we don't know which of those libraries will provide a given symbol. the extern block can have a #[link(kind="static")] attribute which i've used in this minimized example of the problem i'm talking about, which almost seems like enough information to choose when to do this optimization at codegen-time. unfortunately, if the source file says #[link(name="util", kind="static")] extern "C" { pub fn foo(); }, and then you compile that source like rustc -l dylib=util ..., the command-line parameter simply overrides the link attribute and you end up with a dynamic link to foo with the (in context) reasonable ff 15 [GOT_entry] call.

because of the #[link] / -l KIND=NAME interaction i'm really not sure what to do here. i was going to initially suggest plumbing #[link(kind="static")] through to inform if nonlazybind is appropriate, but i had expected that conflicting link directives would at least produce an error. silently ending up with the command line argument is pretty unfortunate. does it seem reasonable to plumb the #[link] attribute as a hint, advise #[link(kind="static")] for statically linked functions, and make conflicting #[link] and -l arguments produce an error?

memorysafety/rav1d#1417 is a more substantive case which motivates this issue, where hot code is a collection of assembly routines that are statically linked. i've written a longer analysis about the case in that issue, but it's just more supporting information around the observation above.
worse, for code that is hot around an indirect call to a constant target, branch prediction quite effectively hides the cost of this indirect call. if the hot code is more like a large region of warm code, the branch prediction can end up evicted and these indirect calls to a constant local function become quite costly.
worse (pt2), LLVM reasonably tries to improve the indirect call situation by hoisting the GOT loads for repeated calls to the same target, which can cause register pressure and additional spills, and generally make this kind of unfortunate situation even worse.
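
to make the byte-level rewrite discussed in this issue concrete, here is a minimal sketch in Rust of what turning ff 15 <GOT addr> into 90 e8 <fn addr> could look like for a single call site. this is an illustration only, not how any particular linker implements it: the names `Relax` and `relax_got_call` and all addresses are hypothetical, and a real linker would additionally require a relaxable relocation on the call site (such as R_X86_64_GOTPCRELX from the x86-64 psABI) and a callee that is known to resolve locally.

```rust
/// Outcome of trying to relax one call site (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Relax {
    /// Rewritten to `90 e8 <rel32>`: a one-byte nop followed by a direct call.
    Direct([u8; 6]),
    /// Left alone: not an `ff 15` call, or the callee is out of rel32 range.
    Unchanged,
}

/// `insn` holds the 6 bytes found at virtual address `insn_addr`;
/// `target` is the final address of the callee. Both addresses are
/// assumed inputs that a linker would know after layout.
fn relax_got_call(insn: [u8; 6], insn_addr: u64, target: u64) -> Relax {
    // Only `call [rip + disp32]` (opcode ff, ModRM 15) is handled here.
    if insn[0] != 0xff || insn[1] != 0x15 {
        return Relax::Unchanged;
    }
    // The rel32 of a direct call is relative to the end of the call
    // instruction: 1 byte of nop + 5 bytes of `e8 rel32` = insn_addr + 6.
    let next = insn_addr.wrapping_add(6);
    let delta = (target as i64).wrapping_sub(next as i64);
    // The displacement must fit in a signed 32-bit immediate.
    match i32::try_from(delta) {
        Ok(rel) => {
            let r = rel.to_le_bytes();
            Relax::Direct([0x90, 0xe8, r[0], r[1], r[2], r[3]])
        }
        Err(_) => Relax::Unchanged,
    }
}
```

calling `relax_got_call` with a call site at 0x1000 and a callee at 0x2000 yields a direct call whose displacement is measured from 0x1006; a callee too far away for a signed 32-bit displacement leaves the indirect call in place, which is the case where keeping the GOT-based call is the only option.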