Skip to content

Add x86_64 builtins #40

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 19, 2016
Merged

Add x86_64 builtins #40

merged 4 commits into from
Aug 19, 2016

Conversation

mattico
Copy link
Contributor

@mattico mattico commented Aug 17, 2016

This is a fairly straightforward port of the original assembly files. I'm hoping to help port a whole bunch of these builtins over the next few days, so let me know if you have any style or other concerns. I don't believe that these functions have any tests in compiler-rt and they would be quite difficult functions to test. Nonetheless, I'm going to attempt to find a windows binary that exercises these and swap them in to see if anything breaks.

Two notes:

  • __alloca falls through to ___chkstk in the original assembly. I think having a function call there is the best way to implement this reliably, other than duplicating the assembly.
  • The original assembly files don't appear to be conditionally compiled, but are noted as only being used on Windows, so I am conditionally compiling them now.

// I don't believe that this behavior is guaranteed, and a program that uses
// only __alloca could have ___chkstk removed by --gc-sections. Call
// ___chkstk here to guarantee that neither of those happen.
___chkstk();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since __chkstk uses a custom calling convention, you should probably jump to it using a jmp instruction instead of calling it from Rust.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had thought about that, but couldn't __chkstk possibly be removed by the linker in that case? Now that I think about it again, the linker could probably see the use of the ___chkstk symbol in the jmp and leave it in, just the same as it would for a call instruction.

@japaric
Copy link
Member

japaric commented Aug 17, 2016

I'm hoping to help port a whole bunch of these builtins over the next few days

👍 Awesome.

I don't believe that these functions have any tests in compiler-rt and they would be quite difficult functions to test.

A good sanity check is to manually check that the output assembly of this crate is similar/equal to the disassemble of the libcompiler-rt.a shipped with Rust.

$ objdump -Cd $(rustc --print sysroot)/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcompiler-rt.a

(I'm not sure what's the Windows' objdump equivalent -- dumpbin maybe?)

__alloca falls through to ___chkstk in the original assembly. I think having a function call there is the best way to implement this reliably, other than duplicating the assembly.

This what the disassembly of Rust's libcompiler-rt.a looks like on my Linux machine:

0000000000000000 <__alloca>:
   0:   48 89 c8                mov    %rcx,%rax

0000000000000003 <___chkstk>:
   3:   51                      push   %rcx
   4:   48 3d 00 10 00 00       cmp    $0x1000,%rax
   a:   48 8d 4c 24 10          lea    0x10(%rsp),%rcx
   f:   72 18                   jb     29 <___chkstk+0x26>
  11:   48 81 e9 00 10 00 00    sub    $0x1000,%rcx
  18:   48 85 09                test   %rcx,(%rcx)
  1b:   48 2d 00 10 00 00       sub    $0x1000,%rax
  21:   48 3d 00 10 00 00       cmp    $0x1000,%rax
  27:   77 e8                   ja     11 <___chkstk+0xe>
  29:   48 29 c1                sub    %rax,%rcx
  2c:   48 85 09                test   %rcx,(%rcx)
  2f:   48 8d 44 24 08          lea    0x8(%rsp),%rax
  34:   48 89 cc                mov    %rcx,%rsp
  37:   48 8b 48 f8             mov    -0x8(%rax),%rcx
  3b:   ff 30                   pushq  (%rax)
  3d:   48 29 e0                sub    %rsp,%rax
  40:   c3                      retq

I'm not sure if a function call would generate equivalent assembly. I can't really read x86 assembly but I think that __alloca is not supposed to trigger an extra function call over __chkstk.

The original assembly files don't appear to be conditionally compiled, but are noted as only being used on Windows, so I am conditionally compiling them now.

Makes sense to me.

@mattico
Copy link
Contributor Author

mattico commented Aug 17, 2016

The generated assembly looks much better without the function call, and I've convinced myself that --gc-sections shouldn't be a problem. The assembly is now identical to compiler-rt.a with the exception of the extra jmp instruction and some ud2 padding.

@mattico
Copy link
Contributor Author

mattico commented Aug 17, 2016

note: rustc_builtins-385feb8268bd6d9b.0.o : error LNK2019: unresolved external symbol ___chkstk referenced in function _ZN14rustc_builtins6x86_648__alloca17h4178164e29c9a080E

The appveyor build failed because the test build has name mangling so ___chkstk is undefined.

@Amanieu
Copy link
Member

Amanieu commented Aug 17, 2016

The problem here is that we only specify no_mangle for non-test configurations. Maybe we should make it unconditional for ___chkstk in this case (with a comment explaining why).

This prevents linker errors in test builds due to the `jmp` instruction in __alloca
@mattico
Copy link
Contributor Author

mattico commented Aug 17, 2016

We can probably make all of them unconditionally no_mangle since we're not testing these functions, and if we were testing them we'd want to test all of them anyway.

@japaric japaric merged commit 74cd512 into rust-lang:master Aug 19, 2016
@japaric
Copy link
Member

japaric commented Aug 19, 2016

Thank you @mattico!

@mattico mattico deleted the add-x86_64 branch August 20, 2016 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants