
Use native scalar fma instruction #1267


Merged: 1 commit, Aug 23, 2022

Conversation

afonso360
Contributor

@afonso360 afonso360 commented Aug 21, 2022

Cranelift 0.87 now supports lowering `fma` as a libcall on x86, and 0.88 will enable the native x86 instruction under the `has_fma` flag.

aarch64 and s390x already support this as a native instruction, so it's nice that we emit it for those.

We can't lower the SIMD version using the `fma` instruction, since that lowering can fail if the x86 `has_fma` flag is not enabled, and Cranelift doesn't yet know how to fall back in those cases.
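For context, here is a minimal sketch of what the scalar operation computes, whether it ends up as a native instruction or a libcall. It uses only the standard library's `f64::mul_add`; the helper names (`unfused_residual`, `fused_residual`, `inputs`) are made up for illustration and are not part of this PR or of Cranelift. The inputs are chosen so that the separate multiply rounds away the low bits of the product, while the fused form keeps them because it rounds only once.

```rust
/// Residual of x*x against c using a separate multiply and add.
/// The product is rounded to f64 before the addition happens.
fn unfused_residual(x: f64, c: f64) -> f64 {
    x * x + (-c)
}

/// Same residual using a fused multiply-add: x*x - c with a single rounding.
fn fused_residual(x: f64, c: f64) -> f64 {
    x.mul_add(x, -c)
}

/// Illustrative inputs: x = 1 + 2^-30 and c = 1 + 2^-29.
/// The exact product is x*x = 1 + 2^-29 + 2^-60, which does not fit in an f64.
fn inputs() -> (f64, f64, f64) {
    let eps = 1.0 / (1u64 << 30) as f64; // exactly 2^-30
    let x = 1.0 + eps;
    let c = 1.0 + 2.0 * eps;
    (x, c, eps)
}

fn main() {
    let (x, c, eps) = inputs();
    println!("unfused: {:e}", unfused_residual(x, c));
    println!("fused:   {:e}", fused_residual(x, c));

    // The rounded product equals c, so the unfused residual vanishes.
    assert_eq!(unfused_residual(x, c), 0.0);
    // The fused form recovers the 2^-60 term that the rounding discarded.
    assert_eq!(fused_residual(x, c), eps * eps);
}
```

Both lowerings are expected to produce the same correctly rounded result; the native instruction only avoids the call overhead.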

We need to wait for the 0.87 release before merging this.

@afonso360 afonso360 force-pushed the native-fma branch 3 times, most recently from 04883d8 to eda2eb1 Compare August 21, 2022 17:20
@bjorn3
Member

bjorn3 commented Aug 22, 2022

Updated to Cranelift 0.87.0 in bjorn3@b14c733. Can you please rebase?

Cranelift 0.87 now supports lowering `fma` as a libcall on x86 [0],
and 0.88 will enable the native x86 instruction under the `has_fma` flag.

aarch64 and s390x already support this as a native instruction, so it's
nice that we emit it for those.

We can't lower the SIMD version using the `fma` instruction, since that
lowering can fail if the x86 `has_fma` flag is not enabled, and Cranelift
doesn't yet know how to fall back in those cases.

[0]: bytecodealliance/wasmtime@709716b
@afonso360 afonso360 marked this pull request as ready for review August 22, 2022 20:02
@bjorn3 bjorn3 merged commit 48c45c4 into rust-lang:master Aug 23, 2022