-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Add "algebraic" fast-math intrinsics, based on fast-math ops that cannot return poison #120718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -418,7 +418,11 @@ extern "C" LLVMAttributeRef LLVMRustCreateMemoryEffectsAttr(LLVMContextRef C, | |
} | ||
} | ||
|
||
// Enable a fast-math flag | ||
// Enable all fast-math flags, including those which will cause floating-point operations | ||
// to return poison for some well-defined inputs. This function can only be used to build | ||
// unsafe Rust intrinsics. That unsafety does permit additional optimizations, but at the | ||
// time of writing, their value is not well-understood relative to those enabled by | ||
// LLVMRustSetAlgebraicMath. | ||
// | ||
// https://llvm.org/docs/LangRef.html#fast-math-flags | ||
extern "C" void LLVMRustSetFastMath(LLVMValueRef V) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we still need the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd prefer to keep them for now and let people play with both variants. I hope that based on experience we can make an argument that the unsafety of the unsafe ones is not worth the optimizations that they unlock, but I have no data to back that up. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 👍 |
||
|
@@ -427,6 +431,25 @@ extern "C" void LLVMRustSetFastMath(LLVMValueRef V) { | |
} | ||
} | ||
|
||
// Enable fast-math flags which permit algebraic transformations that are not allowed by | ||
// IEEE floating point. For example: | ||
// a + (b + c) = (a + b) + c | ||
// and | ||
// a / b = a * (1 / b) | ||
// Note that this does NOT enable any flags which can cause a floating-point operation on | ||
// well-defined inputs to return poison, and therefore this function can be used to build | ||
// safe Rust intrinsics (such as fadd_algebraic). | ||
// | ||
// https://llvm.org/docs/LangRef.html#fast-math-flags | ||
extern "C" void LLVMRustSetAlgebraicMath(LLVMValueRef V) { | ||
if (auto I = dyn_cast<Instruction>(unwrap<Value>(V))) { | ||
I->setHasAllowReassoc(true); | ||
I->setHasAllowContract(true); | ||
I->setHasAllowReciprocal(true); | ||
I->setHasNoSignedZeros(true); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What about Does the word "algebraic" have a specific meaning here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Never mind, I see from other places that it does. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Sure, why not. I'll add it.
I didn't want to just defang the existing intrinsics because there are some optimizations which rely on assuming that NaN/Inf do not occur. So I needed to come up with a new name, and the way I think about these intrinsics is that unlike IEEE float operations, they permit the usual algebraic transformations. Things like I don't think that the name is perfect, and I'd be happy to see someone suggest a better name, but @orlp seems perfectly happy calling them "algebraic". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That would be a great explanation to put in a comment somewhere :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Please don't. Replacing functions by their approximations isn't an algebraically justified optimization. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🤔 And in any case, I don't think it would matter for these intrinsics. |
||
} | ||
} | ||
|
||
extern "C" LLVMValueRef | ||
LLVMRustBuildAtomicLoad(LLVMBuilderRef B, LLVMTypeRef Ty, LLVMValueRef Source, | ||
const char *Name, LLVMAtomicOrdering Order) { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1882,6 +1882,46 @@ extern "rust-intrinsic" { | |
#[rustc_nounwind] | ||
pub fn frem_fast<T: Copy>(a: T, b: T) -> T; | ||
|
||
/// Float addition that allows optimizations based on algebraic rules. | ||
/// | ||
/// This intrinsic does not have a stable counterpart. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Which meaning of "stable" does this use? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The only way to call this intrinsic is to use the |
||
#[rustc_nounwind] | ||
#[rustc_safe_intrinsic] | ||
#[cfg(not(bootstrap))] | ||
pub fn fadd_algebraic<T: Copy>(a: T, b: T) -> T; | ||
|
||
/// Float subtraction that allows optimizations based on algebraic rules. | ||
/// | ||
/// This intrinsic does not have a stable counterpart. | ||
#[rustc_nounwind] | ||
#[rustc_safe_intrinsic] | ||
#[cfg(not(bootstrap))] | ||
pub fn fsub_algebraic<T: Copy>(a: T, b: T) -> T; | ||
|
||
/// Float multiplication that allows optimizations based on algebraic rules. | ||
/// | ||
/// This intrinsic does not have a stable counterpart. | ||
#[rustc_nounwind] | ||
#[rustc_safe_intrinsic] | ||
#[cfg(not(bootstrap))] | ||
pub fn fmul_algebraic<T: Copy>(a: T, b: T) -> T; | ||
|
||
/// Float division that allows optimizations based on algebraic rules. | ||
/// | ||
/// This intrinsic does not have a stable counterpart. | ||
#[rustc_nounwind] | ||
#[rustc_safe_intrinsic] | ||
#[cfg(not(bootstrap))] | ||
pub fn fdiv_algebraic<T: Copy>(a: T, b: T) -> T; | ||
|
||
/// Float remainder that allows optimizations based on algebraic rules. | ||
/// | ||
/// This intrinsic does not have a stable counterpart. | ||
#[rustc_nounwind] | ||
#[rustc_safe_intrinsic] | ||
#[cfg(not(bootstrap))] | ||
pub fn frem_algebraic<T: Copy>(a: T, b: T) -> T; | ||
|
||
/// Convert with LLVM’s fptoui/fptosi, which may return undef for values out of range | ||
/// (<https://github.com/rust-lang/rust/issues/10184>) | ||
/// | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
// compile-flags: -C opt-level=3 -C target-cpu=cannonlake | ||
// only-x86_64 | ||
|
||
// In a previous implementation, _mm512_reduce_add_pd did the reduction with all fast-math flags | ||
// enabled, making it UB to reduce a vector containing a NaN. | ||
|
||
#![crate_type = "lib"] | ||
#![feature(stdarch_x86_avx512, avx512_target_feature)] | ||
use std::arch::x86_64::*; | ||
|
||
// CHECK-label: @demo( | ||
#[no_mangle] | ||
#[target_feature(enable = "avx512f")] // Function-level target feature mismatches inhibit inlining | ||
pub unsafe fn demo() -> bool { | ||
// CHECK: %0 = tail call reassoc nsz arcp contract double @llvm.vector.reduce.fadd.v8f64( | ||
// CHECK: %_0.i = fcmp uno double %0, 0.000000e+00 | ||
// CHECK: ret i1 %_0.i | ||
let res = unsafe { | ||
_mm512_reduce_add_pd(_mm512_set_pd(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, f64::NAN)) | ||
}; | ||
res.is_nan() | ||
} |
Uh oh!
There was an error while loading. Please reload this page.