x86_64 memcmp may recurse infinitely without SSE #470
Comments
I think it's fine to add the code that you suggest, with a clear comment that explains why it was done and that it depends on an internal implementation detail of LLVM. Fundamentally I don't think we have a way to guarantee that LLVM isn't calling the same builtin function recursively.
This crate is marked with |
I think the culprit is rustc itself, since it explicitly emits a call to |
While looking at assembly in my kernel (which has SSE disabled) I noticed that there is an infinite recursion in `memcmp` if a string of 32 bytes or longer is passed:

This seems to be caused by the comparison of `[u128; 2]` values, which internally calls `core::intrinsics::raw_eq`. The documentation states:

Replacing it with `(u128, u128)` does not cause a call to `memcmp` to be emitted, though it severely impacts performance (~2x slower):

`[u128; 2]`

`(u128, u128)`
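
To make the two variants concrete, here is a minimal userland sketch of both comparisons. The helper names (`load16`, `eq32_array`, `eq32_pair`) are hypothetical and are not taken from the crate; the real implementation works on raw pointers rather than fixed-size array references.

```rust
/// Hypothetical helper: read 16 bytes starting at `offset` as a `u128`.
fn load16(x: &[u8; 32], offset: usize) -> u128 {
    let mut buf = [0u8; 16];
    buf.copy_from_slice(&x[offset..offset + 16]);
    u128::from_ne_bytes(buf)
}

/// Compares 32 bytes as `[u128; 2]`.
/// Per the issue, array equality goes through `core::intrinsics::raw_eq`,
/// which the backend may lower to a call to `memcmp`.
fn eq32_array(a: &[u8; 32], b: &[u8; 32]) -> bool {
    let lhs: [u128; 2] = [load16(a, 0), load16(a, 16)];
    let rhs: [u128; 2] = [load16(b, 0), load16(b, 16)];
    lhs == rhs
}

/// Compares 32 bytes as `(u128, u128)`.
/// Tuple equality compares field by field, so no `memcmp` call was
/// observed for this form in the issue (at a performance cost).
fn eq32_pair(a: &[u8; 32], b: &[u8; 32]) -> bool {
    let lhs: (u128, u128) = (load16(a, 0), load16(a, 16));
    let rhs: (u128, u128) = (load16(b, 0), load16(b, 16));
    lhs == rhs
}
```

Both functions return the same result for any input; they differ only in the code the compiler may emit for the final `==`.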

Leaving out `c32()` entirely has a lesser but still significant performance impact (~1.3x slower), though compares on smaller strings seem to benefit:

No `c32()`

Given that targets with SSE(2) do not seem to generate a call to `memcmp` even in debug mode, it may be worth keeping `c32()` if this feature is present. I don't know if it is worth the gamble though.
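
One way the feature-gated variant could look is sketched below. This is an illustration under assumptions, not the crate's code: `USE_C32`, `load16`, `compare_bytewise` and `compare` are hypothetical names, and the sketch operates on slices in userland instead of raw pointers. The comment on the constant follows the maintainer's request to document the dependence on LLVM behaviour.

```rust
/// NOTE: this relies on an LLVM implementation detail, not a guarantee.
/// With SSE2 enabled, the 32-byte equality check below was observed (per
/// the issue) not to be lowered to a `memcmp` call, which would recurse.
const USE_C32: bool = cfg!(target_feature = "sse2");

/// Hypothetical helper: read the first 16 bytes of `x` as a `u128`.
fn load16(x: &[u8]) -> u128 {
    let mut buf = [0u8; 16];
    buf.copy_from_slice(&x[..16]);
    u128::from_ne_bytes(buf)
}

/// Byte-wise fallback; returns <0, 0 or >0 like C `memcmp`.
fn compare_bytewise(a: &[u8], b: &[u8]) -> i32 {
    for (x, y) in a.iter().zip(b.iter()) {
        if x != y {
            return i32::from(*x) - i32::from(*y);
        }
    }
    0
}

/// Compares the first `min(a.len(), b.len())` bytes of `a` and `b`.
fn compare(a: &[u8], b: &[u8]) -> i32 {
    let n = a.len().min(b.len());
    let mut i = 0;
    if USE_C32 {
        // Skip over equal 32-byte chunks; stop at the first differing one.
        while i + 32 <= n {
            let lhs = [load16(&a[i..]), load16(&a[i + 16..])];
            let rhs = [load16(&b[i..]), load16(&b[i + 16..])];
            if lhs != rhs {
                break;
            }
            i += 32;
        }
    }
    // Locate the differing byte (or finish the tail) one byte at a time.
    compare_bytewise(&a[i..n], &b[i..n])
}
```

On targets without SSE2 the constant is `false`, so the chunked loop is compiled out and only the byte-wise path remains. Whether the gated fast path stays free of `memcmp` calls across compiler versions is exactly the gamble mentioned above.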