Inconsistent behavior in casting from float64 to uint32 depending on processors #16073

Closed
Terminus-IMRC opened this issue Apr 25, 2020 · 3 comments

Comments

@Terminus-IMRC

Casting from float64 to uint32 behaves inconsistently depending on the processor and its word size. The code I executed is:

import numpy as np
print([np.float64(_).astype(np.uint32) for _ in [1, 2, 3, 4, -1, -2, -3, -4]])
print(np.array([1, 2, 3, 4, -1, -2, -3, -4], dtype=np.float64).astype(np.uint32))

On Raspbian buster armhf (32-bit), Python 3.7.3, Numpy 1.18.1:

[1, 2, 3, 4, 0, 0, 0, 0]
[1 2 3 4 0 0 0 0]

On Debian sid arm64, Python 3.8.2, Numpy 1.18.3:

[1, 2, 3, 4, 0, 0, 0, 0]
[         1          2          3          4 4294967295 4294967294
 4294967293 4294967292]

On macOS (64-bit), Python 3.8.2, Numpy 1.18.2:

[1, 2, 3, 4, 4294967295, 4294967294, 4294967293, 4294967292]
[         1          2          3          4 4294967295 4294967294
 4294967293 4294967292]

The issues are:

  • The results above should be consistent between processors.
  • The results from np.float64(...).astype(np.uint32) and np.array(..., dtype=np.float64).astype(np.uint32) should be the same.

Are these expected? Thanks.

@matthew-brett
Contributor

In general, I believe that NumPy defaults to C casting, though I couldn't immediately find a definitive document for this. Differences in C casting between platforms explain some of the differences.

#include <stdio.h>
#include <stdint.h>

int main(int argc, const char *argv[])
{
    double dinputs[] = {1, 2, 3, 4, -1, -2, -3, -4};
    double dval;
    int i;
    for (i=0; i < 8; i++) {
        dval = dinputs[i];
        printf("Double input %0.2f; UInt32 output %u\n", dval, (uint32_t) dval);
    }
    return 0;
}

On a 32-bit Raspbian:

Double input 1.00; UInt32 output 1
Double input 2.00; UInt32 output 2
Double input 3.00; UInt32 output 3
Double input 4.00; UInt32 output 4
Double input -1.00; UInt32 output 0
Double input -2.00; UInt32 output 0
Double input -3.00; UInt32 output 0
Double input -4.00; UInt32 output 0

On macOS:

Double input 1.00; UInt32 output 1
Double input 2.00; UInt32 output 2
Double input 3.00; UInt32 output 3
Double input 4.00; UInt32 output 4
Double input -1.00; UInt32 output 4294967295
Double input -2.00; UInt32 output 4294967294
Double input -3.00; UInt32 output 4294967293
Double input -4.00; UInt32 output 4294967292

I haven't got a 64-bit ARM to play with, and I can't explain the difference between np.float64(v).astype(np.uint32) and np.array([v], dtype=np.float64).astype(np.uint32).

@Terminus-IMRC
Author

Thank you for the response. On AArch64, your code prints the same result as on 32-bit Raspbian.

I found that, to convert a double to uint32, arm32 uses vcvt.u32.f64, aarch64 uses fcvtzu, and x86_64 uses cvttsd2si. The first two convert -1.0 to 0, while the last converts -1.0 to 4294967295, since that instruction performs a signed double-to-integer conversion and the result is then reinterpreted as unsigned.
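The two outcomes can be modelled in plain Python. This is only a sketch of the observed results for finite inputs, not a description of the exact instruction semantics; the function names are made up for illustration.

```python
import math

UINT32_MAX = 2**32 - 1

def to_uint32_saturating(x: float) -> int:
    """Model of the ARM results in this thread: clamp to [0, UINT32_MAX].

    Assumes x is finite; NaN and infinity are not modelled here.
    """
    if x <= 0:
        return 0
    if x >= 2**32:
        return UINT32_MAX
    return math.trunc(x)

def to_uint32_wrapping(x: float) -> int:
    """Model of the x86_64 results in this thread: truncate, then wrap modulo 2**32.

    Assumes x is finite.
    """
    return math.trunc(x) % 2**32

inputs = [1.0, 2.0, 3.0, 4.0, -1.0, -2.0, -3.0, -4.0]
print([to_uint32_saturating(v) for v in inputs])
# [1, 2, 3, 4, 0, 0, 0, 0]
print([to_uint32_wrapping(v) for v in inputs])
# [1, 2, 3, 4, 4294967295, 4294967294, 4294967293, 4294967292]
```

Both models agree for every input in [0, 2**32), which is why only the negative values expose the difference.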

After googling, I found that the C specification says in §6.4.1.4:

When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.
The remaindering operation performed when a value of integer type is converted to unsigned type need not be performed when a value of real floating type is converted to unsigned type. Thus, the range of portable real floating values is (−1, Utype_MAX+1).
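To make the quoted range concrete, here is a small Python sketch that checks whether a value lies in the open interval (−1, Utype_MAX+1) before converting it. The helper names and the choice to raise ValueError are assumptions made for illustration, not something C or NumPy does.

```python
import math

def in_portable_range(x: float, bits: int = 32) -> bool:
    """True if x lies strictly between -1 and 2**bits (Utype_MAX + 1)."""
    return math.isfinite(x) and -1.0 < x < 2.0 ** bits

def checked_to_unsigned(x: float, bits: int = 32) -> int:
    """Truncate toward zero, refusing inputs a C cast would leave undefined."""
    if not in_portable_range(x, bits):
        raise ValueError(f"{x!r} is outside the portable range for {bits}-bit unsigned")
    return math.trunc(x)

print(checked_to_unsigned(3.9))           # 3
print(checked_to_unsigned(-0.5))          # 0
print(checked_to_unsigned(4294967295.0))  # 4294967295
print(in_portable_range(-1.0))            # False
```

Note that -0.5 is in range because its integral part after truncation is 0, while -1.0 itself is already outside the interval.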

So both conversion results are valid under C11, since converting a floating-point value less than or equal to -1 to an unsigned integer type is undefined behavior.

Although the second issue still seems odd to me, both issues can be considered resolved as far as the C specification is concerned, so I am closing this now.
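For code that needs the same result on every platform, one possible workaround is to avoid the undefined conversion by going through a signed 64-bit integer first. The sketch below assumes every input is finite and fits in int64; the helper name is hypothetical, and this is not an officially recommended NumPy recipe.

```python
import numpy as np

def float64_to_uint32_wrapping(values):
    """Convert to uint32 with wraparound via an intermediate int64.

    float64 -> int64 is well defined for finite values that fit in int64,
    and int64 -> uint32 is an integer-to-unsigned conversion, which wraps
    modulo 2**32.
    """
    a = np.asarray(values, dtype=np.float64)
    return a.astype(np.int64).astype(np.uint32)

print(float64_to_uint32_wrapping([1, 2, 3, 4, -1, -2, -3, -4]).tolist())
# [1, 2, 3, 4, 4294967295, 4294967294, 4294967293, 4294967292]
```

If zeros are wanted for negative inputs instead, clipping with np.clip(a, 0, 2**32 - 1) before a direct astype(np.uint32) keeps every value inside the portable range.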

Thanks again for the response.

@Terminus-IMRC
Author

Correction: the section is §6.3.1.4, not §6.4.1.4. Sigh.

2 participants