
BUG: pandas.to_datetime fails to handle numpy.nan on riscv64 due to dependency on undefined behaviour #48610


Closed
2 of 3 tasks
andreas-schwab opened this issue Sep 17, 2022 · 27 comments · Fixed by #48705
Labels: Bug, Datetime

Comments

@andreas-schwab
Contributor

andreas-schwab commented Sep 17, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas
import numpy
print(pandas.to_datetime(numpy.nan, unit="s"))

Issue Description

Converting a floating-point value to an integer type is undefined behaviour in C when the value is out of range for that integer type; see section 6.3.1.4, "Real floating and integer", of the C standard.
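One way to avoid relying on that cast can be sketched in NumPy: mask the NaN entries first, cast only the remaining values, and write the missing-value sentinel explicitly. The helper name `floats_to_int64_nan_safe` and the constant `NAT_SENTINEL` are hypothetical and used only for illustration; this is not taken from the pandas source or from the eventual fix.

```python
import numpy as np

# numpy's datetime64 represents NaT as the smallest int64 value.
NAT_SENTINEL = np.iinfo(np.int64).min


def floats_to_int64_nan_safe(values):
    """Hypothetical helper: cast floats to int64 without ever casting NaN."""
    arr = np.atleast_1d(np.asarray(values, dtype=np.float64))
    mask = np.isnan(arr)
    out = np.empty(arr.shape, dtype=np.int64)
    # NaN entries never reach the float-to-int conversion.
    out[mask] = NAT_SENTINEL
    # Only non-NaN values are cast. Infinities and finite values outside
    # the int64 range would need a similar guard; this sketch omits it.
    out[~mask] = arr[~mask].astype(np.int64)
    return out


print(floats_to_int64_nan_safe([0.0, np.nan]).view("M8[s]"))
```

Because the sentinel is written explicitly, the result no longer depends on what the hardware returns for a NaN conversion.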

You can use gcc92.fsffrance.org for your tests if you don't have your own hardware, or use qemu with the images from https://download.opensuse.org/ports/riscv/tumbleweed/images/.

$ python3 -c 'import pandas
import numpy
print(pandas.to_datetime(numpy.nan, unit="s"))'
Traceback (most recent call last):
  File "<string>", line 3, in <module>
  File "/usr/lib64/python3.10/site-packages/pandas/core/tools/datetimes.py", line 1078, in to_datetime
    result = convert_listlike(np.array([arg]), format)[0]
  File "/usr/lib64/python3.10/site-packages/pandas/core/tools/datetimes.py", line 357, in _convert_listlike_datetimes
    return _to_datetime_with_unit(arg, unit, name, tz, errors)
  File "/usr/lib64/python3.10/site-packages/pandas/core/tools/datetimes.py", line 530, in _to_datetime_with_unit
    arr, tz_parsed = tslib.array_with_unit_to_datetime(arg, unit, errors=errors)
  File "pandas/_libs/tslib.pyx", line 266, in pandas._libs.tslib.array_with_unit_to_datetime
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: cannot convert input with unit 's'

Expected Behavior

No error.

Installed Versions

/usr/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")

INSTALLED VERSIONS

commit : e8093ba
python : 3.10.6.final.0
python-bits : 64
OS : Linux
OS-release : 6.0.0-rc5-38-default
Version : #1 SMP Mon Sep 12 15:18:20 UTC 2022 (005845a)
machine : riscv64
processor : riscv64
byteorder : little
LC_ALL : None
LANG : de_DE.UTF-8
LOCALE : de_DE.UTF-8

pandas : 1.4.3
numpy : 1.21.6
pytz : 2022.1
dateutil : 2.8.2
setuptools : 63.2.0
pip : 22.0.4
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : 1.3.5
brotli : 1.0.9
fastparquet : None
fsspec : None
gcsfs : None
markupsafe : None
matplotlib : None
numba : None
numexpr : 2.8.3
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None

@andreas-schwab andreas-schwab added the Bug and Needs Triage labels Sep 17, 2022
@phofl
Member

phofl commented Sep 17, 2022

Hi, thanks for your report. This works on 1.4.3 and main for me. Can you please recheck that you have pandas 1.4.3 installed?

@phofl phofl added the Needs Info label and removed the Bug and Needs Triage labels Sep 17, 2022
@andreas-schwab
Contributor Author

Of course I have, why do you ask?

@phofl
Member

phofl commented Sep 17, 2022

Because it works for me on 1.4.3, 1.4.4 and main.

@andreas-schwab
Contributor Author

How did you test that?

@phofl
Member

phofl commented Sep 17, 2022

I executed the code snippet from your post.

@andreas-schwab
Contributor Author

Where did you execute it? In qemu or on real hardware?

@phofl
Member

phofl commented Sep 17, 2022

macOS

@phofl
Member

phofl commented Sep 17, 2022

I executed it on macOS.

@andreas-schwab
Contributor Author

What hardware?

@andreas-schwab
Contributor Author

There is macOS for RISC-V???

@andreas-schwab
Contributor Author

Did you actually read the report?

@andreas-schwab
Contributor Author

Please use gcc92.fsffrance.org for your tests if you don't have your own hardware.

@phofl
Member

phofl commented Sep 17, 2022

There is a code snippet, nothing else. I read the versions, but if you think that this is specific to your hardware, a small explanation would have been nice. Your report reads like a general issue, which is not the case.

Could you adjust your issue title and add a small explanation?

You could also try debugging it yourself if you are interested.

@andreas-schwab
Contributor Author

You can also use one of the images in https://download.opensuse.org/ports/riscv/tumbleweed/images/ with qemu.

@andreas-schwab
Contributor Author

This is a general issue because it depends on undefined behaviour (converting a NaN value to an integer).

@phofl
Member

phofl commented Sep 17, 2022

It works on my machine, so it seems to be hardware dependent.

@andreas-schwab
Contributor Author

Depending on undefined behaviour is a bug.

@andreas-schwab andreas-schwab changed the title from "BUG: pandas.to_datetime fails to handle numpy.nan" to "BUG: pandas.to_datetime fails to handle numpy.nan due to dependency on undefined behaviour" Sep 17, 2022
@andreas-schwab
Contributor Author

$ python3 -c $'import numpy\nprint(numpy.asarray(numpy.nan).astype("i8"))'
9223372036854775807
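The value printed above is the largest int64. NumPy represents NaT as the smallest int64, so a platform whose NaN cast happens to return the smallest int64 would produce NaT's bit pattern by coincidence. Whether that explains why the example passes elsewhere is an assumption; the thread does not confirm it. A small probe, with the hypothetical name `classify_nan_cast`, shows which result a given machine produces:

```python
import numpy as np

INT64_MIN = np.iinfo(np.int64).min
INT64_MAX = np.iinfo(np.int64).max


def classify_nan_cast():
    """Hypothetical probe: report what a NaN -> int64 cast yields here."""
    # Some numpy versions emit a RuntimeWarning for this cast; silence it.
    with np.errstate(invalid="ignore"):
        value = int(np.asarray(np.nan).astype("i8"))
    if value == INT64_MIN:
        return "INT64_MIN"
    if value == INT64_MAX:
        return "INT64_MAX"
    return "other"


print(classify_nan_cast())
```

The result is platform dependent, which is exactly why code should not rely on it.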

@jreback
Contributor

jreback commented Sep 17, 2022

@andreas-schwab we don't support this hardware in any way.

You can submit a patch if you can find the problem.

@andreas-schwab
Contributor Author

This has nothing to do with hardware support. This is undefined behaviour. Depending on undefined behaviour is a serious bug.

@phofl
Member

phofl commented Sep 17, 2022

You are welcome to submit a PR if you can identify the bug and provide a fix.

@MarcoGorelli
Member

@andreas-schwab could you please clarify? What commits is this diff between, what's it meant to show? I've formatted your code to make it easier to read, but - apologies for not understanding - I still don't see your point. Could you clarify what exactly you're expecting pandas to do?

Did you actually read the report?

Please be respectful.

@andreas-schwab
Contributor Author

andreas-schwab commented Sep 18, 2022 via email

@MarcoGorelli MarcoGorelli added the Closing Candidate label Sep 18, 2022
@phofl phofl changed the title from "BUG: pandas.to_datetime fails to handle numpy.nan due to dependency on undefined behaviour" to "BUG: pandas.to_datetime fails to handle numpy.nan on riscv64" Sep 18, 2022
@phofl
Member

phofl commented Sep 18, 2022

Could you please update the top post with steps to reproduce the bug on Windows/Ubuntu/macOS? This will help someone who wants to work on this. We have tests covering this case in the CI, so simply executing the code snippet won't be sufficient.

Additionally, it would be great if you could add an explanation of what you mean by undefined behaviour; this is not clear to me. Some context for the code snippet you posted earlier would also be helpful.

@andreas-schwab andreas-schwab changed the title from "BUG: pandas.to_datetime fails to handle numpy.nan on riscv64" to "BUG: pandas.to_datetime fails to handle numpy.nan on riscv64 due to dependency on undefined behaviour" Sep 18, 2022
@andreas-schwab
Contributor Author

Converting a floating-point value to an integer type is undefined behaviour in C when the value is out of range for that integer type; see section 6.3.1.4, "Real floating and integer", of the C standard.

@andreas-schwab
Contributor Author

You can use gcc92.fsffrance.org for your tests if you don't have your own hardware, or use qemu with the images from https://download.opensuse.org/ports/riscv/tumbleweed/images/.

@phofl
Member

phofl commented Sep 18, 2022

I copied it into the top post in case someone wants to work on this.

@phofl phofl added the Bug and Datetime labels and removed the Needs Info and Closing Candidate labels Sep 18, 2022
4 participants