ENH: Make Timestamp implementation bounds match DTA/DTI/Series #39245
jbrockmendel
commented
Jan 17, 2021
edited by jreback
- xref DatetimeIndex and Timestamp have different implementation limits #24124
- tests added / passed
- Ensure all linting tests pass, see here for how to run them
- whatsnew entry
can you add the example from the OP as well, the pd.date_range
looks like that example now fails for a different reason that will need to be fixed within pd.date_range
ok will remove the closes then
@jreback @jbrockmendel I think this change introduced a minor regression. A simple shell session to demonstrate:
➜ ~ cat tslib_bug_demo.py
import pandas as pd
min_timestamp = pd.Timestamp.min
min_pydatetime = min_timestamp.to_pydatetime()
min_datetime = pd.to_datetime(min_pydatetime)
print(min_datetime)
➜ ~ pyenv activate pandas_1_2_5
(pandas_1_2_5) ➜ ~ python --version
Python 3.9.10
(pandas_1_2_5) ➜ ~ pip list | grep pandas
pandas 1.2.5
(pandas_1_2_5) ➜ ~ python tslib_bug_demo.py
1677-09-21 00:12:43.145225
(pandas_1_2_5) ➜ ~ pyenv activate pandas_1_3_0
(pandas_1_3_0) ➜ ~ python --version
Python 3.9.10
(pandas_1_3_0) ➜ ~ pip list | grep pandas
pandas 1.3.0
(pandas_1_3_0) ➜ ~ python tslib_bug_demo.py
sys:1: UserWarning: Discarding nonzero nanoseconds in conversion
Traceback (most recent call last):
File "/Users/milotoor/.pyenv/versions/pandas_1_3_0/lib/python3.9/site-packages/pandas/core/arrays/datetimes.py", line 2187, in objects_to_datetime64ns
values, tz_parsed = conversion.datetime_to_datetime64(data.ravel("K"))
File "pandas/_libs/tslibs/conversion.pyx", line 357, in pandas._libs.tslibs.conversion.datetime_to_datetime64
File "pandas/_libs/tslibs/np_datetime.pyx", line 120, in pandas._libs.tslibs.np_datetime.check_dts_bounds
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1677-09-21 00:12:43
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/milotoor/tslib_bug_demo.py", line 5, in <module>
min_datetime = pd.to_datetime(min_pydatetime)
File "/Users/milotoor/.pyenv/versions/pandas_1_3_0/lib/python3.9/site-packages/pandas/core/tools/datetimes.py", line 914, in to_datetime
result = convert_listlike(np.array([arg]), format)[0]
File "/Users/milotoor/.pyenv/versions/pandas_1_3_0/lib/python3.9/site-packages/pandas/core/tools/datetimes.py", line 401, in _convert_listlike_datetimes
result, tz_parsed = objects_to_datetime64ns(
File "/Users/milotoor/.pyenv/versions/pandas_1_3_0/lib/python3.9/site-packages/pandas/core/arrays/datetimes.py", line 2193, in objects_to_datetime64ns
raise err
File "/Users/milotoor/.pyenv/versions/pandas_1_3_0/lib/python3.9/site-packages/pandas/core/arrays/datetimes.py", line 2175, in objects_to_datetime64ns
result, tz_parsed = tslib.array_to_datetime(
File "pandas/_libs/tslib.pyx", line 379, in pandas._libs.tslib.array_to_datetime
File "pandas/_libs/tslib.pyx", line 606, in pandas._libs.tslib.array_to_datetime
File "pandas/_libs/tslib.pyx", line 602, in pandas._libs.tslib.array_to_datetime
File "pandas/_libs/tslib.pyx", line 474, in pandas._libs.tslib.array_to_datetime
File "pandas/_libs/tslibs/np_datetime.pyx", line 120, in pandas._libs.tslibs.np_datetime.check_dts_bounds
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1677-09-21 00:12:43
Going from The
➜ pandas git:(main) git df v1.2.5 v1.3.0 -- pandas/_libs/tslibs/src
diff --git a/pandas/_libs/tslibs/src/datetime/np_datetime.c b/pandas/_libs/tslibs/src/datetime/np_datetime.c
index 8eb995dee6..9ad2ead5f9 100644
--- a/pandas/_libs/tslibs/src/datetime/np_datetime.c
+++ b/pandas/_libs/tslibs/src/datetime/np_datetime.c
@@ -32,7 +32,7 @@ This file is derived from NumPy 1.7. See NUMPY_LICENSE.txt
#endif // PyInt_AsLong
const npy_datetimestruct _NS_MIN_DTS = {
- 1677, 9, 21, 0, 12, 43, 145225, 0, 0};
+ 1677, 9, 21, 0, 12, 43, 145224, 193000, 0};
const npy_datetimestruct _NS_MAX_DTS = {
2262, 4, 11, 23, 47, 16, 854775, 807000, 0};
This feels like a bug to me, but perhaps it's expected behavior? I can confirm that v1.4.1 also raises the same exception. I'm happy to open an actual issue if you think this merits it. Your sagacious opinions would be greatly appreciated, and thank you for your contributions to an amazing tool.
The new lower bound carries a nonzero nanosecond component, and .to_pydatetime drops the nanosecond portion of the Timestamp (datetime.datetime only has microsecond resolution), so you should not expect to round-trip losslessly. The behavior you are describing is expected.
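The arithmetic behind this can be checked without pandas. The sketch below uses only the standard library and assumes the lower bound is the smallest int64 nanosecond count other than the value reserved for NaT; the names `MIN_NS`, `truncated`, and `fits` are illustrative and not part of pandas.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MIN = -2**63

# Assumption: INT64_MIN itself is reserved for NaT, so the smallest
# representable nanosecond timestamp is one tick above it.
MIN_NS = INT64_MIN + 1

# Split the bound into whole microseconds and leftover nanoseconds.
# Floor division matches dropping the nanosecond field.
micros, nanos = divmod(MIN_NS, 1000)

# What a microsecond-resolution datetime can hold after truncation.
truncated = EPOCH + timedelta(microseconds=micros)

# Converting the truncated value back to nanoseconds lands below the bound.
truncated_ns = micros * 1000
fits = truncated_ns >= MIN_NS

print(nanos)      # leftover nanoseconds that a datetime cannot store
print(truncated)  # microsecond-resolution value after truncation
print(fits)       # whether the truncated value is still within bounds
```

Because the truncated value sits 193 ns below the bound, converting it back to a nanosecond timestamp overflows, which is consistent with the OutOfBoundsDatetime in the traceback above. The 1.2.5 struct used 145225 microseconds with no nanosecond part, which is one microsecond above the truncated value and therefore within range.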