BUG: can't resample with non-nano dateindex, out-of-nanosecond-bounds #51274
pandas/core/resample.py
Outdated
# TODO is there anything which can be reused here?
freq_value = freq.nanos
if unit == "us":
    freq_value = freq_value // 1_000
elif unit == "ms":
    freq_value = freq_value // 1_000_000
elif unit == "s":
    freq_value = freq_value // 1_000_000_000
@jbrockmendel is there any existing function which can be reused here? There is periods_per_second, but that takes npydatetime_unit rather than str (is there a function to convert between them?). I didn't find one, but I'll look more carefully later.
abbrev_to_npy_unit
In tzconversion there are two places where we do approximately this; one of them has a comment to de-duplicate.
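For illustration, the if/elif chain under discussion can be collapsed into a single lookup. This is only a sketch in plain Python: `NANOS_PER_UNIT` and `freq_value_in_unit` are hypothetical names made up here, not existing pandas functions, and the input is assumed to be a frequency already expressed in nanoseconds.

```python
# Hypothetical table (not part of pandas): nanoseconds per one tick of each unit.
NANOS_PER_UNIT = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
}


def freq_value_in_unit(freq_nanos: int, unit: str) -> int:
    """Convert a frequency given in nanoseconds to an integer count of `unit`."""
    try:
        divisor = NANOS_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"unsupported unit: {unit!r}") from None
    # Floor division mirrors the `//` used in the diff above.
    return freq_nanos // divisor


two_hours_ns = 2 * 60 * 60 * 1_000_000_000
print(freq_value_in_unit(two_hours_ns, "s"))
print(freq_value_in_unit(two_hours_ns, "ms"))
```

A table keeps the supported units in one place and makes an unsupported unit fail loudly instead of silently leaving the value in nanoseconds.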
pandas/core/resample.py
Outdated
offset = offset.as_unit("ns")
offset = offset.as_unit(unit)

freq_value = freq.nanos // (
Would it be simpler to do Timedelta(freq).as_unit(unit)._value?
(i think freq.nanos is a pattern we need to move away from regardless since there is a risk of overflow)
good one, thanks!
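The overflow concern can be shown with plain integers. The sketch below uses only the standard library and assumes the duration is held in a signed 64-bit integer; `fits_in_int64` and the 300-year span are illustrative choices, not taken from the PR.

```python
from datetime import timedelta

INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)


def fits_in_int64(value: int) -> bool:
    """Return True if `value` is representable as a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


# Roughly 300 years, ignoring leap days (illustrative value only).
span = timedelta(days=300 * 365)

# Converting directly to whole seconds stays well inside int64 ...
span_seconds = span // timedelta(seconds=1)
# ... while the same span counted in nanoseconds does not.
span_nanos = span_seconds * 1_000_000_000

print(fits_in_int64(span_seconds))
print(fits_in_int64(span_nanos))
```

Computing the value directly in the target unit avoids the intermediate nanosecond count, which is the step that can exceed the 64-bit range.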
One comment on the non-test code, otherwise looks good. I don't know the resample code well enough to know on sight if the test "expected" is correct, but trust you.
Thanks for your review @jbrockmendel! Regarding the expected value: it matches what pandas would currently give for nanosecond resolution:
In [33]: idx = date_range("1983-01-01", "2000-01-01", freq="Y")
    ...: ser = Series([1, 4, 2, 8, 5, 7, 1, 4, 2, 8, 5, 7, 1, 4, 2, 8, 5], index=idx)
    ...: ser.resample("2Y").mean()
Out[33]:
1983-12-31    1.0
1985-12-31    3.0
1987-12-31    6.5
1989-12-31    4.0
1991-12-31    3.0
1993-12-31    6.5
1995-12-31    4.0
1997-12-31    3.0
1999-12-31    6.5
Freq: 2A-DEC, dtype: float64
So if that's right, then this is right. OK to merge?
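The expected means can also be cross-checked by hand without pandas. This sketch assumes the two-year bins are closed and labelled on their right edge, with the first label at 1983; the variable names are illustrative only.

```python
from collections import defaultdict
from statistics import mean

values = [1, 4, 2, 8, 5, 7, 1, 4, 2, 8, 5, 7, 1, 4, 2, 8, 5]
years = range(1983, 2000)  # one year-end observation per year, 1983..1999

first_label = 1983  # right edge of the first bin (assumption of this sketch)
step = 2            # bin width in years

bins = defaultdict(list)
for year, value in zip(years, values):
    # Right-closed, right-labelled bins: each year falls in (label - step, label].
    bins_ahead = -(-(year - first_label) // step)  # ceiling division
    label = first_label + bins_ahead * step
    bins[label].append(value)

expected = {label: mean(vals) for label, vals in sorted(bins.items())}
for label, avg in expected.items():
    print(f"{label}-12-31  {float(avg)}")
```

Under that binning assumption the first bin holds only the 1983 value, and every later bin holds two consecutive values, which reproduces the sequence 1.0, 3.0, 6.5, 4.0, ... shown in the session above.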
LGTM
Thanks @MarcoGorelli