-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Fix a bug in 'timedelta_range' that produced an extra point on a edge case (fix #30353) #33498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ee4c095
to
499b905
Compare
499b905
to
8820776
Compare
266499a
to
cb9c023
Compare
ffe1caf
to
89a9af5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the refactor looks reasonable - @jbrockmendel any thoughts?
A few comments on the refactor, but i really like the idea of re-using generate_regular_range |
- Bug in :func:`timedelta_range` that produced an extra point on a edge case (:issue:`30353`, :issue:`33498`) | ||
- Bug in :meth:`DataFrame.resample` that produced an extra point on a edge case (:issue:`30353`, :issue:`13022`, :issue:`33498`) | ||
- Bug in :meth:`DataFrame.resample` that ignored the ``loffset`` argument when dealing with timedelta (:issue:`7687`, :issue:`33498`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope this is okay @WillAyd 🙂. I'll be happy to change this if you disagree with this changelog.
BTW, regarding loffset
, as mentioned earlier (https://github.com/pandas-dev/pandas/pull/33498/files#r407231964):
Useful because otherwise
tests/resample/test_base.py::test_resample_loffset_arg_type
was always failing.
However, this piece of code is questionable since #31809 will deprecateloffset
anyway.
if isinstance(expected.index, TimedeltaIndex): | ||
msg = "DataFrame are different" | ||
with pytest.raises(AssertionError, match=msg): | ||
tm.assert_frame_equal(result_agg, expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this is testing incorrect behavior? thats pretty weird
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel Yeah, that was the case. That was testing an incorrect behavior... 😕 And this was the case for two reasons:
loffset
support was missing in.resample
when dealing with timedelta. (loffset has no effect when passing in a numyp.timedelta64 #7687)- an extra bin (fixed with this PR) was present each time when dealing with timedelta. (BUG: resample with TimedeltaIndex, fenceposts are off #13022, Resampler sometimes adds extra bin with NaN #30353)
Here an example of the values when setting a breakpoint with PyCharm line 222 in pandas/tests/resample/test_base.py
on master:
The expected result (with loffset
and without an extra bin):
The actual result (without loffset
and with an extra bin):
I think that was the reason of the presence of this TODO:
pandas/pandas/tests/resample/test_base.py
Line 218 in 3d4f9dc
# GH 13022, 7687 - TODO: fix resample w/ TimedeltaIndex |
IMO, the presence of this if
was just to skip the tests of Timedeltas when resampling without skipping the tests of Timestamps and Periods parametrized with @all_ts
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm a couple of questions.
Thank you! 🙂 Let me know if you need anything else! |
a3b41bf
to
404ae11
Compare
404ae11
to
71438d6
Compare
thanks @hasB4K very nice! |
The issue from #30353 came actually from
timedelta_range
.The outputs from
mock_timedelta_range
and frompd.timedelta_range
are supposed to equivalent, but are not on pandas v1.0.0:Outputs without this PR:
Outputs with this PR:
It also solve an issue (that fail on master) related to this comment: #13022 (comment)
Outputs with this PR:
On the firsts commits I just fixed the issue in
_generate_regular_range
, then I decided to do a refactor and to use the same code that generate ranges incore/arrays/_ranges.py
fordate_range
andtimedelta_range
.Linked with #10887.
closes BUG: resample with TimedeltaIndex, fenceposts are off #13022
closes loffset has no effect when passing in a numyp.timedelta64 #7687 (for loffset when resampling Timedelta)
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff