-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG - anchoring dates for resample with Day(n>1) #24159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 6 commits
3a552a3
ecdd31c
be07f5b
9b63c38
2b7f935
4967c95
1d4ac73
37252a0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1463,3 +1463,24 @@ def f(data, add_arg): | |
result = df.groupby("A").resample("D").agg(f, multiplier) | ||
expected = df.groupby("A").resample('D').mean().multiply(multiplier) | ||
assert_frame_equal(result, expected) | ||
|
||
@pytest.mark.parametrize('n1, freq1, n2, freq2', [ | ||
(60, 'Min', 1, 'H'), | ||
(1440, 'Min', 1, 'D'), | ||
(24, 'H', 1, 'D'), | ||
(60, 'S', 1, 'Min'), | ||
(3600, 'S', 1, 'H'), | ||
(86400, 'S', 1, 'D') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you also add a frequency where the n=0.5? e.g. 0.5D and 12 H There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done. |
||
]) | ||
def test_resample_equivalent_offsets(self, n1, freq1, n2, freq2): | ||
for i in range(1, 2): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you can parametrize on this too |
||
n1_ = n1 * i | ||
n2_ = n2 * i | ||
s = pd.Series(0, index=pd.date_range('19910905 13:00', | ||
'19911005 07:00', | ||
freq=freq1)) | ||
s = s + range(len(s)) | ||
|
||
result1 = s.resample(str(n1_) + freq1).mean() | ||
result2 = s.resample(str(n2_) + freq2).mean() | ||
assert_series_equal(result1, result2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did this test have to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current behavior for Day(n>1) is inconsistent with all other offsets and their multiples, in that it is not getting anchored on the offset when being resampled. This was due to the reversal of the modulo op (which was valid for n=1, but not otherwise).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the result/expected before and the result/expected after this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before:
After: both match expected from above
The correct analogous behavior with hourly resample can be seen with the current master code
Notice that the daterange didn't start on the offset, but the result was anchored to the hour start
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Thanks