-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
resampling closed='left' incorrect ? #5440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Can you post the output you expect compared to what you actually get? Makes |
Might be worth looking at this issue as well: #5023 |
I think the expected behavior is:
Of course, in that case the default behavior should become See also my comment on issue #2665. |
This looks correct to me You have In [5]: ts1
Out[5]:
2000-03-29 1.0
2000-03-30 1.0
2000-03-31 1.0
2000-04-01 1.0
2000-04-02 1.0
2000-04-03 1.0 If you do
If you look at the values in each group you get:
As documented here, I think the OP's phrase
suggests they misunderstood what In [9]: ts1.resample('1MS', closed='left', label='left').sum()
Out[9]:
2000-03-01 3.0
2000-04-01 3.0
Freq: MS, dtype: float64 Closing then - please do let me know if I've misunderstood and I'll reopen |
I think you are technically speaking right that this is "correct", but it is also very confusing .. I suppose the main confusing part is that the month end/start labels are expressed in full days, and thus this is the start of the last day of the month. So when using closed="left", also the hours in the middle of the last day of a previous month are considered part of the next month, and not only the exact separating instant between both months (i.e. 2000-03-31 23:59:59.999.. / 2000-04-01 00:00:00). This contrasts to how for example the Day offset works, where the division between both periods is the exact instant. I think this is also basically what Stephan explains in #9586 Given the way that the MonthEnd frequency is defined as the start of the last day, I doubt that it ever makes sense to use this with closed="left". To avoid this confusion, maybe we could raise a warning/error when the user does this, pointing them to use |
Yes, I was thinking that - agree on raising (or at least warning), I'll open a new issue |
I think we can use #9586 for that, since it's basically closing the confusion that Stephan describes there? |
such old issues tend to not get any attention/activity, and the example doesn't run anymore as it's Python2 and a really old version of pandas furthermore, that issue sounds like it's more about renaming a confusing freq?
I've opened #51710 as warning/erroring is a different activity I'll have a think, but I agree something should be done as this is really confusing |
Hi,
Coming back to issue #2665, I'm still not convinced that the closed-argument of the resampling function does what it should.
When resampling daily to monthly values with closed = 'left', the closed argument should relate to the actual time the months change, but instead it seems to refer to the time 24 hours earlier.
The text was updated successfully, but these errors were encountered: