BUG: df.groupby().resample()[[cols]] without key columns raise KeyError #47605
The reason why the current result only returns one column is pandas/pandas/core/resample.py Line 1266 in a9a496c
pandas/pandas/core/groupby/groupby.py Line 1662 in a9a496c
I think we should allow aggregation on datetime-like columns when the corresponding key(s) are included, but this would change the current groupby behavior (and I'm not sure which one is desired). This may also be related to #47177; I'm not clear about the logic.
>>> df = pd.DataFrame(
        data={
            "date": pd.date_range(start="2016-01-01", periods=8),
            "group": [0, 0, 0, 0, 1, 1, 1, 1],
        }
    )
>>> df.groupby("group")[["date", "group"]].mean()
# original
       group
group
0        0.0
1        1.0
# changed
                     date  group
group
0     2016-01-02 12:00:00    0.0
1     2016-01-06 12:00:00    0.0
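As a minimal sketch of what the "changed" output above computes for the `date` column, the per-group mean of a datetime column can be obtained explicitly by applying `Series.mean` group by group. This is only an illustration of the expected values, not the code path touched by the PR; the name `date_means` is made up for the example.

```python
import pandas as pd

df = pd.DataFrame(
    data={
        "date": pd.date_range(start="2016-01-01", periods=8),
        "group": [0, 0, 0, 0, 1, 1, 1, 1],
    }
)

# Series.mean is defined for datetime64 data, so apply it to each group.
# `date_means` is a hypothetical name used only in this sketch.
date_means = df.groupby("group")["date"].apply(lambda s: s.mean())
print(date_means)
```

The two timestamps printed here should match the `date` column of the "changed" output, which is the value the comment argues should not be silently dropped.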
I found that a simple solution is to keep the resample pandas/pandas/core/resample.py Line 169 in 9612375
But this fails the following test:
I personally think the test is not reasonable, since we do want to get results for all valid aggregation columns; the expected result should therefore have a shape of (6, 4) instead of (6, 3). And as mentioned in the above PR, the default behavior will be changed in the future. I'm a little confused. Discussion needed.
Since now we have DEPR the
>>> df = pd.DataFrame({"a": [0, 0, 1, 1], "b": pd.date_range("20200101", "20200104")})
>>> df.groupby("a").resample("D", on="b")[["a", "b"]].mean()
# should raise a deprecation warning but does not
                a
a b
0 2020-01-01  0.0
  2020-01-02  0.0
1 2020-01-03  1.0
  2020-01-04  1.0
>>> df.groupby("a").resample("D", on="b")[["a", "b"]].mean(numeric_only=False)
# should include the "b" column aggregation result but does not
                a
a b
0 2020-01-01  0.0
  2020-01-02  0.0
1 2020-01-03  1.0
  2020-01-04  1.0
@GYHHAHA #47605 (comment) is fine to do, but in another PR.
For the current BUG, I think the fix is enough. I cannot add the test for
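For reference, the case named in the PR title (selecting columns after `groupby().resample()` without including the key columns) can be sketched as below. The `val` column and its values are assumptions added for illustration, and the sketch assumes a pandas version that includes this fix (the whatsnew entry targets v1.5.0); the `try`/`except` is there only so the snippet also runs on older versions where the `KeyError` is expected.

```python
import pandas as pd

df = pd.DataFrame(
    data={
        "date": pd.date_range(start="2016-01-01", periods=8),
        "group": [0, 0, 0, 0, 1, 1, 1, 1],
        "val": [1, 7, 5, 2, 3, 10, 5, 1],  # hypothetical value column
    }
)

try:
    # Select only "val": neither the groupby key ("group") nor the
    # resample key ("date") is part of the column selection.
    result = df.groupby("group").resample("2D", on="date")[["val"]].mean()
except KeyError as err:
    # Expected on pandas versions that predate the fix discussed here.
    result = None
    print(f"KeyError: {err}")
else:
    print(result)
```

With the fix in place, the result should have a single `val` column indexed by `(group, date)`.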
The current fix looks reasonable cc @rhshadrach
lgtm
Thanks @GYHHAHA
…or (pandas-dev#47605) * Update resample.py * Update v1.5.0.rst * Update test_resampler_grouper.py * delete blank * Update test_resampler_grouper.py * Update v1.5.0.rst * Update resample.py
df.groupby().resample()[[cols]] without key columns raise KeyError #47362
doc/source/whatsnew/v1.5.0.rst file if fixing a bug or adding a new feature.