KeyError: 0 error on groupby apply #30731
Comments
Please reformat the top to include only a minimal reproducible example; your aggfunc is not defined. If you have commentary on the causes, put it in another comment or clearly delineate it from the top. If everyone has to read the entire top section to grok the problem, the likelihood of a response will be greatly decreased.
@venatir your code snippet works for me on pandas 0.25.3.
Closing as this works on v1.0.1 too, but please feel free to reopen if you have a failing reproducible example.
The example below reproduces the error in 0.25.3. The bug occurs when:
import datetime

import pandas as pd


def aggfunc(df):
    # Returns a Series whose index labels (12, 13) are not the default 0, 1
    return pd.Series([0.2, 0.2], index=[12, 13])


df = pd.DataFrame({
    'a': datetime.datetime.today(),
    'b': [1, 2],
    'c': [5, 6],
})

df.drop(columns='a').groupby('b').apply(aggfunc)  # works as expected
df.groupby('b').apply(aggfunc)  # KeyError: 0
Code Sample, a copy-pastable example if possible
Looks like groupby.apply crashes when using datetime aggregation and returning non-datetime data.
The problem is here:
pandas.core.groupby.generic._recast_datetimelike_result
/pandas/core/groupby/generic.py:1857
E.g. my result columns are 12 and 13, but this function tries to iterate through 0 and 1, which is what the range produces.
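To illustrate the kind of mismatch described above, here is a minimal sketch. It is not the pandas source; `dtypes`, `lookup_by_label` and `lookup_by_position` are hypothetical names used only for this example. It shows that using the integers from `range` as labels fails when the labels are 12 and 13, while positional access with `.iloc` works.

```python
import pandas as pd

# Hypothetical stand-in for per-column metadata of a result
# whose columns are labelled 12 and 13 (as in the report).
dtypes = pd.Series(["float64", "float64"], index=[12, 13])


def lookup_by_label(series, n_items):
    # Treats 0..n-1 as labels; raises KeyError when those labels do not exist.
    return [series[i] for i in range(n_items)]


def lookup_by_position(series, n_items):
    # Treats 0..n-1 as positions; independent of the index labels.
    return [series.iloc[i] for i in range(n_items)]


try:
    lookup_by_label(dtypes, len(dtypes))
    label_error = None
except KeyError as exc:
    label_error = exc  # label 0 is not in the index [12, 13]

positional = lookup_by_position(dtypes, len(dtypes))
```

Whether this is exactly what happens inside `_recast_datetimelike_result` is an assumption based on the description above; the sketch only demonstrates why a range-based lookup can raise `KeyError: 0` on such an index.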
The code at /pandas/core/groupby/generic.py:1857 will fail with the above, and the exception will be caught at pandas/core/groupby/groupby.py:726. Because of gh-20949, it then tries again without the grouping key. It should have worked from the beginning, and that exception handler is not there to catch this kind of error.
The workaround is to return a Series or DataFrame with the index reset; however, this should not be a requirement.
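A minimal sketch of the workaround mentioned above, assuming the same frame as in the reproducer. `aggfunc_reset` is a hypothetical name for this example. Note that resetting the index discards the labels 12 and 13, which is why it is only a workaround.

```python
import datetime

import pandas as pd


def aggfunc_reset(group):
    # Same values as aggfunc in the report, but the index [12, 13]
    # is replaced by the default 0, 1 before returning.
    return pd.Series([0.2, 0.2], index=[12, 13]).reset_index(drop=True)


df = pd.DataFrame({
    "a": datetime.datetime.today(),
    "b": [1, 2],
    "c": [5, 6],
})

# Newer pandas versions may emit a deprecation warning about
# apply operating on the grouping columns.
result = df.groupby("b").apply(aggfunc_reset)
```

The resulting columns are 0 and 1 instead of 12 and 13, so the original labels would have to be restored afterwards if they matter.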
The right way is to not use range in the _recast_datetimelike_result function.
Thank you
Output of
pd.show_versions()