pandas.core.groupby.GroupBy.apply fails #20949
Comments
Thanks for the bug report.
Hmm, interesting. FWIW, when I remove numexpr I can't get this to run at all, regardless of whether or not I run another agg function first.
Numexpr may be a red herring. From what I can tell the problem occurs at the following line of code: pandas/pandas/core/groupby/groupby.py Line 5063 in ef019fa
For agg functions like cc @jreback for any insight
Here's another example that fails with 0.23rc2 (and in 0.22.0 as well), based on code from
However, if you do the following, it works:
So doing one operation (in this case
@Dr-Irv seems related. Some code below illustrating what I think is going on:
>>> grouped.apply(lambda x: x.iloc[0])[0]  # KeyError as indicator
KeyError
>>> grouped._set_group_selection()
>>> grouped.apply(lambda x: x.iloc[0])[0]  # Works now, as 'A' was not part of data
Timestamp('2016-01-01 12:00:00-0800', tz='US/Pacific')
>>> grouped._reset_group_selection()  # Clear out the group selection
>>> grouped.apply(lambda x: x.iloc[0])[0]  # Back to failing
KeyError
Unfortunately just adding this call before
This didn't work even in 0.20.3. Not sure how we don't have a test for it, though.
@Dr-Irv your example is a separate issue. Please make a new report for that one.
…pes and the user supplied function can fail on the grouping column closes pandas-dev#20949
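The commit message above points at the user-supplied function failing on the grouping column when dtypes are mixed. As a rough illustration only (this is not the reporter's code; the frame and the column names `A`, `B`, `C` are hypothetical), one way to keep the function away from a string grouping column is to select the numeric columns explicitly before calling `apply`:

```python
import pandas as pd

# Hypothetical mixed-dtype frame: "A" holds strings, "B" and "C" hold numbers.
df = pd.DataFrame({"A": ["a", "a", "b"], "B": [1, 2, 3], "C": [4, 6, 5]})

# Select the numeric columns first so the lambda only ever receives "B" and "C".
# group_keys=False keeps the original row index on the result.
grouped = df.groupby("A", group_keys=False)[["B", "C"]]

# Normalize each value by its group's column sum.
normalized = grouped.apply(lambda x: x / x.sum())
print(normalized)
```

With the selection in place, the result does not depend on whether another aggregation was run on the grouped object beforehand.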
Code Sample:
Problem description
Applying a function to a grouped data frame fails. The code above is the example code from the official pandas documentation: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.GroupBy.apply.html
Output to the above code:
The error can be 'fixed' by applying another command to the grouped object first:
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-122-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.utf8
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.22.0
pytest: 2.8.7
pip: 9.0.1
setuptools: 20.7.0
Cython: 0.23.4
numpy: 1.11.0
scipy: 0.17.0
pyarrow: None
xarray: None
IPython: 5.5.0
sphinx: None
patsy: 0.4.1
dateutil: 2.4.2
pytz: 2014.10
blosc: None
bottleneck: None
tables: 3.2.2
numexpr: 2.4.3
feather: None
matplotlib: 1.5.1
openpyxl: 2.3.0
xlrd: 0.9.4
xlwt: 0.7.5
xlsxwriter: None
lxml: 3.5.0
bs4: None
html5lib: 1.0.1
sqlalchemy: 1.0.11
pymysql: 0.7.2.None
psycopg2: 2.6.1 (dt dec mx pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None