
get_group(...) fails for groupby(...) based on a function #22257

Closed
Kimonode opened this issue Aug 9, 2018 · 4 comments · Fixed by #55926
Labels: Bug, good first issue, Groupby, MultiIndex, Needs Tests (unit test(s) needed to prevent regressions), Regression (functionality that used to work in a prior pandas version)

Comments

@Kimonode

Kimonode commented Aug 9, 2018

Code Sample, a copy-pastable example if possible

"""
IMPORTANT: The issue happens with pandas==0.21.1 but does NOT happen with pandas==0.19
"""
import pandas as pd

# Define a simple DataFrame, taken from the pandas docs ;)
# http://pandas.pydata.org/pandas-docs/stable/groupby.html#groupby-sorting
df3 = pd.DataFrame({'X' : ['A', 'B', 'A', 'B'], 'Y' : [1, 4, 3, 2]})

# Define a simple function for grouping
def func(df, index):
    return (df['X'].loc[index], df['Y'].loc[index] % 2)

# Make sure the groupby worked, and check the group names
for name, df in df3.groupby(by=lambda index: func(df3, index), sort=False):
    print(name)
    print(df)

# Execute the code that triggers an error
df3.groupby(by=lambda index: func(df3, index), sort=False).get_group(('A', 1))

Problem description

The last line of the code sample (the get_group(('A', 1)) call) raises: NotImplementedError: isna is not defined for MultiIndex
Note: The error occurs with pandas==0.21.1. It does NOT occur with pandas==0.19.

Expected Output

df3.groupby(...).get_group(('A', 1)) should not raise an error but return the group ('A', 1):

('A', 1)
   X  Y
0  A  1
2  A  3
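For reference, the expected group can be rebuilt without groupby at all, which makes it easy to confirm which rows ('A', 1) should contain. This is only a sketch: the boolean mask is an illustration, not part of the original report, and it assumes the same df3 as in the code sample.

```python
import pandas as pd

df3 = pd.DataFrame({'X': ['A', 'B', 'A', 'B'], 'Y': [1, 4, 3, 2]})

# Rows whose key would be ('A', 1): X equals 'A' and Y is odd (Y % 2 == 1).
expected = df3[(df3['X'] == 'A') & (df3['Y'] % 2 == 1)]
print(expected)
```

The selected rows are the ones at index 0 and 2, matching the output shown above.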

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-30-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8

pandas: 0.21.1
pytest: None
pip: 9.0.1
setuptools: 39.0.1
Cython: None
numpy: 1.13.3
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.5.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@WillAyd
Member

WillAyd commented Aug 9, 2018

The docs are rather ambiguous on MultiIndex support but I think this makes sense. Investigation and PRs are always welcome.

cc @toobaz

@WillAyd WillAyd added this to the Contributions Welcome milestone Aug 9, 2018
@toobaz toobaz added the Regression Functionality that used to work in a prior pandas version label Aug 10, 2018
toobaz added a commit to toobaz/pandas that referenced this issue Aug 11, 2018
toobaz added a commit to toobaz/pandas that referenced this issue Aug 11, 2018
toobaz added a commit to toobaz/pandas that referenced this issue Aug 12, 2018
@Kimonode
Author

Thanks for the fix!

@mroeschke mroeschke added the Bug label May 11, 2020
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@rhshadrach rhshadrach added Needs Tests Unit test(s) needed to prevent regressions good first issue labels Jul 15, 2023
@rhshadrach
Member

I'm seeing the correct behavior on main. I think the test from toobaz@9123f7d could be added to resolve this issue.
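A regression test along these lines might look like the sketch below. It is not the test from the referenced commit (its contents are not reproduced in this thread); the test name and the `key` helper are hypothetical, and the sketch assumes a recent pandas release where the behavior is already correct.

```python
import pandas as pd
import pandas.testing as tm


def test_get_group_with_function_returning_tuples():
    # Hypothetical test name; the data mirrors the example in the report.
    df = pd.DataFrame({'X': ['A', 'B', 'A', 'B'], 'Y': [1, 4, 3, 2]})

    def key(index):
        # Called once per index label; returns a tuple, so group names are tuples.
        return (df.loc[index, 'X'], df.loc[index, 'Y'] % 2)

    result = df.groupby(by=key, sort=False).get_group(('A', 1))

    # Rows 0 and 2 are the ones with X == 'A' and odd Y.
    expected = df.loc[[0, 2]]
    tm.assert_frame_equal(result, expected)
```

Run under pytest, this should fail on the affected version (0.21.1, per the report) and pass where the regression is fixed.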

@kvn4
Contributor

kvn4 commented Aug 17, 2023

take

6 participants