Skip to content

REGR: MultiIndex level names RuntimeError in groupby.apply #31068

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jorisvandenbossche opened this issue Jan 16, 2020 · 0 comments · Fixed by #31133
Closed

REGR: MultiIndex level names RuntimeError in groupby.apply #31068

jorisvandenbossche opened this issue Jan 16, 2020 · 0 comments · Fixed by #31133
Labels
Groupby MultiIndex Regression Functionality that used to work in a prior pandas version
Milestone

Comments

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Jan 16, 2020

df = pd.DataFrame({
    'A': np.arange(10), 'B': [1, 2] * 5, 
    'C': np.random.rand(10), 'D': np.random.rand(10)}
).set_index(['A', 'B'])  
df.groupby('B').apply(lambda x: x.sum())

On master this gives an error:

In [40]: df.groupby('B').apply(lambda x: x.sum())
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-40-75bc1ff12251> in <module>
----> 1 df.groupby('B').apply(lambda x: x.sum())

~/scipy/pandas/pandas/core/groupby/groupby.py in apply(self, func, *args, **kwargs)
    733         with option_context("mode.chained_assignment", None):
    734             try:
--> 735                 result = self._python_apply_general(f)
    736             except TypeError:
    737                 # gh-20949

~/scipy/pandas/pandas/core/groupby/groupby.py in _python_apply_general(self, f)
    752 
    753         return self._wrap_applied_output(
--> 754             keys, values, not_indexed_same=mutated or self.mutated
    755         )
    756 

~/scipy/pandas/pandas/core/groupby/generic.py in _wrap_applied_output(self, keys, values, not_indexed_same)
   1200                 if len(keys) == ping.ngroups:
   1201                     key_index = ping.group_index
-> 1202                     key_index.name = key_names[0]
   1203 
   1204                     key_lookup = Index(keys)

~/scipy/pandas/pandas/core/indexes/base.py in name(self, value)
   1171             # Used in MultiIndex.levels to avoid silently ignoring name updates.
   1172             raise RuntimeError(
-> 1173                 "Cannot set name on a level of a MultiIndex. Use "
   1174                 "'MultiIndex.set_names' instead."
   1175             )

RuntimeError: Cannot set name on a level of a MultiIndex. Use 'MultiIndex.set_names' instead.

On 0.25.3 this works:

In [10]:  df.groupby('B').apply(lambda x: x.sum()) 
Out[10]: 
          C         D
B                    
1  2.761792  3.963817
2  1.040950  3.578762

It seems the additional MultiIndex level that is not used to group (['A', 'B'] are index levels, but only grouping by 'B').

@jorisvandenbossche jorisvandenbossche added Groupby Regression Functionality that used to work in a prior pandas version MultiIndex labels Jan 16, 2020
@jorisvandenbossche jorisvandenbossche added this to the 1.0.0 milestone Jan 16, 2020
TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Jan 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Groupby MultiIndex Regression Functionality that used to work in a prior pandas version
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant