nlargest of multiindex grouper throw type error: nlargest() got an unexpected keyword argument 'axis' #21411

lf-shaw · 2018-06-10T16:21:00Z

Code Sample

import numpy as np
import pandas as pd

dts = pd.date_range('20180101', periods=10)
iterables = [dts, ['one', 'two']]
idx = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
s = pd.Series(np.random.randn(20), index=idx)
s.groupby('first').nlargest(1)

Problem description

After upgrade to Anaconda 5.2.0, the sample code raises the following exception

TypeError                                 Traceback (most recent call last)
~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
    917             try:
--> 918                 result = self._python_apply_general(f)
    919             except Exception:

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _python_apply_general(self, f)
    935         keys, values, mutated = self.grouper.apply(f, self._selected_obj,
--> 936                                                    self.axis)
    937 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, f, data, axis)
   2272             group_axes = _get_axes(group)
-> 2273             res = f(group)
   2274             if not _is_indexed_like(res, group_axes):

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in curried_with_axis(x)
    819             def curried_with_axis(x):
--> 820                 return f(x, *args, **kwargs_with_axis)
    821 

TypeError: nlargest() got an unexpected keyword argument 'axis'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in wrapper(*args, **kwargs)
    834             try:
--> 835                 return self.apply(curried_with_axis)
    836             except Exception:

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
   3468     def apply(self, func, *args, **kwargs):
-> 3469         return super(SeriesGroupBy, self).apply(func, *args, **kwargs)
   3470 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
    929                 with _group_selection_context(self):
--> 930                     return self._python_apply_general(f)
    931 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _python_apply_general(self, f)
    935         keys, values, mutated = self.grouper.apply(f, self._selected_obj,
--> 936                                                    self.axis)
    937 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, f, data, axis)
   2272             group_axes = _get_axes(group)
-> 2273             res = f(group)
   2274             if not _is_indexed_like(res, group_axes):

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in curried_with_axis(x)
    819             def curried_with_axis(x):
--> 820                 return f(x, *args, **kwargs_with_axis)
    821 

TypeError: nlargest() got an unexpected keyword argument 'axis'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
    917             try:
--> 918                 result = self._python_apply_general(f)
    919             except Exception:

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _python_apply_general(self, f)
    940             values,
--> 941             not_indexed_same=mutated or self.mutated)
    942 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _wrap_applied_output(self, keys, values, not_indexed_same)
   3611             return self._concat_objects(keys, values,
-> 3612                                         not_indexed_same=not_indexed_same)
   3613         elif isinstance(values[0], DataFrame):

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _concat_objects(self, keys, values, not_indexed_same)
   1135                                 levels=group_levels, names=group_names,
-> 1136                                 sort=False)
   1137             else:

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    224                        verify_integrity=verify_integrity,
--> 225                        copy=copy, sort=sort)
    226     return op.get_result()

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in __init__(self, objs, axis, join, join_axes, keys, levels, names, ignore_index, verify_integrity, copy, sort)
    377 
--> 378         self.new_axes = self._get_new_axes()
    379 

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in _get_new_axes(self)
    457 
--> 458         new_axes[self.axis] = self._get_concat_axis()
    459         return new_axes

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in _get_concat_axis(self)
    513             concat_axis = _make_concat_multiindex(indexes, self.keys,
--> 514                                                   self.levels, self.names)
    515 

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in _make_concat_multiindex(indexes, keys, levels, names)
    594         return MultiIndex(levels=levels, labels=label_list, names=names,
--> 595                           verify_integrity=False)
    596 

~\Programs\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py in __new__(cls, levels, labels, sortorder, names, dtype, copy, name, verify_integrity, _set_identity)
    231             # handles name validation
--> 232             result._set_names(names)
    233 

~\Programs\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py in _set_names(self, names, level, validate)
    694                         'level {}, is already used for level '
--> 695                         '{}.'.format(name, l, used[name]))
    696 

ValueError: Duplicated level name: "first", assigned to level 1, is already used for level 0.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in wrapper(*args, **kwargs)
    837                 try:
--> 838                     return self.apply(curried)
    839                 except Exception:

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
   3468     def apply(self, func, *args, **kwargs):
-> 3469         return super(SeriesGroupBy, self).apply(func, *args, **kwargs)
   3470 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
    929                 with _group_selection_context(self):
--> 930                     return self._python_apply_general(f)
    931 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _python_apply_general(self, f)
    940             values,
--> 941             not_indexed_same=mutated or self.mutated)
    942 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _wrap_applied_output(self, keys, values, not_indexed_same)
   3611             return self._concat_objects(keys, values,
-> 3612                                         not_indexed_same=not_indexed_same)
   3613         elif isinstance(values[0], DataFrame):

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _concat_objects(self, keys, values, not_indexed_same)
   1135                                 levels=group_levels, names=group_names,
-> 1136                                 sort=False)
   1137             else:

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    224                        verify_integrity=verify_integrity,
--> 225                        copy=copy, sort=sort)
    226     return op.get_result()

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in __init__(self, objs, axis, join, join_axes, keys, levels, names, ignore_index, verify_integrity, copy, sort)
    377 
--> 378         self.new_axes = self._get_new_axes()
    379 

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in _get_new_axes(self)
    457 
--> 458         new_axes[self.axis] = self._get_concat_axis()
    459         return new_axes

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in _get_concat_axis(self)
    513             concat_axis = _make_concat_multiindex(indexes, self.keys,
--> 514                                                   self.levels, self.names)
    515 

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in _make_concat_multiindex(indexes, keys, levels, names)
    594         return MultiIndex(levels=levels, labels=label_list, names=names,
--> 595                           verify_integrity=False)
    596 

~\Programs\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py in __new__(cls, levels, labels, sortorder, names, dtype, copy, name, verify_integrity, _set_identity)
    231             # handles name validation
--> 232             result._set_names(names)
    233 

~\Programs\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py in _set_names(self, names, level, validate)
    694                         'level {}, is already used for level '
--> 695                         '{}.'.format(name, l, used[name]))
    696 

ValueError: Duplicated level name: "first", assigned to level 1, is already used for level 0.

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in wrapper(*args, **kwargs)
    847                     try:
--> 848                         return self._aggregate_item_by_item(name,
    849                                                             *args, **kwargs)

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in __getattr__(self, attr)
    764         raise AttributeError("%r object has no attribute %r" %
--> 765                              (type(self).__name__, attr))
    766 

AttributeError: 'SeriesGroupBy' object has no attribute '_aggregate_item_by_item'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-2-2e8f88f2b9dc> in <module>()
      6 idx = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
      7 s = pd.Series(np.random.randn(20), index=idx)
----> 8 s.groupby('first').nlargest(1)

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in wrapper(*args, **kwargs)
    849                                                             *args, **kwargs)
    850                     except (AttributeError):
--> 851                         raise ValueError
    852 
    853         return wrapper

ValueError:

Expected Output

sample output from pandas 0.22.0 (Anaconda 5.1.0)

first       first       second
2018-01-01  2018-01-01  two       0.488124
2018-01-02  2018-01-02  one       0.884829
2018-01-03  2018-01-03  one       0.476739
2018-01-04  2018-01-04  two      -0.375585
2018-01-05  2018-01-05  two       0.317667
2018-01-06  2018-01-06  two      -1.067373
2018-01-07  2018-01-07  two       1.135441
2018-01-08  2018-01-08  two       0.808737
2018-01-09  2018-01-09  one       1.440924
2018-01-10  2018-01-10  one      -0.204873
dtype: float64

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: zh_CN.UTF-8
LOCALE: None.None

pandas: 0.23.0
pytest: 3.5.1
pip: 10.0.1
setuptools: 39.1.0
Cython: 0.28.2
numpy: 1.14.3
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.4
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.3
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.3
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.4
lxml: 4.2.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.7
pymysql: 0.8.1
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

The text was updated successfully, but these errors were encountered:

gfyoung · 2018-06-10T18:14:13Z

cc @jreback

jbachlombardo · 2018-06-21T13:05:46Z

Have received identical as well -- worked using pandas 0.22 but with update am having this issue.

gfyoung · 2018-06-21T19:54:26Z

Given this report of a regression from 0.22.0 to 0.23.0, marking for 0.23.2.

cc @jreback @jorisvandenbossche

mroeschke · 2019-10-27T01:39:27Z

Looks to work on master. Could use a test.

In [203]: dts = pd.date_range('20180101', periods=10)
     ...: iterables = [dts, ['one', 'two']]
     ...: idx = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
     ...: s = pd.Series(np.random.randn(20), index=idx)
     ...: s.groupby('first').nlargest(1)
Out[203]:
first       first       second
2018-01-01  2018-01-01  one      -0.176182
2018-01-02  2018-01-02  two      -0.196403
2018-01-03  2018-01-03  two       1.908156
2018-01-04  2018-01-04  one      -0.685376
2018-01-05  2018-01-05  two       0.635422
2018-01-06  2018-01-06  one       1.137039
2018-01-07  2018-01-07  one       0.209114
2018-01-08  2018-01-08  one       0.067952
2018-01-09  2018-01-09  one      -1.112250
2018-01-10  2018-01-10  one       0.002893
dtype: float64

In [204]: pd.__version__
Out[204]: '0.26.0.dev0+682.g08ab156eb'

Closes pandas-dev#21411

Closes #21411

Closes pandas-dev#21411

gfyoung added the Groupby label Jun 10, 2018

gfyoung added the Regression Functionality that used to work in a prior pandas version label Jun 21, 2018

gfyoung added this to the 0.23.2 milestone Jun 21, 2018

jreback modified the milestones: 0.23.2, 0.23.3 Jun 26, 2018

jreback modified the milestones: 0.23.4, 0.23.5 Aug 2, 2018

jreback modified the milestones: 0.23.5, 0.24.0 Oct 23, 2018

jreback modified the milestones: 0.24.0, Contributions Welcome Dec 2, 2018

mroeschke added good first issue Needs Tests Unit test(s) needed to prevent regressions and removed Groupby Regression Functionality that used to work in a prior pandas version labels Oct 27, 2019

gfyoung added a commit to forking-repos/pandas that referenced this issue Nov 5, 2019

TST: Test nLargest with MI grouper

e3661c7

Closes pandas-dev#21411

gfyoung mentioned this issue Nov 5, 2019

TST: Test nLargest with MI grouper #29423

Merged

gfyoung modified the milestones: Contributions Welcome, 1.0 Nov 5, 2019

gfyoung added a commit to forking-repos/pandas that referenced this issue Nov 5, 2019

TST: Test nLargest with MI grouper

ccb778b

Closes pandas-dev#21411

jreback closed this as completed in #29423 Nov 6, 2019

jreback pushed a commit that referenced this issue Nov 6, 2019

TST: Test nLargest with MI grouper (#29423)

c918f7e

Closes #21411

Reksbril pushed a commit to Reksbril/pandas that referenced this issue Nov 18, 2019

TST: Test nLargest with MI grouper (pandas-dev#29423)

4cebe4a

Closes pandas-dev#21411

proost pushed a commit to proost/pandas that referenced this issue Dec 19, 2019

TST: Test nLargest with MI grouper (pandas-dev#29423)

2e762dd

Closes pandas-dev#21411

proost pushed a commit to proost/pandas that referenced this issue Dec 19, 2019

TST: Test nLargest with MI grouper (pandas-dev#29423)

af3e493

Closes pandas-dev#21411

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nlargest of multiindex grouper throw type error: nlargest() got an unexpected keyword argument 'axis' #21411

nlargest of multiindex grouper throw type error: nlargest() got an unexpected keyword argument 'axis' #21411

lf-shaw commented Jun 10, 2018 •

edited

Loading

INSTALLED VERSIONS

gfyoung commented Jun 10, 2018

jbachlombardo commented Jun 21, 2018

gfyoung commented Jun 21, 2018

mroeschke commented Oct 27, 2019

nlargest of multiindex grouper throw type error: nlargest() got an unexpected keyword argument 'axis' #21411

nlargest of multiindex grouper throw type error: nlargest() got an unexpected keyword argument 'axis' #21411

Comments

lf-shaw commented Jun 10, 2018 • edited Loading

Code Sample

Problem description

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS

gfyoung commented Jun 10, 2018

jbachlombardo commented Jun 21, 2018

gfyoung commented Jun 21, 2018

mroeschke commented Oct 27, 2019

lf-shaw commented Jun 10, 2018 •

edited

Loading

Output of `pd.show_versions()`