Skip to content

nlargest of multiindex grouper throw type error: nlargest() got an unexpected keyword argument 'axis' #21411

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
lf-shaw opened this issue Jun 10, 2018 · 4 comments · Fixed by #29423
Labels
good first issue Needs Tests Unit test(s) needed to prevent regressions
Milestone

Comments

@lf-shaw
Copy link

lf-shaw commented Jun 10, 2018

Code Sample

import numpy as np
import pandas as pd

dts = pd.date_range('20180101', periods=10)
iterables = [dts, ['one', 'two']]
idx = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
s = pd.Series(np.random.randn(20), index=idx)
s.groupby('first').nlargest(1)

Problem description

After upgrade to Anaconda 5.2.0, the sample code raises the following exception

TypeError                                 Traceback (most recent call last)
~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
    917             try:
--> 918                 result = self._python_apply_general(f)
    919             except Exception:

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _python_apply_general(self, f)
    935         keys, values, mutated = self.grouper.apply(f, self._selected_obj,
--> 936                                                    self.axis)
    937 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, f, data, axis)
   2272             group_axes = _get_axes(group)
-> 2273             res = f(group)
   2274             if not _is_indexed_like(res, group_axes):

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in curried_with_axis(x)
    819             def curried_with_axis(x):
--> 820                 return f(x, *args, **kwargs_with_axis)
    821 

TypeError: nlargest() got an unexpected keyword argument 'axis'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in wrapper(*args, **kwargs)
    834             try:
--> 835                 return self.apply(curried_with_axis)
    836             except Exception:

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
   3468     def apply(self, func, *args, **kwargs):
-> 3469         return super(SeriesGroupBy, self).apply(func, *args, **kwargs)
   3470 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
    929                 with _group_selection_context(self):
--> 930                     return self._python_apply_general(f)
    931 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _python_apply_general(self, f)
    935         keys, values, mutated = self.grouper.apply(f, self._selected_obj,
--> 936                                                    self.axis)
    937 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, f, data, axis)
   2272             group_axes = _get_axes(group)
-> 2273             res = f(group)
   2274             if not _is_indexed_like(res, group_axes):

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in curried_with_axis(x)
    819             def curried_with_axis(x):
--> 820                 return f(x, *args, **kwargs_with_axis)
    821 

TypeError: nlargest() got an unexpected keyword argument 'axis'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
    917             try:
--> 918                 result = self._python_apply_general(f)
    919             except Exception:

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _python_apply_general(self, f)
    940             values,
--> 941             not_indexed_same=mutated or self.mutated)
    942 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _wrap_applied_output(self, keys, values, not_indexed_same)
   3611             return self._concat_objects(keys, values,
-> 3612                                         not_indexed_same=not_indexed_same)
   3613         elif isinstance(values[0], DataFrame):

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _concat_objects(self, keys, values, not_indexed_same)
   1135                                 levels=group_levels, names=group_names,
-> 1136                                 sort=False)
   1137             else:

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    224                        verify_integrity=verify_integrity,
--> 225                        copy=copy, sort=sort)
    226     return op.get_result()

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in __init__(self, objs, axis, join, join_axes, keys, levels, names, ignore_index, verify_integrity, copy, sort)
    377 
--> 378         self.new_axes = self._get_new_axes()
    379 

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in _get_new_axes(self)
    457 
--> 458         new_axes[self.axis] = self._get_concat_axis()
    459         return new_axes

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in _get_concat_axis(self)
    513             concat_axis = _make_concat_multiindex(indexes, self.keys,
--> 514                                                   self.levels, self.names)
    515 

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in _make_concat_multiindex(indexes, keys, levels, names)
    594         return MultiIndex(levels=levels, labels=label_list, names=names,
--> 595                           verify_integrity=False)
    596 

~\Programs\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py in __new__(cls, levels, labels, sortorder, names, dtype, copy, name, verify_integrity, _set_identity)
    231             # handles name validation
--> 232             result._set_names(names)
    233 

~\Programs\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py in _set_names(self, names, level, validate)
    694                         'level {}, is already used for level '
--> 695                         '{}.'.format(name, l, used[name]))
    696 

ValueError: Duplicated level name: "first", assigned to level 1, is already used for level 0.

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in wrapper(*args, **kwargs)
    837                 try:
--> 838                     return self.apply(curried)
    839                 except Exception:

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
   3468     def apply(self, func, *args, **kwargs):
-> 3469         return super(SeriesGroupBy, self).apply(func, *args, **kwargs)
   3470 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
    929                 with _group_selection_context(self):
--> 930                     return self._python_apply_general(f)
    931 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _python_apply_general(self, f)
    940             values,
--> 941             not_indexed_same=mutated or self.mutated)
    942 

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _wrap_applied_output(self, keys, values, not_indexed_same)
   3611             return self._concat_objects(keys, values,
-> 3612                                         not_indexed_same=not_indexed_same)
   3613         elif isinstance(values[0], DataFrame):

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _concat_objects(self, keys, values, not_indexed_same)
   1135                                 levels=group_levels, names=group_names,
-> 1136                                 sort=False)
   1137             else:

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    224                        verify_integrity=verify_integrity,
--> 225                        copy=copy, sort=sort)
    226     return op.get_result()

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in __init__(self, objs, axis, join, join_axes, keys, levels, names, ignore_index, verify_integrity, copy, sort)
    377 
--> 378         self.new_axes = self._get_new_axes()
    379 

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in _get_new_axes(self)
    457 
--> 458         new_axes[self.axis] = self._get_concat_axis()
    459         return new_axes

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in _get_concat_axis(self)
    513             concat_axis = _make_concat_multiindex(indexes, self.keys,
--> 514                                                   self.levels, self.names)
    515 

~\Programs\Anaconda3\lib\site-packages\pandas\core\reshape\concat.py in _make_concat_multiindex(indexes, keys, levels, names)
    594         return MultiIndex(levels=levels, labels=label_list, names=names,
--> 595                           verify_integrity=False)
    596 

~\Programs\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py in __new__(cls, levels, labels, sortorder, names, dtype, copy, name, verify_integrity, _set_identity)
    231             # handles name validation
--> 232             result._set_names(names)
    233 

~\Programs\Anaconda3\lib\site-packages\pandas\core\indexes\multi.py in _set_names(self, names, level, validate)
    694                         'level {}, is already used for level '
--> 695                         '{}.'.format(name, l, used[name]))
    696 

ValueError: Duplicated level name: "first", assigned to level 1, is already used for level 0.

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in wrapper(*args, **kwargs)
    847                     try:
--> 848                         return self._aggregate_item_by_item(name,
    849                                                             *args, **kwargs)

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in __getattr__(self, attr)
    764         raise AttributeError("%r object has no attribute %r" %
--> 765                              (type(self).__name__, attr))
    766 

AttributeError: 'SeriesGroupBy' object has no attribute '_aggregate_item_by_item'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-2-2e8f88f2b9dc> in <module>()
      6 idx = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
      7 s = pd.Series(np.random.randn(20), index=idx)
----> 8 s.groupby('first').nlargest(1)

~\Programs\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in wrapper(*args, **kwargs)
    849                                                             *args, **kwargs)
    850                     except (AttributeError):
--> 851                         raise ValueError
    852 
    853         return wrapper

ValueError: 

Expected Output

sample output from pandas 0.22.0 (Anaconda 5.1.0)

first       first       second
2018-01-01  2018-01-01  two       0.488124
2018-01-02  2018-01-02  one       0.884829
2018-01-03  2018-01-03  one       0.476739
2018-01-04  2018-01-04  two      -0.375585
2018-01-05  2018-01-05  two       0.317667
2018-01-06  2018-01-06  two      -1.067373
2018-01-07  2018-01-07  two       1.135441
2018-01-08  2018-01-08  two       0.808737
2018-01-09  2018-01-09  one       1.440924
2018-01-10  2018-01-10  one      -0.204873
dtype: float64

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: zh_CN.UTF-8
LOCALE: None.None

pandas: 0.23.0
pytest: 3.5.1
pip: 10.0.1
setuptools: 39.1.0
Cython: 0.28.2
numpy: 1.14.3
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.4
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.3
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.3
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.4
lxml: 4.2.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.7
pymysql: 0.8.1
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@gfyoung
Copy link
Member

gfyoung commented Jun 10, 2018

cc @jreback

@jbachlombardo
Copy link

Have received identical as well -- worked using pandas 0.22 but with update am having this issue.

@gfyoung gfyoung added the Regression Functionality that used to work in a prior pandas version label Jun 21, 2018
@gfyoung gfyoung added this to the 0.23.2 milestone Jun 21, 2018
@gfyoung
Copy link
Member

gfyoung commented Jun 21, 2018

Given this report of a regression from 0.22.0 to 0.23.0, marking for 0.23.2.

cc @jreback @jorisvandenbossche

@jreback jreback modified the milestones: 0.23.2, 0.23.3 Jun 26, 2018
@jreback jreback modified the milestones: 0.23.4, 0.23.5 Aug 2, 2018
@jreback jreback modified the milestones: 0.23.5, 0.24.0 Oct 23, 2018
@jreback jreback modified the milestones: 0.24.0, Contributions Welcome Dec 2, 2018
@mroeschke
Copy link
Member

Looks to work on master. Could use a test.

In [203]: dts = pd.date_range('20180101', periods=10)
     ...: iterables = [dts, ['one', 'two']]
     ...: idx = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
     ...: s = pd.Series(np.random.randn(20), index=idx)
     ...: s.groupby('first').nlargest(1)
Out[203]:
first       first       second
2018-01-01  2018-01-01  one      -0.176182
2018-01-02  2018-01-02  two      -0.196403
2018-01-03  2018-01-03  two       1.908156
2018-01-04  2018-01-04  one      -0.685376
2018-01-05  2018-01-05  two       0.635422
2018-01-06  2018-01-06  one       1.137039
2018-01-07  2018-01-07  one       0.209114
2018-01-08  2018-01-08  one       0.067952
2018-01-09  2018-01-09  one      -1.112250
2018-01-10  2018-01-10  one       0.002893
dtype: float64

In [204]: pd.__version__
Out[204]: '0.26.0.dev0+682.g08ab156eb'

@mroeschke mroeschke added good first issue Needs Tests Unit test(s) needed to prevent regressions and removed Groupby Regression Functionality that used to work in a prior pandas version labels Oct 27, 2019
gfyoung added a commit to forking-repos/pandas that referenced this issue Nov 5, 2019
@gfyoung gfyoung modified the milestones: Contributions Welcome, 1.0 Nov 5, 2019
gfyoung added a commit to forking-repos/pandas that referenced this issue Nov 5, 2019
jreback pushed a commit that referenced this issue Nov 6, 2019
Reksbril pushed a commit to Reksbril/pandas that referenced this issue Nov 18, 2019
proost pushed a commit to proost/pandas that referenced this issue Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this issue Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Needs Tests Unit test(s) needed to prevent regressions
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants