GroupBy Regression with Categorical On Master #29746

Closed
WillAyd opened this issue Nov 20, 2019 · 3 comments
Labels: Categorical (Categorical Data Type), Groupby, Regression (Functionality that used to work in a prior pandas version)

Comments

@WillAyd
Member

WillAyd commented Nov 20, 2019

Seems to be an issue on master as this works on 0.25.3:

>>> ser = pd.Series(pd.Categorical(["first", "second", "third", "fourth"], ordered=True))
>>> ser.groupby([1, 1, 1, 1]).first()
[first]
Categories (4, object): [first < fourth < second < third]

But fails on master:

>>> ser = pd.Series(pd.Categorical(["first", "second", "third", "fourth"], ordered=True))
>>> ser.groupby([1, 1, 1, 1]).first()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/williamayd/clones/pandas/pandas/core/groupby/groupby.py", line 1368, in f
    return self._cython_agg_general(alias, alt=npfunc, **kwargs)
  File "/Users/williamayd/clones/pandas/pandas/core/groupby/groupby.py", line 880, in _cython_agg_general
    obj._values, how, min_count=min_count
  File "/Users/williamayd/clones/pandas/pandas/core/groupby/ops.py", line 572, in aggregate
    "aggregate", values, how, axis, min_count=min_count
  File "/Users/williamayd/clones/pandas/pandas/core/groupby/ops.py", line 456, in _cython_operation
    "{dtype} dtype not supported".format(dtype=values.dtype)
NotImplementedError: category dtype not supported

@jbrockmendel for visibility. Looking at this on my end

@WillAyd added the Categorical, Groupby, and Regression labels Nov 20, 2019
@WillAyd
Member Author

WillAyd commented Nov 20, 2019

Note that this works with a frame on master:

>>> ser.to_frame().groupby([1, 1, 1, 1]).first()
       0
1  first

So I think this is just related to some of the exception handling in _cython_agg_general and a lack of test coverage.
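
A minimal sketch of a regression check for the missing coverage mentioned above, assuming the Series and DataFrame paths are expected to agree. The variable names (`keys`, `series_result`, `frame_result`, `same_values`) are illustrative and not taken from the pandas test suite. On a build affected by this regression, the Series call would raise NotImplementedError before the comparison is reached.

```python
import pandas as pd

# Same data as in the report: an ordered categorical Series in a single group.
ser = pd.Series(
    pd.Categorical(["first", "second", "third", "fourth"], ordered=True)
)
keys = [1, 1, 1, 1]  # every row belongs to group 1

# Series path: the call reported as raising on master.
series_result = ser.groupby(keys).first()

# DataFrame path: reported as working; to_frame() labels the column 0.
frame_result = ser.to_frame().groupby(keys).first()[0]

# Compare plain values so the check does not depend on dtype details.
same_values = series_result.tolist() == frame_result.tolist()
print(series_result.tolist(), same_values)
```

Comparing `tolist()` output rather than the Series objects keeps the check independent of whether the result keeps its categorical dtype, which this thread does not settle.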

@jbrockmendel
Member

might be related to #28949

@simonjayhawkins
Member

Closing this in favour of #33090, since the behaviour has changed and this no longer raises NotImplementedError: category dtype not supported
