groupby categorical column fails with unstack #11558
Comments
this is a tricky bug actually; when indexing into a frame that has duplicates (or is a |
This seems related to this incongruency I also ran into:
If this dataframe is coming from a |
this has to do with how we handle uniques vs non-uniques. A CategoricalIndex is by definition non-unique (it's actually unique in this case). But this might be a buggie.
related to #11558

Author: sinhrks <[email protected]>

Closes #12531 from sinhrks/cat_get_loc and squashes the following commits:

2749b62 [sinhrks] BUG: CategoricalIndex.get_loc returns array even if it is unique
Replicating example
The behaviour in this notebook seems like a bug to me. This is pandas 0.17.0.
In it, `g` and `gcat` are the results of two `df.groupby(['medium', 'artist']).count().unstack()` operations. The only difference is that one of those operations is on `df` where one of the columns that the `groupby` operates over has been converted to Categorical. `g` and `gcat` behave very differently. I've tried to pin this down to the exact operation in the split-apply-combine that causes the problem without much luck.

Slicing a column out of `g` returns a Series as expected, while slicing a column out of `gcat` returns a DataFrame (see cells 4 and 5).

`g.describe()` works as expected, but `gcat.describe()` raises an exception, and `g['painting'] + g['sculpture']` works as expected but `gcat['painting'] + gcat['sculpture']` raises as well.
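Since the notebook itself is not reproduced here, the following is a rough sketch of the kind of comparison described above. The data, the `title` column, and the choice to unstack `medium` (so that the categorical key ends up as the column labels) are all assumptions made for illustration. `observed=False` is passed explicitly only to keep the behaviour stable across recent pandas releases; it did not exist in 0.17.0.

```python
import pandas as pd

# Hypothetical data standing in for the notebook's DataFrame.
df = pd.DataFrame({
    "artist": ["A", "A", "B", "B", "C"],
    "medium": ["painting", "sculpture", "painting", "painting", "sculpture"],
    "title": ["t1", "t2", "t3", "t4", "t5"],
})

# Plain object-dtype grouping key.
g = df.groupby(["medium", "artist"])["title"].count().unstack("medium")

# Same operation, but with `medium` converted to Categorical first.
df_cat = df.copy()
df_cat["medium"] = df_cat["medium"].astype("category")
gcat = (
    df_cat.groupby(["medium", "artist"], observed=False)["title"]
    .count()
    .unstack("medium")
)

# In a release containing the fix, both selections should be Series.
print(type(g["painting"]))
print(type(gcat["painting"]))

# The operations that failed in the report.
print(gcat.describe())
total = gcat["painting"] + gcat["sculpture"]
print(total)
```

On an affected version, the expectation from the report is that `gcat["painting"]` comes back as a DataFrame and the last two operations raise.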