BUG: Groupby(sort=False) with datetime-like Categorical raises ValueError #10505

Closed
sinhrks opened this issue Jul 3, 2015 · 0 comments
Labels
Bug Categorical Categorical Data Type

sinhrks commented Jul 3, 2015

Related to #10501, but not the same issue. groupby accepts a Categorical as the grouping key, and it can be combined with the sort keyword.

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})

# OK
df.groupby(pd.Categorical(['A', 'B', 'A', 'B'])).groups
# {'A': [0, 2], 'B': [1, 3]}

# OK
df.groupby(pd.Categorical(['A', 'B', 'A', 'B']), sort=False).groups
# {'A': [0, 2], 'B': [1, 3]}

If the Categorical has datetime-like categories, groupby fails when sort=False is specified.

# OK
df.groupby(pd.Categorical(pd.DatetimeIndex(['2011', '2012', '2011', '2012']))).groups
# {numpy.datetime64('2011-01-01T09:00:00.000000000+0900'): [0, 2], 
#  numpy.datetime64('2012-01-01T09:00:00.000000000+0900'): [1, 3]}

# NG (raises ValueError)
df.groupby(pd.Categorical(pd.DatetimeIndex(['2011', '2012', '2011', '2012'])), sort=False).groups
# ValueError: items in new_categories are not the same as in old categories
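
The four cases above can be checked in one pass with a small script. This is only a sketch: try_groupby is a hypothetical helper written for this report, not part of pandas, and it assumes nothing about whether the installed version still raises. It returns the group labels on success and the error text on failure.

```python
import pandas as pd


def try_groupby(key, **kwargs):
    """Hypothetical helper: return the sorted group labels,
    or the error text if groupby raises ValueError."""
    df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})
    try:
        groups = df.groupby(key, **kwargs).groups
    except ValueError as exc:
        return 'ValueError: {}'.format(exc)
    return sorted(groups)


# Grouping keys taken from the examples in this report.
str_key = pd.Categorical(['A', 'B', 'A', 'B'])
dt_key = pd.Categorical(pd.DatetimeIndex(['2011', '2012', '2011', '2012']))

for name, key in [('str', str_key), ('datetime', dt_key)]:
    for kwargs in ({}, {'sort': False}):
        # Print either the labels or the ValueError message for each case.
        print(name, kwargs, '->', try_groupby(key, **kwargs))
```

On a version affected by this report, only the datetime key with sort=False should show the ValueError line.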