@dfd Thanks for the report! That is indeed clearly a bug.
For example, sort_values takes the correct category order into account, but rank was apparently missed.
In [6]: a.A.sort_values()
Out[6]:
0 first
1 second
2 third
3 fourth
4 fifth
5 sixth
Name: A, dtype: category
Categories (6, object): [first < second < third < fourth < fifth < sixth]
I think this should be a rather easy fix: in pd.core.algorithms.rank, we need to check for categorical data and then pass the underlying integer codes. If you are interested in opening a pull request with a fix, it is always welcome!
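To illustrate the idea of ranking by the underlying integer codes, here is a minimal sketch using only public pandas API. It is not the actual patch: the helper name `rank_by_category_order` and the sample data are hypothetical, and it assumes the input is a Series with an ordered categorical dtype.

```python
import pandas as pd


def rank_by_category_order(series: pd.Series) -> pd.Series:
    """Hypothetical helper: rank an ordered categorical Series by its codes."""
    # The codes are the integer positions of each value in the category order.
    codes = series.cat.codes.astype("float64")
    # Missing values are coded as -1; turn them into NaN so rank() leaves them as NaN.
    codes = codes.where(codes != -1)
    return codes.rank()


# Assumed sample data, reusing the category order from the sort_values output.
order = ["first", "second", "third", "fourth", "fifth", "sixth"]
dtype = pd.CategoricalDtype(categories=order, ordered=True)

values = pd.Series(["third", None, "first", "sixth"], dtype=dtype)
ranks = rank_by_category_order(values)
print(ranks.tolist())
```

The sketch does not check whether the categorical is actually ordered; for unordered categories the codes carry no meaningful order, so a real fix would likely need to treat that case separately.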
check for categorical, and then pass the underlying integer codes.
closes pandas-dev#15420
Author: Prasanjit Prakash <[email protected]>
Closes pandas-dev#15422 from ikilledthecat/rank_categorical and squashes the following commits:
a7e573b [Prasanjit Prakash] moved test for categorical, in rank, to top
3ba4e3a [Prasanjit Prakash] corrections after rebasing
c43a029 [Prasanjit Prakash] using if/else construct to pick sorting function for categoricals
f8ec019 [Prasanjit Prakash] ask Categorical for ranking function
40d88c1 [Prasanjit Prakash] return values for rank from categorical object
049c0fc [Prasanjit Prakash] GH#15420 added support for na_option when ranking categorical
5e5bbeb [Prasanjit Prakash] BUG: GH#15420 rank for categoricals
ef999c3 [Prasanjit Prakash] merged with upstream master
fbaba1b [Prasanjit Prakash] return values for rank from categorical object
fa0b4c2 [Prasanjit Prakash] BUG: GH15420 - _rank private method on Categorical
9a6b5cd [Prasanjit Prakash] BUG: GH15420 - _rank private method on Categorical
4220e56 [Prasanjit Prakash] BUG: GH15420 - _rank private method on Categorical
6b70921 [Prasanjit Prakash] GH#15420 move rank inside categoricals
bf4e36c [Prasanjit Prakash] GH#15420 added support for na_option when ranking categorical
ce90207 [Prasanjit Prakash] BUG: GH#15420 rank for categoricals
85b267a [Prasanjit Prakash] Added support for categorical datatype in rank - issue#15420
Code Sample, a copy-pastable example if possible
Problem description
rank seems to ignore the order of ordered categories.
Expected Output
Output of pd.show_versions()
pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 34.2.0
Cython: None
numpy: 1.12.0
scipy: None
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None