Series/DataFrame.rank() doesn't handle certain floats properly #8365

Closed
sandbox opened this issue Sep 22, 2014 · 2 comments · Fixed by #8379
Labels: Algos (Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff), Bug, Numeric Operations (Arithmetic, Comparison, and Logical operations)



sandbox commented Sep 22, 2014

In pandas 0.14.1, series.rank() appears to mishandle floats that are close together: every value except the largest receives the same rank. For reference, this test worked in pandas 0.12.0.

Current Behavior

>>> import pandas as pd
>>> series = pd.Series([
...     1000.000669, 1000.000041, 1000.000059, 1000.000063, 1000.000121,
...     1000.000104, 1000.000040, 1000.000062, 1000.000095, 1000.000091,
...     1000.000050, 1000.000074, 1000.000063, 1000.000076, 1000.000083,
...     1000.000061, 1000.000030, 1000.000069, 1000.000090, 1000.000116,
...     1000.000058, 1000.000074, 1000.000035, 1000.000084, 1000.000067,
...     1000.000072, 1000.000105, 1000.000091, 1000.000077, 1000.000040,
...     1000.000108, 1000.000117, 1000.000114, 1000.000117, 1000.000099,
...     1000.000039, 1000.000046, 1000.000105, 1000.000057])
>>> series.rank()
0     39.0
1     19.5
2     19.5
3     19.5
4     19.5
5     19.5
6     19.5
7     19.5
8     19.5
9     19.5
10    19.5
11    19.5
12    19.5
13    19.5
14    19.5
15    19.5
16    19.5
17    19.5
18    19.5
19    19.5
20    19.5
21    19.5
22    19.5
23    19.5
24    19.5
25    19.5
26    19.5
27    19.5
28    19.5
29    19.5
30    19.5
31    19.5
32    19.5
33    19.5
34    19.5
35    19.5
36    19.5
37    19.5
38    19.5
dtype: float64

Expected Behavior

>>> from scipy import stats
>>> stats.rankdata(series)
array([ 39. ,   6. ,  11. ,  14.5,  38. ,  30. ,   4.5,  13. ,  28. ,
        26.5,   8. ,  19.5,  14.5,  21. ,  23. ,  12. ,   1. ,  17. ,
        25. ,  35. ,  10. ,  19.5,   2. ,  24. ,  16. ,  18. ,  31.5,
        26.5,  22. ,   4.5,  33. ,  36.5,  34. ,  36.5,  29. ,   3. ,
         7. ,  31.5,   9. ])

System Information

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Darwin
OS-release: 13.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.14.1
nose: 1.3.4
Cython: 0.21
numpy: 1.8.1
scipy: 0.14.0
statsmodels: 0.5.0
IPython: None
sphinx: None
patsy: 0.3.0
scikits.timeseries: None
dateutil: 2.2
pytz: 2014.7
bottleneck: 0.8.0
tables: 3.0.0
numexpr: 2.4
matplotlib: None
openpyxl: 2.1.0
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 0.8.0
pymysql: None
psycopg2: 2.5.4 (dt dec pq3 ext)

TomAugspurger (Contributor) commented

It's using this function if you want to poke around and see what's going on.

jreback added the Algos and Numeric Operations labels Sep 23, 2014
jreback added this to the 0.15.1 milestone Sep 23, 2014
jreback added the Bug label Sep 23, 2014

jreback (Contributor) commented Sep 24, 2014

Here is the issue that originally changed this (recently), to handle REALLY small floats: #6886
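
To make the symptom concrete, here is a minimal sketch of how a tolerance-based tie check could collapse ranks for values like the ones in the report. It is an illustration only, not pandas' actual code: `average_rank` and `rel_tol` are hypothetical names, and the value `1e-7` is an assumption chosen to show the effect. The idea is that if sorted neighbours are treated as tied whenever they differ by less than a fraction of their magnitude, then for values near 1000 any gap below roughly 1e-4 counts as a tie.

```python
def average_rank(values, rel_tol=0.0):
    """Average ranks (1-based). Sorted neighbours are treated as tied when
    |next - current| <= rel_tol * |current|. `rel_tol` is a hypothetical
    knob for illustration, not a pandas parameter."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0  # first sorted position of the current tie group
    for pos in range(len(order)):
        cur = values[order[pos]]
        is_last = pos == len(order) - 1
        if is_last or abs(values[order[pos + 1]] - cur) > rel_tol * abs(cur):
            # Sorted positions start..pos share the mean of ranks start+1..pos+1.
            mean_rank = (start + pos + 2) / 2.0
            for k in range(start, pos + 1):
                ranks[order[k]] = mean_rank
            start = pos + 1
    return ranks


# Four of the values from the report.
sample = [1000.000669, 1000.000041, 1000.000059, 1000.000030]
exact = average_rank(sample)                # [4.0, 2.0, 3.0, 1.0]
loose = average_rank(sample, rel_tol=1e-7)  # [4.0, 2.0, 2.0, 2.0]
```

Because each value is compared only with its sorted neighbour, ties chain: a long run of closely spaced values merges into one group even when its endpoints are further apart than the tolerance. Under this assumed rule, that is the pattern in the reported output, where 38 values share rank 19.5 and only the outlier 1000.000669 is ranked separately.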
