BUG: rank raises error with read-only data #37290

zeromh · 2020-10-20T21:51:36Z

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.

Code Sample, a copy-pastable example

import pandas as pd
import numpy as np
arr = np.arange(10)
arr.setflags(write=False)
pd.Series(arr).rank()

Output:

ValueError                                Traceback (most recent call last)
<ipython-input-5-afa6b4ecf509> in <module>
      3 arr = np.arange(10)
      4 arr.setflags(write=False)
----> 5 pd.Series(arr).rank()

~/anaconda/envs/xfactor/lib/python3.8/site-packages/pandas/core/generic.py in rank(self, axis, method, numeric_only, na_option, ascending, pct)
   8334         if numeric_only is None:
   8335             try:
-> 8336                 return ranker(self)
   8337             except TypeError:
   8338                 numeric_only = True

~/anaconda/envs/xfactor/lib/python3.8/site-packages/pandas/core/generic.py in ranker(data)
   8319 
   8320         def ranker(data):
-> 8321             ranks = algos.rank(
   8322                 data.values,
   8323                 axis=axis,

~/anaconda/envs/xfactor/lib/python3.8/site-packages/pandas/core/algorithms.py in rank(values, axis, method, na_option, ascending, pct)
    934     if values.ndim == 1:
    935         values = _get_values_for_rank(values)
--> 936         ranks = algos.rank_1d(
    937             values,
    938             ties_method=method,

pandas/_libs/algos.pyx in pandas._libs.algos.rank_1d()

~/anaconda/envs/xfactor/lib/python3.8/site-packages/pandas/_libs/algos.cpython-38-darwin.so in View.MemoryView.memoryview_cwrapper()

~/anaconda/envs/xfactor/lib/python3.8/site-packages/pandas/_libs/algos.cpython-38-darwin.so in View.MemoryView.memoryview.__cinit__()

ValueError: buffer source array is read-only
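
The last frames of the traceback show the error being raised while a memoryview is created over the array inside the compiled `algos` module. As a rough pure-Python analogy (it uses the builtin `memoryview`, not the Cython code path pandas takes), the sketch below shows that a read-only NumPy array advertises itself as read-only through the buffer protocol, so any consumer that asks for a writable buffer will be refused:

```python
import numpy as np

arr = np.arange(10)
arr.setflags(write=False)

# The builtin memoryview accepts read-only buffers and records the flag.
view = memoryview(arr)
is_readonly = view.readonly

# Writing through the view is rejected because the source is read-only.
try:
    view[0] = 99
    write_error = None
except TypeError as exc:
    write_error = exc

# A copy of the array owns fresh memory and is writable again.
writable_view = memoryview(arr.copy())
```

Here the builtin view only fails on write; the traceback above suggests the compiled code requests a writable buffer up front, which is why the failure happens at construction time.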

Problem description

rank should work with read-only data.

I noticed the problem when running check_estimator from sklearn.utils.estimator_checks on an estimator that uses pandas rank. I haven't investigated fully, but I assume check_estimator passes read-only data in some of its tests, which triggers this error.
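
On affected versions, one possible workaround is to rank a writable copy of the data. The sketch below is only an illustration under that assumption: `rank_writable` is a hypothetical helper name, not a pandas function, and the copy costs extra memory.

```python
import numpy as np
import pandas as pd


def rank_writable(values):
    """Hypothetical helper: rank values, copying first if the array is read-only."""
    arr = np.asarray(values)
    if not arr.flags.writeable:
        # Copying yields a new array that owns its memory and is writable.
        arr = arr.copy()
    return pd.Series(arr).rank()


arr = np.arange(10)
arr.setflags(write=False)

ranks = rank_writable(arr)  # the original array stays read-only and unchanged
```

The copy is made only when the input is flagged read-only, so writable inputs are passed through without the extra allocation.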

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : db08276
python : 3.8.5.final.0
python-bits : 64
OS : Darwin
OS-release : 19.6.0
Version : Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.1.3
numpy : 1.19.1
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.3
setuptools : 49.6.0.post20200814
Cython : None
pytest : 6.0.2
hypothesis : None
sphinx : 3.2.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.18.1
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.2
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None


jorisvandenbossche · 2020-10-21T07:18:32Z

Thanks for the report. I can confirm the regression: it was already broken in pandas 1.0 as well, but worked in 0.25.

jorisvandenbossche · 2020-10-27T08:05:49Z

This was caused by #28978; the fix is in #37439.

zeromh added the Bug and Needs Triage labels Oct 20, 2020

jorisvandenbossche added the Algos and Regression labels and removed the Bug and Needs Triage labels Oct 21, 2020

jorisvandenbossche added this to the 1.1.4 milestone Oct 21, 2020

jorisvandenbossche mentioned this issue Oct 27, 2020

REGR: fix rank algo for read-only data #37439

Merged

jreback closed this as completed in #37439 Oct 27, 2020
