.unique fails with read-only input. #19195

Closed
allComputableThings opened this issue Jan 11, 2018 · 4 comments
Labels
Duplicate Report (Duplicate issue or pull request), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)

Comments

@allComputableThings

Code Sample, a copy-pastable example if possible

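import numpy as np
import pandas as pd

arr = np.arange(10)
arr.setflags(write=False)  # mark the array read-only
ser = pd.Series(arr, index=arr)
print(ser.unique())  # raises ValueError on pandas 0.22.0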

...

  File "/usr/local/lib/python2.7/dist-packages/pandas/core/algorithms.py", line 364, in unique
    uniques = table.unique(values)
  File "pandas/_libs/hashtable_class_helper.pxi", line 973, in pandas._libs.hashtable.Int64HashTable.unique
  File "stringsource", line 646, in View.MemoryView.memoryview_cwrapper
  File "stringsource", line 347, in View.MemoryView.memoryview.__cinit__
ValueError: buffer source array is read-only

Problem description

For no good reason, unique tries to modify the input and fails if the input data is read-only.

Perhaps it should simply delegate to np.unique, which does not modify the input.
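
As a rough illustration of the suggestion above, the following sketch calls np.unique on a read-only array. The sample values are made up for the example. Note one difference that matters if np.unique were used as a drop-in replacement: np.unique returns its result sorted, whereas pandas' unique returns values in order of appearance.

```python
import numpy as np

# Illustrative data (not from the original report), with repeats and out of order.
arr = np.array([3, 1, 3, 2, 1])
arr.setflags(write=False)  # mark the array read-only

# np.unique works on a copy internally, so a read-only input is accepted.
uniques = np.unique(arr)

print(uniques.tolist())     # sorted unique values
print(arr.flags.writeable)  # the input is still read-only and unchanged
```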

Expected Output

The operation should not modify the input. Doing so is rude to the caller and will doubtless lead to inexplicable concurrency issues.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-24-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.22.0
pytest: None
pip: 9.0.1
setuptools: 36.0.1
Cython: 0.27.3
numpy: 1.13.3
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 5.3.0
sphinx: None
patsy: 0.2.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.1.1
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 1.7.0
xlrd: 0.9.2
xlwt: 0.7.5
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: 2.7.3.2 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@jschendel
Member

This works on master for me, and I suspect it was fixed by #18825.

In [2]: arr = np.arange(10)

In [3]: arr.setflags(write=False)

In [4]: ser = pd.Series(arr, index=arr)

In [5]: ser.unique()
Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int64)

In [6]: pd.__version__
Out[6]: '0.23.0.dev0+98.g8acdf80'

@jreback
Contributor

jreback commented Jan 11, 2018

Yep, this is fixed in 0.23.0.

jreback closed this as completed on Jan 11, 2018
jreback added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) and Duplicate Report (Duplicate issue or pull request) labels on Jan 11, 2018
jreback added this to the No action milestone on Jan 11, 2018
@jreback
Contributor

jreback commented Jan 11, 2018

@stuz5000 Note that this is actually a bug in Cython, which pandas works around.
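
For readers who cannot upgrade past an affected release, one possible user-side workaround is to hand unique a writable copy of the data. This is only a sketch under that assumption and is not the workaround pandas itself adopted; the helper name unique_from_readonly is hypothetical.

```python
import numpy as np
import pandas as pd


def unique_from_readonly(values):
    """Hypothetical helper: compute unique values from a possibly read-only array."""
    # A fresh copy owns its own buffer and is writable, so a hashtable
    # that requests a writable memoryview can accept it.
    writable = np.array(values, copy=True)
    return pd.unique(writable)


arr = np.arange(10)
arr.setflags(write=False)  # mark the array read-only
ser = pd.Series(arr, index=arr)

result = unique_from_readonly(ser.values)
print(result)
```

The copy costs extra memory proportional to the input, which is the price of leaving the caller's read-only buffer untouched.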

@gabomgp

gabomgp commented Apr 2, 2018

I'll put this here; delete it if it is inappropriate. The question is related to this bug:

https://stackoverflow.com/questions/49619588/how-to-prevent-bug-https-jiasu.xzqcsaa.nyc.mn-pandas-dev-pandas-issues-19195-with-dese
