pd.cut fails on readonly arrays #18773

shoyer · 2017-12-14T02:17:44Z

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np

bins = np.arange(0, 100, 10)
bins.flags.writeable = False
pd.cut(np.arange(100), bins)

Results in:

ValueError                                Traceback (most recent call last)
<ipython-input-71-60b85be1194d> in <module>()
----> 1 pd.cut(np.arange(100), bins)

~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/core/reshape/tile.py in cut(x, bins, right, labels, retbins, precision, include_lowest)
    134                               precision=precision,
    135                               include_lowest=include_lowest,
--> 136                               dtype=dtype)
    137
    138     return _postprocess_for_cut(fac, bins, retbins, x_is_series,

~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/core/reshape/tile.py in _bins_to_cuts(x, bins, right, labels, precision, include_lowest, dtype, duplicates)
    225         return result, bins
    226
--> 227     unique_bins = algos.unique(bins)
    228     if len(unique_bins) < len(bins) and len(bins) != 2:
    229         if duplicates == 'raise':

~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/core/algorithms.py in unique(values)
    354
    355     table = htable(len(values))
--> 356     uniques = table.unique(values)
    357     uniques = _reconstruct_data(uniques, dtype, original)
    358

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.unique (pandas/_libs/hashtable.c:16341)()

~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/_libs/hashtable.cpython-36m-darwin.so in View.MemoryView.memoryview_cwrapper (pandas/_libs/hashtable.c:45205)()

~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/_libs/hashtable.cpython-36m-darwin.so in View.MemoryView.memoryview.__cinit__ (pandas/_libs/hashtable.c:41440)()

ValueError: buffer source array is read-only

Problem description

This is essentially the same problem as #10043 and #17192, due to Cython issue cython/cython#1605

Expected Output

Should not error.

Output of `pd.show_versions()`

In [73]: pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.4.0
Cython: 0.26
numpy: 1.13.1
scipy: 0.19.1
xarray: 0.10.0
IPython: 6.1.0
sphinx: 1.6.3
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

The text was updated successfully, but these errors were encountered:

This problem was brought up in pandas-dev#18773 and effectively comes down to how Cython deals with readonly arrays. While it would be ideal for Cython to fix the underlying problem in the meantime we can rely on this.

hexgnu · 2017-12-18T21:21:29Z

Hey everybody, I submitted a pull request fixing this the best I can.

Unfortunately it seems that Cython doesn't handle this super well so the running solution is to check whether something is writeable or not and then to fall back onto ndarray if needed.

This problem was brought up in pandas-dev#18773 and effectively comes down to how Cython deals with readonly arrays. While it would be ideal for Cython to fix the underlying problem in the meantime we can rely on this.

This problem was brought up in pandas-dev#18773 and effectively comes down to how Cython deals with readonly arrays. While it would be ideal for Cython to fix the underlying problem in the meantime we can rely on this. fix: updates one_to_hundred for hundred_elements This is because arange(100) isn't actually 1 to 100... it's 0 to 99 docs: adds comment to fix using ndarray and fixes indenting test: parametrize test for test_readonly_cut doc: add new whatsnew entry for v0.23.0 fix: checkout existing upstream v0.22.0

allComputableThings · 2019-03-30T04:48:08Z

FYI: It looks like Cython closed the source issue with const arrays in February.
cython/cython#1605

shoyer mentioned this issue Dec 14, 2017

Error when using .apply_ufunc with .groupby_bins pydata/xarray#1765

Closed

jreback added Compat pandas objects compatability with Numpy or Python functions Difficulty Intermediate Reshaping Concat, Merge/Join, Stack/Unstack, Explode and removed Effort Medium labels Dec 15, 2017

jreback added this to the Next Major Release milestone Dec 15, 2017

hexgnu mentioned this issue Dec 18, 2017

BUG make hashtable.unique support readonly arrays #18825

Merged

4 tasks

jreback modified the milestones: Next Major Release, 0.22.0 Dec 19, 2017

jreback closed this as completed in #18825 Dec 27, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pd.cut fails on readonly arrays #18773

pd.cut fails on readonly arrays #18773

shoyer commented Dec 14, 2017

INSTALLED VERSIONS

hexgnu commented Dec 18, 2017

allComputableThings commented Mar 30, 2019

pd.cut fails on readonly arrays #18773

pd.cut fails on readonly arrays #18773

Comments

shoyer commented Dec 14, 2017

Code Sample, a copy-pastable example if possible

Problem description

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS

hexgnu commented Dec 18, 2017

allComputableThings commented Mar 30, 2019

Output of `pd.show_versions()`