
pd.cut fails on readonly arrays #18773


Closed
shoyer opened this issue Dec 14, 2017 · 2 comments · Fixed by #18825
Labels
Compat (pandas objects compatibility with NumPy or Python functions), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)

@shoyer
Member

shoyer commented Dec 14, 2017

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np

bins = np.arange(0, 100, 10)
bins.flags.writeable = False
pd.cut(np.arange(100), bins)

Results in:

ValueError                                Traceback (most recent call last)
<ipython-input-71-60b85be1194d> in <module>()
----> 1 pd.cut(np.arange(100), bins)

~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/core/reshape/tile.py in cut(x, bins, right, labels, retbins, precision, include_lowest)
    134                               precision=precision,
    135                               include_lowest=include_lowest,
--> 136                               dtype=dtype)
    137
    138     return _postprocess_for_cut(fac, bins, retbins, x_is_series,

~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/core/reshape/tile.py in _bins_to_cuts(x, bins, right, labels, precision, include_lowest, dtype, duplicates)
    225         return result, bins
    226
--> 227     unique_bins = algos.unique(bins)
    228     if len(unique_bins) < len(bins) and len(bins) != 2:
    229         if duplicates == 'raise':

~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/core/algorithms.py in unique(values)
    354
    355     table = htable(len(values))
--> 356     uniques = table.unique(values)
    357     uniques = _reconstruct_data(uniques, dtype, original)
    358

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.unique (pandas/_libs/hashtable.c:16341)()

~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/_libs/hashtable.cpython-36m-darwin.so in View.MemoryView.memoryview_cwrapper (pandas/_libs/hashtable.c:45205)()

~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/_libs/hashtable.cpython-36m-darwin.so in View.MemoryView.memoryview.__cinit__ (pandas/_libs/hashtable.c:41440)()

ValueError: buffer source array is read-only

Problem description

This is essentially the same problem as #10043 and #17192. It is caused by Cython issue cython/cython#1605: as the traceback shows, Cython's memoryview refuses a read-only source array.
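
The read-only flag behind this error can be inspected from plain Python, without Cython. The sketch below only illustrates how NumPy exports a read-only array through the buffer protocol; it is not the pandas or Cython code path, and it assumes that a consumer requesting a writable buffer is refused for the same underlying reason that the explicit write is refused here.

```python
import numpy as np

bins = np.arange(0, 100, 10)
bins.flags.writeable = False

# NumPy exports the array through the buffer protocol as a read-only buffer.
view = memoryview(bins)
is_readonly = view.readonly

# Writing through the read-only view is refused by Python itself.
try:
    view[0] = 1
    write_error = None
except TypeError as exc:
    write_error = exc

print(is_readonly, type(write_error).__name__)
```

Reading from `view` still works, which is why a code path that only needs read access (such as computing unique values) has no inherent reason to fail.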

Expected Output

Should not error.
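
On affected versions, one caller-side workaround is to hand pd.cut a writeable copy of the bins. This is a minimal sketch rather than an official recommendation; `as_writeable` is a hypothetical helper, not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd


def as_writeable(arr):
    """Return arr itself if it is writeable, otherwise a writeable copy.

    Hypothetical helper for illustration only.
    """
    arr = np.asarray(arr)
    return arr if arr.flags.writeable else arr.copy()


bins = np.arange(0, 100, 10)
bins.flags.writeable = False

safe_bins = as_writeable(bins)  # a copy; the original stays read-only
result = pd.cut(np.arange(100), safe_bins)
```

The copy costs one extra allocation, which is usually negligible because bin arrays tend to be small.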

Output of pd.show_versions()

In [73]: pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.4.0
Cython: 0.26
numpy: 1.13.1
scipy: 0.19.1
xarray: 0.10.0
IPython: 6.1.0
sphinx: 1.6.3
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

jreback added the Compat, Difficulty Intermediate, and Reshaping labels and removed the Effort Medium label on Dec 15, 2017
jreback added this to the Next Major Release milestone on Dec 15, 2017
hexgnu added a commit to hexgnu/pandas that referenced this issue Dec 18, 2017
This problem was brought up in
pandas-dev#18773 and effectively comes
down to how Cython deals with readonly arrays. While it would be ideal
for Cython to fix the underlying problem, in the meantime we can rely on
this workaround.
@hexgnu
Contributor

hexgnu commented Dec 18, 2017

Hey everybody, I submitted a pull request fixing this the best I can.

Unfortunately, Cython doesn't handle read-only arrays well, so the current solution is to check whether the input array is writeable and, if it is not, fall back to the ndarray type.
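
The shape of that check can be sketched in pure Python. This is an illustration of the dispatch idea only: the real change lives in pandas' Cython hash table code, and the function names and the `path` labels below are hypothetical.

```python
import numpy as np


def _unique_in_order(values):
    """Order-preserving unique; stands in for the hash-table loop."""
    return np.array(list(dict.fromkeys(values.tolist())), dtype=values.dtype)


def unique(values):
    """Pick a code path based on the array's writeable flag."""
    if values.flags.writeable:
        path = "memoryview"  # stand-in for the fast path that needs a writable buffer
    else:
        path = "ndarray"     # stand-in for the fallback that accepts read-only input
    return path, _unique_in_order(values)


data = np.array([1, 2, 2, 3, 1])
writable_result = unique(data)

data.flags.writeable = False
readonly_result = unique(data)
```

Both calls return the same unique values; only the chosen path differs, which is the point of the fallback.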

jreback changed the milestone from Next Major Release to 0.22.0 on Dec 19, 2017
hexgnu added a commit to hexgnu/pandas that referenced this issue Dec 22, 2017
This problem was brought up in
pandas-dev#18773 and effectively comes
down to how Cython deals with readonly arrays. While it would be ideal
for Cython to fix the underlying problem, in the meantime we can rely on
this workaround.

fix: updates one_to_hundred for hundred_elements

This is because arange(100) isn't actually 1 to 100... it's 0 to 99

docs: adds comment to fix using ndarray and fixes indenting

test: parametrize test for test_readonly_cut

doc: add new whatsnew entry for v0.23.0

fix: checkout existing upstream v0.22.0
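
The commit notes above mention a parametrized read-only test for cut. A rough sketch of what such a check could look like follows; it is not the test from the pull request, it uses a plain loop instead of pytest parametrization, and it assumes a pandas version that already includes the fix.

```python
import numpy as np
import pandas as pd

hundred_elements = np.arange(100)  # 0 to 99, not 1 to 100

# Each pair says whether the first and second bins arrays are writeable.
cases = [(True, True), (True, False), (False, False)]

for first_writeable, second_writeable in cases:
    bins_a = np.arange(0, 100, 10)
    bins_a.flags.writeable = first_writeable
    bins_b = np.arange(0, 100, 10)
    bins_b.flags.writeable = second_writeable

    cut_a = pd.cut(hundred_elements, bins_a)
    cut_b = pd.cut(hundred_elements, bins_b)

    # The writeable flag of the bins must not change the result.
    assert np.array_equal(cut_a.codes, cut_b.codes)
    assert cut_a.categories.equals(cut_b.categories)
```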
@allComputableThings

FYI: It looks like Cython closed the source issue with const arrays in February.
cython/cython#1605
