NotImplementedError: > 1 ndim Categorical raised when array is read-only #15860

Closed
rvernica opened this issue Apr 1, 2017 · 1 comment
Labels
Compat pandas objects compatability with Numpy or Python functions Duplicate Report Duplicate issue or pull request

Comments

@rvernica
Contributor

rvernica commented Apr 1, 2017

Code Sample, a copy-pastable example if possible

>>> import struct
>>> struct.pack('qq', 1, 2)
'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00'
>>> buf = struct.pack('qq', 1, 2)
>>> import numpy
>>> ar = numpy.frombuffer(buf, dtype=numpy.int64)
>>> ar
array([1, 2])
>>> import pandas
>>> pandas.MultiIndex.from_arrays([ar, numpy.array([10, 20])])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../.local/lib/python2.7/site-packages/pandas/indexes/multi.py", line 935, in from_arrays
    labels, levels = _factorize_from_iterables(arrays)
  File ".../.local/lib/python2.7/site-packages/pandas/core/categorical.py", line 2068, in _factorize_from_iterables
    return map(list, lzip(*[_factorize_from_iterable(it) for it in iterables]))
  File ".../.local/lib/python2.7/site-packages/pandas/core/categorical.py", line 2040, in _factorize_from_iterable
    cat = Categorical(values, ordered=True)
  File ".../.local/lib/python2.7/site-packages/pandas/core/categorical.py", line 300, in __init__
    raise NotImplementedError("> 1 ndim Categorical are not "
NotImplementedError: > 1 ndim Categorical are not supported at this time

In-depth look:

>>> pandas.core.categorical.factorize(ar, sort = True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../.local/lib/python2.7/site-packages/pandas/core/algorithms.py", line 313, in factorize
    labels = table.get_labels(vals, uniques, 0, na_sentinel, True)
  File "pandas/src/hashtable_class_helper.pxi", line 485, in pandas.hashtable.Int64HashTable.get_labels (pandas/hashtable.c:9966)
  File "stringsource", line 644, in View.MemoryView.memoryview_cwrapper (pandas/hashtable.c:32223)
  File "stringsource", line 345, in View.MemoryView.memoryview.__cinit__ (pandas/hashtable.c:28458)
ValueError: buffer source array is read-only
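
A minimal sketch of where the read-only flag comes from, and one possible workaround. It assumes Python 3, where `struct.pack` returns immutable `bytes`; the variable name `writable` is illustrative only. Copying the array gives it its own writable memory, which should avoid the `ValueError` above, though this was not verified against pandas 0.19.2.

```python
import struct

import numpy as np

# struct.pack returns an immutable bytes object, so the array that
# numpy.frombuffer builds on top of it cannot be written to.
buf = struct.pack('qq', 1, 2)
ar = np.frombuffer(buf, dtype=np.int64)
print(ar.flags.writeable)        # False

# Workaround sketch: copy the data into memory owned by a new array.
writable = ar.copy()
print(writable.flags.writeable)  # True
print(writable.tolist())         # [1, 2]
```

The copy costs one extra allocation, which is negligible for index-sized arrays.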

Problem description

If a NumPy array backed by a read-only buffer is passed to MultiIndex.from_arrays, a confusing NotImplementedError: > 1 ndim Categorical are not supported at this time exception is raised.

A closer look at the code shows that, for a NumPy array backed by a read-only buffer, the Categorical constructor (which raises the NotImplementedError) calls factorize, which in turn raises ValueError: buffer source array is read-only.

Maybe the read-only error raised by factorize should propagate to the user so that the real cause of the failure is known. As currently implemented, the Categorical constructor reinterprets the exception and raises a confusing NotImplementedError: > 1 ndim Categorical are not supported at this time.
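
To illustrate the difference between the two behaviours, here is a pure-Python sketch. It is not the pandas source: `factorize_sketch`, `build_masking`, and `build_propagating` are hypothetical stand-ins, and the `writable` argument simply simulates the read-only buffer check.

```python
def factorize_sketch(values, writable):
    """Hypothetical stand-in for factorize: rejects read-only input."""
    if not writable:
        raise ValueError("buffer source array is read-only")
    uniques = sorted(set(values))
    return [uniques.index(v) for v in values], uniques


def build_masking(values, writable):
    """Reinterprets any ValueError, hiding the real cause (the reported behaviour)."""
    try:
        return factorize_sketch(values, writable)
    except ValueError:
        raise NotImplementedError(
            "> 1 ndim Categorical are not supported at this time")


def build_propagating(values, writable):
    """Lets the original ValueError reach the caller (the expected behaviour)."""
    return factorize_sketch(values, writable)


for build in (build_masking, build_propagating):
    try:
        build([1, 2], writable=False)
    except (ValueError, NotImplementedError) as exc:
        print(type(exc).__name__ + ":", exc)
# NotImplementedError: > 1 ndim Categorical are not supported at this time
# ValueError: buffer source array is read-only
```

A narrower fix along the same lines would be to reinterpret the exception only when the input really has more than one dimension.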

Expected Output

ValueError: buffer source array is read-only

Output of pd.show_versions()

>>> pandas.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.6-100.fc24.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 20.1.1
Cython: 0.25.2
numpy: 1.11.0
scipy: 0.16.1
statsmodels: None
xarray: None
IPython: 5.3.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 0.6.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.2rc2
openpyxl: None
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: None
lxml: None
bs4: 4.4.0
html5lib: 1.0b7
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8.1
boto: 2.45.0
pandas_datareader: 0.3.0.post

@jreback
Contributor

jreback commented Apr 1, 2017

Duplicate of #15286, with the cause in #12813. It's actually not that hard to fix. Pull requests are welcome, though I will get to it after 0.20.0 is released.

@jreback jreback closed this as completed Apr 1, 2017
@jreback jreback added Compat pandas objects compatability with Numpy or Python functions Duplicate Report Duplicate issue or pull request labels Apr 1, 2017
@jreback jreback added this to the No action milestone Apr 1, 2017