Traceback (most recent call last):
File "<module>", line 1, in <module>
File "pypy_stuff/pypy-latest/site-packages/pandas/core/series.py", line 1241, in unique
result = super(Series, self).unique()
File "pypy_stuff/pypy-latest/site-packages/pandas/core/base.py", line 973, in unique
result = unique1d(values)
File "pypy_stuff/pypy-latest/site-packages/pandas/core/nanops.py", line 811, in unique1d
uniques = table.unique(_ensure_object(values))
File "pandas/src/hashtable_class_helper.pxi", line 826, in pandas.hashtable.PyObjectHashTable.unique (pandas/hashtable.c:14521)
ValueError: cannot resize an array with refcheck=True on PyPy.
Use the resize function or refcheck=False
Calls to array.resize(..., refcheck=True) (note that refcheck=True is the default) check whether data reallocation is needed and, if so, check that no other object depends on the array by testing the array object's refcount. This check is unreliable on PyPy, since we do not use a reference counting garbage collector.
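A minimal sketch of the two workarounds the error message names, assuming only NumPy's public API (the variable names are illustrative and not taken from pandas). ndarray.resize(..., refcheck=False) grows the array in place and zero-fills the new entries, while the np.resize function leaves its input alone and returns a new array filled with repeated copies:

```python
import numpy as np

# An array that owns its data and has no views or other users.
values = np.arange(3)

# In-place resize that skips the refcount check. This is only safe when the
# caller knows nothing else refers to the old buffer.
values.resize(6, refcheck=False)
in_place = values.tolist()  # new entries are zero-filled

# The np.resize function does not modify its input; it returns a new array
# filled with repeated copies of the original data.
copied = np.resize(np.arange(3), 6).tolist()
```

Note that the two results differ in how the extra entries are filled, so they are not drop-in replacements for each other.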
I started a branch that simply exposes the refcheck keyword through the various places needed; the commit can be seen on my fork of pandas here.
The caller in this patch can know with certainty that there are no other users of the object, and so refcheck=False is safe. The changes are a bit intrusive, so I am opening this as an issue first hoping it generates some discussion before I issue a pull request.
@mattip I think you could just change this to refcheck=False. This is used internally to avoid holding the GIL for a variable length array (for numeric types); for object dtypes this doesn't do anything. The .resize is purely internal to the routine; ultimately this is then returned to the user.
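To illustrate why refcheck=False can be safe when the resize is purely internal, here is a rough sketch of a routine that grows a private buffer and only hands it to the caller at the end. The function unique_sketch, its initial capacity, and the int64 dtype are hypothetical choices for illustration; this is not the pandas hashtable code:

```python
import numpy as np


def unique_sketch(values, initial_capacity=4):
    """Hypothetical stand-in for an internal 'unique' routine.

    The buffer `uniques` is created here and never escapes until the
    final return, so no other object can depend on it while it is resized.
    """
    seen = set()
    uniques = np.empty(initial_capacity, dtype=np.int64)  # private buffer
    count = 0
    for v in values:
        if v in seen:
            continue
        seen.add(v)
        if count == uniques.shape[0]:
            # Buffer is full: double its capacity in place. Skipping the
            # refcount check is fine because the buffer is still private.
            uniques.resize(uniques.shape[0] * 2, refcheck=False)
        uniques[count] = v
        count += 1
    # Trim the unused tail before handing the array to the caller.
    uniques.resize(count, refcheck=False)
    return uniques


result = unique_sketch([3, 1, 3, 2, 1, 5, 6, 7, 7]).tolist()
```

Once the array has been returned, the caller may create views or other references to it, which is why the sketch does all of its resizing before the return.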
jreback added the Compat (pandas objects compatibility with Numpy or Python functions) and Internals (Related to non-user accessible pandas implementation) labels on Mar 31, 2017.
Unfortunately, the change percolates out to pure Python code where uniques is allocated; see, for instance, the diff in factorize() from algorithms.py in pull request #16193.
PyPy 5.7 can build and pip install pandas. Most of the tests pass; instructions to reproduce are here. However, the Series.unique() call shown in the traceback above does not work on PyPy.