Unexpected results when filtering with .isin (some fields contain python datastructures) #20883
Comments
I have a different output:
In [7]: df.content.isin(v)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
TypeError: unhashable type: 'list'
The above exception was the direct cause of the following exception:
SystemError Traceback (most recent call last)
<ipython-input-7-5a60788e7bc7> in <module>()
----> 1 df.content.isin(v)
~/sandbox/pandas/pandas/core/series.py in isin(self, values)
3576 Name: animal, dtype: bool
3577 """
-> 3578 result = algorithms.isin(self, values)
3579 return self._constructor(result, index=self.index).__finalize__(self)
3580
~/sandbox/pandas/pandas/core/algorithms.py in isin(comps, values)
444 comps = comps.astype(object)
445
--> 446 return f(comps, values)
447
448
~/sandbox/pandas/pandas/core/algorithms.py in <lambda>(x, y)
419
420 # faster for larger cases to use np.in1d
--> 421 f = lambda x, y: htable.ismember_object(x, values)
422
423 # GH16012
~/sandbox/pandas/pandas/_libs/hashtable_func_helper.pxi in pandas._libs.hashtable.ismember_object()
470
471 kh_destroy_pymap(table)
--> 472 return result.view(np.bool_)
473
474
SystemError: <built-in method view of numpy.ndarray object at 0x1078a93f0> returned a result with an error set
In general, nested data like this aren't well supported at the moment. The upcoming 0.23 release is laying some groundwork to better support this, but it'll take some time. |
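Until nested values are handled by .isin, one way to sidestep the hashing step is to build the boolean mask with plain Python membership tests. The sketch below is only an illustration: the frame and the list v are made-up stand-ins for the original example (which is not shown here), and the approach is slower than .isin on large columns.

```python
import pandas as pd

# Hypothetical data: one cell holds a list, the others hold strings.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "content": [["a", "b"], "whats going on", "hello"],
    }
)

# Hypothetical lookup values; the nested list is unhashable.
v = ["whats going on", ["a", "b", "c"]]

# `cell in v` on a Python list compares with ==, so nothing is hashed
# and a list-valued cell or lookup value cannot raise "unhashable type".
mask = pd.Series([cell in v for cell in df["content"]], index=df.index)

filtered = df[mask]
print(filtered)
```

Because the comparison is done element by element in Python, this trades speed for robustness; it is meant for columns where mixed or nested cells cannot be avoided.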
Similar issue: with a single value in sdf.id.values the following error occurs; with two or more values there is no error. (Pdb) df.isin(sdf.id.values) |
Still not working in pandas version '0.24.2'. I am getting the same error as @TomAugspurger using Python 3.7.3. It worked perfectly in Python 2.7.15. Any idea how to sort this out? |
I don't think anyone has investigated deeply. Could you @javi-clear-image-ai? |
I did (a bit), but without much luck. I ended up moving from pandas to NumPy (df.values) and working with the NumPy array. That worked for me, so it is the workaround I would suggest for the moment. |
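The comment does not say how the filtering was done on the NumPy side, so the following is only a guess at what such a workaround could look like. The data and the list v are hypothetical, and .to_numpy() is used as the current equivalent of df.values.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a list-valued cell in the "content" column.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "content": [["a", "b"], "whats going on", "hello"],
    }
)
v = ["whats going on", ["a", "b", "c"]]  # hypothetical lookup values

# Mixed int/object columns become a 2-D object array.
rows = df.to_numpy()
content = rows[:, 1]

# Build the mask with == based membership, so no value is ever hashed.
keep = np.fromiter((cell in v for cell in content), dtype=bool, count=len(content))

selected = rows[keep]
print(selected)
```

The result is a plain object array, so column names and the index are lost; they would have to be reattached if a DataFrame is needed afterwards.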
Simpler test case:
(so unrelated to indexing, or |
The problem in my case was because my column instead to be an I just pass
This solved my problem. |
I think I'm having a similar issue. I have the following filter
But when using
Using Also of note, for the sake of it I tried updating the filter value from a list to just the string I wanted, "coachListing", and received the same Finally, this column was originally an "object" dtype and I converted it to a string; the error happened in both cases. Any help or insight is much appreciated! |
Using pandas 1.2.4 with Python 3.9.2, it's still not possible to use I think most people would expect |
The original issue looks to work on master. Could use a test
|
Using pandas 1.4.1 with Python 3.10.2, the issue seems to be resolved. The original example code as well as the minimal example I posted earlier ( |
take |
Code Sample, a copy-pastable example if possible
Problem description
The first print statement executes successfully, filtering to the single row 'id': 2, 'content': u'whats going on'. However, the second filter throws an error, even though the only difference is the length of one of the elements in the list v.
Output for the code snippet above:
pandas: 0.22.0
pytest: 2.9.2
pip: 9.0.1
setuptools: 36.4.0
Cython: None
numpy: 1.14.2
scipy: 0.18.1
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: 1.0.5
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None