
Series.isin(values) raises ValueError if values is a set #12988

Closed
mao-liu opened this issue Apr 26, 2016 · 5 comments
Labels
Bug Datetime Datetime data dtype Indexing Related to indexing on series/frames, not to indexes themselves
Milestone

Comments

@mao-liu

mao-liu commented Apr 26, 2016

When trying to use Series.isin(values) for Timestamps, an exception is raised if values is a set.

import pandas as pd

d = {'Dates':[pd.Timestamp('2013-01-02'),
              pd.Timestamp('2013-01-03'),
              pd.Timestamp('2013-01-04')],
     'Num1':[1,2,3],
     'Num2':[-1,-2,-3]}

df = pd.DataFrame(data=d)

>>> df['Dates'].isin({pd.Timestamp('2013-01-04')})
ValueError: Buffer has wrong number of dimensions (expected 1, got 0)

>>> df['Dates'].isin([pd.Timestamp('2013-01-04')])
0    False
1    False
2     True
Name: Dates, dtype: bool

output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Linux
OS-release: 4.1.13-100.fc21.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_AU.utf8

pandas: 0.18.0
nose: None
pip: 8.1.1
setuptools: 20.2.2
Cython: None
numpy: 1.11.0
scipy: 0.17.0
statsmodels: None
xarray: 0.7.2
IPython: 4.1.2
sphinx: 1.3.5
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
@sinhrks sinhrks added Bug Datetime Datetime data dtype Indexing Related to indexing on series/frames, not to indexes themselves labels Apr 26, 2016
@mao-liu mao-liu changed the title Series isin set Series.isin(values) raises ValueError if values is a set Apr 26, 2016
@mao-liu
Author

mao-liu commented Apr 26, 2016

Related Stack Overflow question: http://stackoverflow.com/questions/19070194/isin-function-does-not-work-for-dates

The bug raised in that SO post has been fixed, but the discussion between @cpcloud and the poster of the answer sheds light on why .isin works differently for sets (via __hash__) and lists (via __eq__).
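
As a rough illustration of the `__hash__` versus `__eq__` distinction mentioned above, the sketch below records which special methods plain Python containers call during a membership test. The `Probe` class and its `log` attribute are hypothetical names made up for this example; it shows built-in `list` and `set` behaviour only and makes no claim about what pandas does internally.

```python
class Probe:
    """Hypothetical helper: logs every __hash__ and __eq__ call."""

    log = []  # shared across all instances so either operand's call is recorded

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        Probe.log.append("__hash__")
        return hash(self.value)

    def __eq__(self, other):
        Probe.log.append("__eq__")
        return isinstance(other, Probe) and self.value == other.value


needle = Probe(1)
as_list = [Probe(1)]
as_set = {Probe(1)}  # building the set hashes its element, so clear the log next

Probe.log.clear()
in_list = needle in as_list
list_calls = list(Probe.log)  # a list scans its items and compares with __eq__

Probe.log.clear()
in_set = needle in as_set
set_calls = list(Probe.log)  # a set hashes the needle, then confirms with __eq__

print(in_list, list_calls)
print(in_set, set_calls)
```

Both lookups find the element, but only the set lookup goes through `__hash__` first.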

@jreback
Contributor

jreback commented Apr 26, 2016

yeah, we should coerce to a list before passing.

pull-requests welcome.

@jreback jreback added this to the Next Major Release milestone Apr 26, 2016
@mao-liu
Author

mao-liu commented Apr 28, 2016

The error is raised by pd.to_datetime:

>>> import pandas as pd
>>> s = set(['20160420'])
>>> pd.to_datetime(s)
Traceback (most recent call last):
  File "/home/accounts/mliu/code/pandas/pandas/tseries/tools.py", line 402, in _convert_listlike
    values, tz = tslib.datetime_to_datetime64(arg)
  File "pandas/tslib.pyx", line 1560, in pandas.tslib.datetime_to_datetime64 (pandas/tslib.c:29286)
    def datetime_to_datetime64(ndarray[object] values):
ValueError: Buffer has wrong number of dimensions (expected 1, got 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/accounts/mliu/code/pandas/pandas/util/decorators.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/accounts/mliu/code/pandas/pandas/tseries/tools.py", line 291, in to_datetime
    unit=unit, infer_datetime_format=infer_datetime_format)
  File "/home/accounts/mliu/code/pandas/pandas/tseries/tools.py", line 420, in _to_datetime
    return _convert_listlike(arg, box, format)
  File "/home/accounts/mliu/code/pandas/pandas/tseries/tools.py", line 405, in _convert_listlike
    raise e
  File "/home/accounts/mliu/code/pandas/pandas/tseries/tools.py", line 391, in _convert_listlike
    require_iso8601=require_iso8601
  File "pandas/tslib.pyx", line 1964, in pandas.tslib.array_to_datetime (pandas/tslib.c:39434)
    cpdef array_to_datetime(ndarray[object] values, errors='raise',
ValueError: Buffer has wrong number of dimensions (expected 1, got 0)
>>> pd.to_datetime(list(s))
DatetimeIndex(['2016-04-20'], dtype='datetime64[ns]', freq=None)

@jreback
Contributor

jreback commented Apr 28, 2016

this is not a legit input to to_datetime, as its outputs are ordered (and a set is not)

I suppose the error message could be better though

@mao-liu
Author

mao-liu commented Apr 28, 2016

I added the set --> list coercion in pandas.core.algorithms.isin. See pull request.
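
For readers on an affected version, the same idea can be applied on the caller's side. The sketch below is a minimal illustration of set-to-list coercion, not the code from the pull request: `as_listlike` is a hypothetical helper name, and it only handles the built-in `set` and `frozenset` types.

```python
def as_listlike(values):
    """Hypothetical helper: turn a set or frozenset into a list.

    Anything else is returned unchanged, so lists, tuples and arrays
    pass straight through.
    """
    if isinstance(values, (set, frozenset)):
        return list(values)
    return values


# Sets become lists; other inputs are left alone.
from_set = as_listlike({"2013-01-04"})
from_frozenset = as_listlike(frozenset({"2013-01-04"}))
original = ["2013-01-03", "2013-01-04"]
from_list = as_listlike(original)

print(from_set, from_frozenset, from_list)
```

With this in place, a call such as `df['Dates'].isin(as_listlike({pd.Timestamp('2013-01-04')}))` passes a list rather than a set, which is the form that already works in the report above.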

@jreback jreback modified the milestones: 0.18.1, Next Major Release Apr 29, 2016
@jreback jreback closed this as completed in 05e734a May 1, 2016