indexing.py: "'bool' object has no attribute 'any'" with duplicate time index #17105

Closed
elDan101 opened this issue Jul 28, 2017 · 2 comments · Fixed by #20939
Labels: Bug, Indexing (related to indexing on series/frames, not to indexes themselves)

@elDan101
Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

# Five daily timestamps: 2017-01-01 through 2017-01-05
trange = pd.date_range(start=pd.Timestamp(year=2017, month=1, day=1), end=pd.Timestamp(year=2017, month=1, day=5))
# make a duplicate
trange = trange.insert(loc=5, item=pd.Timestamp(year=2017, month=1, day=5))

df = pd.DataFrame(0, index=trange, columns=["A", "B"])
bool_idx = np.array([True, False, False, False, False, True])
# On pandas 0.20.1 this label-based set raises the AttributeError
df.loc[trange[bool_idx], "A"] += 1

Throws error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-32-b0e70145e9a6> in <module>()
----> 1 df.loc[trange[bool_idx], "A"] += 1

/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py in __setitem__(self, key, value)
    176         else:
    177             key = com._apply_if_callable(key, self.obj)
--> 178         indexer = self._get_setitem_indexer(key)
    179         self._setitem_with_indexer(indexer, value)
    180 

/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py in _get_setitem_indexer(self, key)
    155         if isinstance(key, tuple):
    156             try:
--> 157                 return self._convert_tuple(key, is_setter=True)
    158             except IndexingError:
    159                 pass

/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py in _convert_tuple(self, key, is_setter)
    222                 if i >= self.obj.ndim:
    223                     raise IndexingError('Too many indexers')
--> 224                 idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
    225                 keyidx.append(idx)
    226         return tuple(keyidx)

/usr/local/lib/python3.6/site-packages/pandas/core/indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
   1228 
   1229                 mask = check == -1
-> 1230                 if mask.any():
   1231                     raise KeyError('%s not in index' % objarr[mask])
   1232 

AttributeError: 'bool' object has no attribute 'any'

Problem description

At first I wasn't aware that I had a duplicated index in my code; I only realised this when trying to reproduce the error in the example above. I used a time index because my code has one too.

The problem is in indexing.py (lines 1229 and 1230 in the traceback above), where the variable mask is a Python builtin bool (instead of a numpy array) and therefore does not support the method .any().
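To illustrate the type mismatch, here is a minimal sketch that does not touch pandas internals. The names `check_array`, `array_mask` and `scalar_mask` are made up for the illustration, and the `np.asarray` wrapper is only one possible defensive pattern, not necessarily what the eventual fix does.

```python
import numpy as np

# Comparing an ndarray with -1 is elementwise and yields a boolean ndarray,
# which has an .any() method.
check_array = np.array([0, -1, 4])
array_mask = check_array == -1
found_missing = array_mask.any()

# A plain Python bool (what the traceback implies `mask` was) has no .any().
scalar_mask = False
scalar_has_any = hasattr(scalar_mask, "any")

# Hypothetical defensive pattern: coerce to an ndarray before calling .any().
safe_result = np.asarray(scalar_mask).any()
```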

Expected Output

I am not sure if the code is legal with duplicates (I will filter duplicates in my code now). Nonetheless, I think a decent error message should be raised in such a case.
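For reference, a minimal sketch of how the duplicates could be detected and filtered before indexing, assuming that keeping the first occurrence of each timestamp is acceptable. The variable names `has_duplicates`, `dup_mask` and `deduped` are just for illustration.

```python
import pandas as pd

# Same index as in the example above: 2017-01-05 appears twice.
trange = pd.date_range("2017-01-01", "2017-01-05")
trange = trange.insert(5, pd.Timestamp("2017-01-05"))
df = pd.DataFrame(0, index=trange, columns=["A", "B"])

# Cheap check for whether any index label is repeated.
has_duplicates = not df.index.is_unique

# Boolean array marking every repeat after the first occurrence.
dup_mask = df.index.duplicated(keep="first")

# Keep only the first row for each index label.
deduped = df[~dup_mask]
```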

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.8.0-58-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.1
pytest: None
pip: 9.0.1
setuptools: 28.8.0
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

@gfyoung
Member

gfyoung commented Jul 28, 2017

> I am not sure if the code is legal with duplicates (I will filter duplicates in my code now). Nonetheless, I think a decent error message should be raised in such a case.

We generally discourage having duplicates, whether in the index or in the columns. That being said, I don't know exactly where we stand in terms of supporting duplicates like this. So we would either have to support them better OR raise a better error message, as you said.

cc @jreback @jorisvandenbossche

gfyoung added the Indexing label (related to indexing on series/frames, not to indexes themselves) Jul 28, 2017
@jorisvandenbossche
Member

I don't think there is a reason that this shouldn't work. Duplicate indices are not supported for certain operations (e.g. reindexing) and are often not advisable, but they are certainly supported for basic indexing.
BTW, getting also works fine:

In [139]: df.loc[trange[bool_idx], "A"]
Out[139]: 
2017-01-01    0
2017-01-05    0
2017-01-05    0
Name: A, dtype: int64

so only when setting the value it fails.
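A possible workaround for the setting case is to select rows with a boolean mask instead of a list of labels. This is a sketch only: it has not been checked against pandas 0.20.1, and `row_mask` is a name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the original report (duplicate 2017-01-05).
trange = pd.date_range("2017-01-01", "2017-01-05")
trange = trange.insert(5, pd.Timestamp("2017-01-05"))
df = pd.DataFrame(0, index=trange, columns=["A", "B"])
bool_idx = np.array([True, False, False, False, False, True])

# Boolean row mask: True wherever the index label is one of the wanted labels.
# Both rows labelled 2017-01-05 match, mirroring what label-based getting returns.
row_mask = df.index.isin(trange[bool_idx])

# Setting through the boolean mask avoids converting labels to positions.
df.loc[row_mask, "A"] += 1
```

Note that every row carrying a selected label is incremented, including both duplicates, which matches the result of the integer-index example.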

Further, the equivalent example using an integer index also seems to work correctly:

In [146]: df = pd.DataFrame(0, index=[1,2,3,4,5,5], columns=['A', 'B'])

In [147]: df
Out[147]: 
   A  B
1  0  0
2  0  0
3  0  0
4  0  0
5  0  0
5  0  0

In [148]: df.loc[[1, 5], 'A'] += 1

In [149]: df
Out[149]: 
   A  B
1  1  0
2  0  0
3  0  0
4  0  0
5  1  0
5  1  0

elDan101 changed the title from "indexing.py: 'bool' object has no attribute 'any' with duplicate index" to "indexing.py: "'bool' object has no attribute 'any'" with duplicate time index" Jul 28, 2017
fersarr pushed a commit to fersarr/pandas that referenced this issue May 3, 2018
jreback added this to the 0.23.0 milestone May 3, 2018