BUG: __getitem__ on MultiIndex with empty slice raises #15454

Closed
TomAugspurger opened this issue Feb 18, 2017 · 1 comment · Fixed by #31171
Labels: good first issue, Needs Tests

@TomAugspurger
Contributor

Code Sample, a copy-pastable example if possible

In [26]: df = pd.DataFrame(0, index=range(2), columns=pd.MultiIndex.from_product([[1], [2]]))

In [27]: df
Out[27]:
   1
   2
0  0
1  0

In [28]: df[[]]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-28-13353c6d99d3> in <module>()
----> 1 df[[]]

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/frame.py in __getitem__(self, key)
   2014         if isinstance(key, (Series, np.ndarray, Index, list)):
   2015             # either boolean or fancy integer index
-> 2016             return self._getitem_array(key)
   2017         elif isinstance(key, DataFrame):
   2018             return self._getitem_frame(key)

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/frame.py in _getitem_array(self, key)
   2058             return self.take(indexer, axis=0, convert=False)
   2059         else:
-> 2060             indexer = self.loc._convert_to_indexer(key, axis=1)
   2061             return self.take(indexer, axis=1, convert=True)
   2062

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
   1216                 # this is not the most robust, but...
   1217                 if (isinstance(labels, MultiIndex) and
-> 1218                         not isinstance(objarr[0], tuple)):
   1219                     level = 0
   1220                     _, indexer = labels.reindex(objarr, level=level)

IndexError: index 0 is out of bounds for axis 0 with size 0
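
The traceback ends at `objarr[0]`, which is evaluated even when the key is empty. As a minimal sketch of that failure mode (not the actual change made in pandas), the snippet below reproduces the `IndexError` on an empty NumPy array and shows a length check that avoids it. The helper name `first_element_is_tuple` is hypothetical and exists only for illustration.

```python
import numpy as np


def first_element_is_tuple(objarr):
    """Hypothetical guarded form of `isinstance(objarr[0], tuple)`.

    The length is checked first, so an empty key never reaches `objarr[0]`.
    """
    return len(objarr) > 0 and isinstance(objarr[0], tuple)


empty_key = np.asarray([], dtype=object)

# Unguarded access raises the same error class as in the traceback.
try:
    empty_key[0]
    unguarded_error = None
except IndexError as exc:
    unguarded_error = exc

# The guarded check reports False for an empty key instead of raising.
guarded_result = first_element_is_tuple(empty_key)
```

With a guard like this, an empty key would skip the level-0 branch and fall through to whatever generic path handles list-like keys.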

Problem description

This is inconsistent with our other indexers, which all return an empty DataFrame / Series.

  1. Regular index in the columns or index
In [30]: df.T[[]]
Out[30]:
Empty DataFrame
Columns: []
Index: [(1, 2)]
In [32]: df.loc[[]]
Out[32]:
Empty DataFrame
Columns: [(1, 2)]
Index: []
  2. MultiIndex in the index with .loc
In [34]: df.T.loc[[]]
Out[34]:
Empty DataFrame
Columns: [0, 1]
Index: []
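
The three comparison cases above can be collected into one script. This is only a sketch that rebuilds the frame from the code sample and records the shape each indexer returns. The `shapes` dict is an illustrative name, and the shapes in the comments are read off the outputs quoted above.

```python
import pandas as pd

# Same frame as in the code sample: a single MultiIndex column (1, 2).
df = pd.DataFrame(0, index=range(2), columns=pd.MultiIndex.from_product([[1], [2]]))

shapes = {
    # Regular index in the columns: one row (1, 2), no columns.
    "df.T[[]]": df.T[[]].shape,
    # Regular index in the index: no rows, one column (1, 2).
    "df.loc[[]]": df.loc[[]].shape,
    # MultiIndex in the index with .loc: no rows, columns [0, 1].
    "df.T.loc[[]]": df.T.loc[[]].shape,
}
```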

A probably related bug: while df.loc[:, []] works as expected, __setitem__ raises with the same error:

In [65]: df.loc[:, []]  # OK
Out[65]:
Empty DataFrame
Columns: []
Index: [0, 1]
In [66]: df.loc[:, []] = 10  # should be a no-op really
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-66-6cd48eddd6a3> in <module>()
----> 1 df.loc[:, []] = 10

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/indexing.py in __setitem__(self, key, value)
    146         else:
    147             key = com._apply_if_callable(key, self.obj)
--> 148         indexer = self._get_setitem_indexer(key)
    149         self._setitem_with_indexer(indexer, value)
    150

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/indexing.py in _get_setitem_indexer(self, key)
    125         if isinstance(key, tuple):
    126             try:
--> 127                 return self._convert_tuple(key, is_setter=True)
    128             except IndexingError:
    129                 pass

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/indexing.py in _convert_tuple(self, key, is_setter)
    192                 if i >= self.obj.ndim:
    193                     raise IndexingError('Too many indexers')
--> 194                 idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
    195                 keyidx.append(idx)
    196         return tuple(keyidx)

/Users/tom.augspurger/Envs/py3/lib/python3.6/site-packages/pandas/pandas/core/indexing.py in _convert_to_indexer(self, obj, axis, is_setter)
   1216                 # this is not the most robust, but...
   1217                 if (isinstance(labels, MultiIndex) and
-> 1218                         not isinstance(objarr[0], tuple)):
   1219                     level = 0
   1220                     _, indexer = labels.reindex(objarr, level=level)

IndexError: index 0 is out of bounds for axis 0 with size 0

Expected Output

In [47]: pd.DataFrame([], index=[0, 1])
Out[47]:
Empty DataFrame
Columns: []
Index: [0, 1]

I think that's right: all the columns should be dropped.
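
Since the issue carries the Needs Tests label, here is a sketch of what a regression test for the expected output could look like. The test name is hypothetical, and the issue does not say what the test added by the fixing PR looks like. It only checks that every column is dropped and the row index is kept, and it assumes a pandas version in which `df[[]]` no longer raises.

```python
import pandas as pd


def test_getitem_empty_list_multiindex_columns():
    # Hypothetical test name; the frame is the one from the code sample.
    df = pd.DataFrame(
        0, index=range(2), columns=pd.MultiIndex.from_product([[1], [2]])
    )

    result = df[[]]

    # All columns dropped, both rows kept.
    assert result.shape == (2, 0)
    assert list(result.index) == [0, 1]

    # df.loc[:, []] already returned an empty frame in the report,
    # so the two spellings should agree on shape.
    assert df.loc[:, []].shape == result.shape
```

The assertions compare shapes and index labels rather than calling a frame-equality helper, because the empty result may keep a MultiIndex for its columns while `pd.DataFrame([], index=[0, 1])` does not.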

Output of pd.show_versions()


INSTALLED VERSIONS

commit: 0ceb40f
python: 3.6.0.final.0
python-bits: 64
OS: Darwin
OS-release: 16.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.0+466.g0ceb40fde.dirty
pytest: 3.0.5
pip: 9.0.1
setuptools: 32.3.0
Cython: 0.25.2
numpy: 1.12.0
scipy: 0.18.1
xarray: None
IPython: 5.2.2
sphinx: 1.5.2
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: None
numexpr: 2.6.1
feather: None
matplotlib: 2.0.0
openpyxl: None
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.9.5
s3fs: 0.0.8
pandas_datareader: None

@TomAugspurger added the Difficulty Intermediate, Indexing, and MultiIndex labels Feb 18, 2017
@TomAugspurger added this to the 0.20.0 milestone Feb 18, 2017
@jreback modified the milestones: 0.20.0, Next Major Release Mar 23, 2017
@mroeschke
Member

Looks to work on master. Could use a test.

In [266]: pd.DataFrame([], index=[0, 1])
Out[266]:
Empty DataFrame
Columns: []
Index: [0, 1]

In [267]: pd.__version__
Out[267]: '0.26.0.dev0+593.g9d45934af'

@mroeschke added the good first issue and Needs Tests labels and removed the Difficulty Intermediate, Indexing, and MultiIndex labels Oct 21, 2019
@jreback modified the milestones: Contributions Welcome, 1.1 Jan 21, 2020