BUG: MultiIndex Slice returns incorrect slice (compared to .xs()) #27591

groutr · 2019-07-25T17:53:09Z

Code Sample, a copy-pastable example if possible

import pandas as pd
l1 = list('abc')
l2 = [(0, 1), (1, 0)]
l3 = [0, 1]
cols = pd.MultiIndex.from_product([l1, l2, l3], names=['x', 'y', 'z'])
df = pd.DataFrame(index=range(5), columns=cols)

# This returns the expected slice
expected = df.xs((l1[0], l2[0], l3[0]), level=[0, 1, 2], axis=1)
print(expected)

# This returns an empty Series (?!)
empty1 = df.loc[:, (l1[0], l2[0], l3[0])]
# This returns an empty Series also
empty2 = df.loc[:, pd.IndexSlice[l1[0], l2[0], l3[0]]]
print(empty1)
print(empty2)

# if we omit the l2 dim on slice, we get something back
try3 = df.loc[:, pd.IndexSlice[l1[0], :, l3[0]]]
print(try3)

# Oddly enough, we can still set values on the dataframe and it works as expected.
df.loc[:, (l1[0], l2[0], l3[0])] = list(range(10, 15))
print(df)

Problem description

As far as I can understand, all three indexing operations should provide the same result. Slicing seems to be the favored approach in the documentation, yet it doesn't return the expected result in this case.

Wrapping the tuple in a python object doesn't seem to work either, returning a KeyError.

class MyTuple(object):
    def __init__(self, data):
        self.data = data
    def __eq__(self, other):
        return self.data == other.data
    def __hash__(self):
        return hash(self.data)
    def __repr__(self):
        return repr(self.data)
    def __str__(self):
        return repr(self)

l22 = [MyTuple((0,1)), MyTuple((1, 0))]
df2 = pd.DataFrame(index=range(5), columns=pd.MultiIndex.from_product([l1, l22, l3]))
df2.loc[:, pd.IndexSlice[l1[0], l22[0], l3[0]]]

Expected Output

The expected result is a single Series with 5 NaN values. This is the value of `expected` in the example above.
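For comparison, here is a minimal sketch of one way to obtain the same column without going through `.loc`. It assumes that `MultiIndex.get_loc` resolves a full-depth key containing a tuple label to a single integer position in the pandas version at hand. This has not been checked against 0.24.2, so treat it as an illustration rather than a confirmed workaround.

```python
import pandas as pd

cols = pd.MultiIndex.from_product(
    [list("abc"), [(0, 1), (1, 0)], [0, 1]], names=["x", "y", "z"]
)
df = pd.DataFrame(index=range(5), columns=cols)

key = ("a", (0, 1), 0)

# Resolve the full key to an integer position on the column index
# (assumption: get_loc treats (0, 1) as one label, not as two labels),
# then select that column by position instead of by label.
pos = cols.get_loc(key)
column = df.iloc[:, pos]

print(column)
```

If the assumption holds, `column` is a Series of length 5 whose name is the full key, which is the shape described above.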

Output of `pd.show_versions()`

Note that pandas 0.24.2 is being used because compatibility with Python 2 is needed.

INSTALLED VERSIONS

commit: None
python: 3.7.3.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-696.10.2.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.2
pytest: 4.3.1
pip: 19.0.3
setuptools: 40.8.0
Cython: None
numpy: 1.16.2
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: 7.5.0
sphinx: None
patsy: None
dateutil: 2.8.0
pytz: 2018.9
blosc: 1.7.0
bottleneck: 1.2.1
tables: None
numexpr: None
feather: None
matplotlib: 3.0.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: 1.3.1
pymysql: 0.9.3
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

jbrockmendel · 2021-06-30T01:23:34Z

Best guess ATM is that this treats the tuple in the second level as a sequence of labels, and so goes through the MultiIndex.get_locs path.
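The ambiguity behind that guess can be sketched as follows. This only illustrates why a tuple label can be read two ways; it does not trace the actual code path inside `.loc`. The variable names are made up for the example.

```python
import pandas as pd
from pandas.api.types import is_list_like, is_scalar

cols = pd.MultiIndex.from_product(
    [list("abc"), [(0, 1), (1, 0)], [0, 1]], names=["x", "y", "z"]
)

# pd.IndexSlice only packages the key, so both .loc calls in the
# report receive the same plain tuple.
key = ("a", (0, 1), 0)
same_key = pd.IndexSlice["a", (0, 1), 0] == key

# Level "y" stores each tuple as a single label ...
y_labels = cols.levels[1].tolist()

# ... yet a tuple is also list-like and not a scalar, so an indexer
# could read (0, 1) as the two labels 0 and 1.
label_is_list_like = is_list_like(key[1])
label_is_scalar = is_scalar(key[1])

print(same_key, y_labels, label_is_list_like, label_is_scalar)
```

Neither `0` nor `1` is a label in level `y`, so a lookup that reads the tuple as a list of labels would match no columns, which is consistent with the empty result reported.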

jbrockmendel added the Indexing and MultiIndex labels Jul 26, 2019

This was referenced Jun 30, 2021

BUG: indexing with missing labels deprecation not applied to MultiIndex #39424

Open

ERR: Improve error message for Series.loc.getitem with too many dimensions #39372

Merged

BUG: .loc with MultiIndex with tuple in level GH#27591 #42329

Merged

mroeschke added the Bug label Jul 10, 2021

jreback added this to the 1.4 milestone Aug 8, 2021

jreback closed this as completed in #42329 Aug 8, 2021
