BUG: loc raises inconsistent error on unsorted MultiIndex #12790

Closed
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.18.1.txt
@@ -233,3 +233,4 @@ Bug Fixes
- Bug in ``.describe()`` resets categorical columns information (:issue:`11558`)
- Bug where ``loffset`` argument was not applied when calling ``resample().count()`` on a timeseries (:issue:`12725`)
- ``pd.read_excel()`` now accepts path objects (e.g. ``pathlib.Path``, ``py.path.local``) for the file path, in line with other ``read_*`` functions (:issue:`12655`)
- Bug in ``.loc`` raises inconsistent error when called on an unsorted ``MultiIndex`` (:issue:`12660`)
14 changes: 14 additions & 0 deletions pandas/indexes/multi.py
@@ -1595,6 +1595,8 @@ def get_loc_level(self, key, level=0, drop_level=True):
        ----------
        key : label or tuple
        level : int/level name or list thereof
        drop_level : bool
            drop a level from the index if only a single element is selected
Contributor

what is this for?


        Returns
        -------
@@ -1638,6 +1640,18 @@ def maybe_droplevels(indexer, levels, drop_level):
        if isinstance(key, list):
            key = tuple(key)

        # must be lexsorted to at least as many levels as the level parameter,
        # or the number of items in the key tuple.
        # Note: level is 0-based
        required_lexsort_depth = level + 1
        if isinstance(key, tuple):
            required_lexsort_depth = max(required_lexsort_depth, len(key))
        if self.lexsort_depth < required_lexsort_depth:
            raise KeyError('MultiIndex Slicing requires the index to be '
Contributor

I think this error message could be clearer, something like:

'MultiIndex slicing requires the index to be fully lexsorted up to level {required}, lexsort depth is currently {current}'

Just make it clearer which number is the required one and which one is the current.

                           'fully lexsorted tuple len ({0}), lexsort depth '
                           '({1})'.format(required_lexsort_depth,
                                          self.lexsort_depth))

        if isinstance(key, tuple) and level == 0:

            try:
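
The depth check added in this hunk can be read in isolation. The following is a minimal standalone sketch, not the pandas implementation: `required_lexsort_depth` and `check_lexsort_depth` are hypothetical helper names, `level` is assumed to be a single 0-based integer, and the message wording follows the reviewer's suggestion above rather than the text in the diff.

```python
def required_lexsort_depth(key, level=0):
    """Return how many levels must be lexsorted for this lookup.

    Hypothetical helper for illustration; not part of pandas.
    """
    # level is 0-based, so addressing level N needs N + 1 sorted levels
    required = level + 1
    # a tuple key addresses one level per element
    if isinstance(key, tuple):
        required = max(required, len(key))
    return required


def check_lexsort_depth(key, level, current_depth):
    """Raise KeyError when the index is not sorted deeply enough."""
    required = required_lexsort_depth(key, level)
    if current_depth < required:
        raise KeyError(
            'MultiIndex slicing requires the index to be fully lexsorted '
            'up to level {required}, lexsort depth is currently '
            '{current}'.format(required=required, current=current_depth))
    return required
```

With named placeholders the message states which number is the requirement and which is the current depth, which is the point of the review comment.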
30 changes: 30 additions & 0 deletions pandas/tests/indexing/test_indexing.py
@@ -2300,6 +2300,36 @@ def f():
                'lexsorted tuple len \(2\), lexsort depth \(0\)'):
            df.loc[(slice(None), df.loc[:, ('a', 'bar')] > 5), :]

    def test_multiindex_slicers_raise_key_error(self):

        # GH6134
        # Test that mi slicers raise a KeyError with the proper error message
        # on unsorted indices regardless of the invocation method
        iterables1 = [['a', 'b'], [2, 1]]
        iterables2 = [['c', 'd'], [4, 3]]
        rows = pd.MultiIndex.from_product(iterables1,
                                          names=['row1', 'row2'])
        columns = pd.MultiIndex.from_product(iterables2,
                                             names=['col1', 'col2'])
        df = pd.DataFrame(np.random.randn(4, 4), index=rows, columns=columns)

        # In this example rows are not sorted at all,
        # columns are sorted to the first level
        self.assertEqual(df.index.lexsort_depth, 1)
        self.assertEqual(df.columns.lexsort_depth, 0)

        with tm.assertRaisesRegexp(
                KeyError,
                'MultiIndex Slicing requires the index to be fully '
                'lexsorted tuple len \(\d\), lexsort depth \(\d\)'):
            df.loc[('a', slice(None)), 'b']

        with tm.assertRaisesRegexp(
                KeyError,
                'MultiIndex Slicing requires the index to be fully '
                'lexsorted tuple len \(\d\), lexsort depth \(\d\)'):
            df.loc['a', 'b']

    def test_multiindex_slicers_non_unique(self):

        # GH 7106
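
To make the notion of lexsort depth used in this test easier to follow, here is a pure-Python sketch of the idea: the depth is the largest k for which the index entries, truncated to their first k levels, appear in non-decreasing order. The function `lexsort_depth` below is a hypothetical helper that works on a list of tuples. pandas computes the value on integer level codes, so this is an illustration under that simplification and not the library's code.

```python
from itertools import product


def lexsort_depth(tuples, nlevels):
    """Largest k such that the first k levels are lexicographically sorted.

    Hypothetical helper for illustration; not the pandas implementation.
    """
    for k in range(nlevels, 0, -1):
        prefixes = [t[:k] for t in tuples]
        # sorted means every entry is <= the entry that follows it
        if all(a <= b for a, b in zip(prefixes, prefixes[1:])):
            return k
    return 0


# Same label order as the row index built from iterables1 in the test
rows = list(product(['a', 'b'], [2, 1]))
print(lexsort_depth(rows, 2))          # 1: first level sorted, second not
print(lexsort_depth(sorted(rows), 2))  # 2: fully sorted
```

Under this definition a two-element tuple key needs depth 2, so a lookup on the unsorted `rows` order would be rejected by the check added in `get_loc_level`.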