Commit 33e05d7

jayanthkaturi authored and jreback committed
BUG: sort_index throws IndexError for some permutations (pandas-dev#26053) (pandas-dev#26054)
1 parent 88062f7 commit 33e05d7

3 files changed, +31 -6 lines changed

doc/source/whatsnew/v0.25.0.rst

+2
@@ -390,6 +390,7 @@ Groupby/Resample/Rolling
 - Bug in :meth:`pandas.core.groupby.GroupBy.idxmax` and :meth:`pandas.core.groupby.GroupBy.idxmin` with datetime column would return incorrect dtype (:issue:`25444`, :issue:`15306`)
 - Bug in :meth:`pandas.core.groupby.GroupBy.cumsum`, :meth:`pandas.core.groupby.GroupBy.cumprod`, :meth:`pandas.core.groupby.GroupBy.cummin` and :meth:`pandas.core.groupby.GroupBy.cummax` with categorical column having absent categories, would return incorrect result or segfault (:issue:`16771`)
 
+
 Reshaping
 ^^^^^^^^^
 
@@ -402,6 +403,7 @@ Reshaping
 - Bug in :func:`merge` where merging with equivalent Categorical dtypes was raising an error (:issue:`22501`)
 - Bug in :class:`DataFrame` constructor when passing non-empty tuples would cause a segmentation fault (:issue:`25691`)
 - Bug in :func:`pandas.cut` where large bins could incorrectly raise an error due to an integer overflow (:issue:`26045`)
+- Bug in :func:`DataFrame.sort_index` where an error is thrown when a multi-indexed DataFrame is sorted on all levels with the initial level sorted last (:issue:`26053`)
 - Bug in :meth:`Series.nlargest` treats ``True`` as smaller than ``False`` (:issue:`26154`)
 
 Sparse

pandas/core/indexes/multi.py

+8 -2
@@ -2084,8 +2084,14 @@ def sortlevel(self, level=0, ascending=True, sort_remaining=True):
             shape = list(self.levshape)
 
             # partition codes and shape
-            primary = tuple(codes.pop(lev - i) for i, lev in enumerate(level))
-            primshp = tuple(shape.pop(lev - i) for i, lev in enumerate(level))
+            primary = tuple(codes[lev] for lev in level)
+            primshp = tuple(shape[lev] for lev in level)
+
+            # Reverse sorted to retain the order of
+            # smaller indices that needs to be removed
+            for lev in sorted(level, reverse=True):
+                codes.pop(lev)
+                shape.pop(lev)
 
             if sort_remaining:
                 primary += primary + tuple(codes)
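
Note on the change above: the removed lines popped each requested level at position `lev - i`, which is only a valid offset when `level` is in ascending order. The following is a minimal plain-Python sketch of that difference, not the pandas implementation. The function names `partition_old` and `partition_new` and the string stand-ins for the per-level code arrays are hypothetical, chosen only for illustration.

```python
def partition_old(codes, level):
    """Pre-fix logic: pop each level, offset by the number already popped.

    The offset `lev - i` assumes every earlier pop removed a smaller
    position, which holds only when `level` is ascending.
    """
    codes = list(codes)
    primary = tuple(codes.pop(lev - i) for i, lev in enumerate(level))
    return primary, codes


def partition_new(codes, level):
    """Post-fix logic: read the requested levels first, then remove them
    from the highest position down so earlier positions do not shift."""
    codes = list(codes)
    primary = tuple(codes[lev] for lev in level)
    for lev in sorted(level, reverse=True):
        codes.pop(lev)
    return primary, codes


# Stand-ins for the three per-level code arrays of a MultiIndex (assumption).
codes = ["A", "B", "C"]

# Ascending order: both versions agree.
print(partition_old(codes, [0, 1]))  # (('A', 'B'), ['C'])
print(partition_new(codes, [0, 1]))  # (('A', 'B'), ['C'])

# Non-ascending order: the old offset silently selects the wrong level.
print(partition_old(codes, [2, 0]))  # (('C', 'B'), ['A'])
print(partition_new(codes, [2, 0]))  # (('C', 'A'), ['B'])

# All levels with the first level last: the old offset runs out of range.
try:
    partition_old(codes, [2, 1, 0])
except IndexError as exc:
    print("IndexError:", exc)
print(partition_new(codes, [2, 1, 0]))  # (('C', 'B', 'A'), [])
```

In this sketch the `[2, 1, 0]` case corresponds to `level=['C', 'B', 'A']` in the new test, and `[2, 0]` to `level=['C', 'A']`.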

pandas/tests/frame/test_sorting.py

+21 -4
@@ -496,11 +496,28 @@ def test_sort_index_duplicates(self):
     def test_sort_index_level(self):
         mi = MultiIndex.from_tuples([[1, 1, 3], [1, 1, 1]], names=list('ABC'))
         df = DataFrame([[1, 2], [3, 4]], mi)
-        res = df.sort_index(level='A', sort_remaining=False)
-        assert_frame_equal(df, res)
 
-        res = df.sort_index(level=['A', 'B'], sort_remaining=False)
-        assert_frame_equal(df, res)
+        result = df.sort_index(level='A', sort_remaining=False)
+        expected = df
+        assert_frame_equal(result, expected)
+
+        result = df.sort_index(level=['A', 'B'], sort_remaining=False)
+        expected = df
+        assert_frame_equal(result, expected)
+
+        # Error thrown by sort_index when
+        # first index is sorted last (#26053)
+        result = df.sort_index(level=['C', 'B', 'A'])
+        expected = df.iloc[[1, 0]]
+        assert_frame_equal(result, expected)
+
+        result = df.sort_index(level=['B', 'C', 'A'])
+        expected = df.iloc[[1, 0]]
+        assert_frame_equal(result, expected)
+
+        result = df.sort_index(level=['C', 'A'])
+        expected = df.iloc[[1, 0]]
+        assert_frame_equal(result, expected)
 
     def test_sort_index_categorical_index(self):
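
The new assertions in the test above depend on the test module's imports and class. As a rough standalone sketch, the same checks can be run as a script, assuming a pandas version that includes this fix and the public `pandas.testing.assert_frame_equal` helper. The loop and the name `levels` are illustrative and not part of the commit.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Same two-row frame as in test_sort_index_level.
mi = pd.MultiIndex.from_tuples([(1, 1, 3), (1, 1, 1)], names=list("ABC"))
df = pd.DataFrame([[1, 2], [3, 4]], index=mi)

# Level orderings that the commit adds to the test (GH 26053).
for levels in (["C", "B", "A"], ["B", "C", "A"], ["C", "A"]):
    result = df.sort_index(level=levels)
    # Only level C differs between the rows, so the row with C == 1
    # is expected to come first in every case.
    expected = df.iloc[[1, 0]]
    assert_frame_equal(result, expected)
```

On a pandas build without the fix, the first two orderings would be expected to raise the `IndexError` named in the commit title instead of reaching the assertion.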
