BUG: import of maybe_convert_indices in pandas.core.index.py, #10610 #10872

Closed
wants to merge 9 commits into from
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.17.0.txt
@@ -785,4 +785,5 @@ Bug Fixes
- Bug in ``iloc`` allowing memory outside bounds of a Series to be accessed with negative integers (:issue:`10779`)
- Bug in ``read_msgpack`` where encoding is not respected (:issue:`10580`)
- Bug preventing access to the first index when using ``iloc`` with a list containing the appropriate negative integer (:issue:`10547`, :issue:`10779`)
+- Bug in ``pd.Index`` when passing a list of indices to a mixed-integer index (:issue:`10610`)
- Bug in ``TimedeltaIndex`` formatter causing error while trying to save ``DataFrame`` with ``TimedeltaIndex`` using ``to_csv`` (:issue:`10833`)
9 changes: 6 additions & 3 deletions pandas/core/index.py
@@ -973,9 +973,12 @@ def _convert_list_indexer(self, keyarr, kind=None):
                 indexer = self.get_indexer(keyarr)
                 if (indexer >= 0).all():
                     return indexer
-
-                from pandas.core.indexing import _maybe_convert_indices
-                return _maybe_convert_indices(indexer, len(self))
+                # get_indexer flags missing labels with -1; remap them to
+                # len(self) so that maybe_convert_indices raises IndexError
+                # instead of wrapping -1 around to the last element
+                indexer[indexer < 0] = len(self)
+                from pandas.core.indexing import maybe_convert_indices
+                return maybe_convert_indices(indexer, len(self))
Contributor

This is already done inside maybe_convert_indices. Why do you think it's needed here?

Author

It is necessary because maybe_convert_indices adds the length of the array to negative indices before it checks bounds. get_indexer marks labels that are not found in the index with -1, so those positions silently resolve to the last row or column instead of producing NaNs or raising an exception. With the line I added, the bounds check in maybe_convert_indices raises an IndexError for labels that are not found.
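
To make the wraparound concrete, here is a minimal pure-Python sketch. `lookup` and `wrap_and_check` are hypothetical stand-ins for `get_indexer` and `maybe_convert_indices`, not the pandas implementations; they only mimic the behaviour described in this comment, and the labels are borrowed from the test added in this PR.

```python
def lookup(labels, keys):
    """Hypothetical stand-in for get_indexer: position of each key, -1 if missing."""
    positions = {label: pos for pos, label in enumerate(labels)}
    return [positions.get(key, -1) for key in keys]


def wrap_and_check(indices, n):
    """Hypothetical stand-in for maybe_convert_indices (simplified)."""
    # Negative positions count from the end, so n is added to them first.
    wrapped = [i + n if i < 0 else i for i in indices]
    # Only after wrapping is anything outside [0, n) rejected.
    if any(i < 0 or i >= n for i in wrapped):
        raise IndexError("indices are out-of-bounds")
    return wrapped


labels = ["a", 20, 21, 22, 23]
indexer = lookup(labels, [22, 26, -8])  # 26 and -8 are not labels

# Without the fix, each -1 "missing" flag wraps around to the last position.
unpatched = wrap_and_check(indexer, len(labels))

# With the fix, the flags are moved out of range before the bounds check.
patched_input = [len(labels) if i < 0 else i for i in indexer]
try:
    wrap_and_check(patched_input, len(labels))
    raised = False
except IndexError:
    raised = True
```

In this sketch the unpatched path returns valid-looking positions for labels that do not exist, which is the silent failure the added line is meant to prevent.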


             elif not self.inferred_type == 'integer':
                 return keyarr
5 changes: 5 additions & 0 deletions pandas/tests/test_index.py
@@ -1729,6 +1729,11 @@ def test_equals_op_multiindex(self):
df.index == index_a
tm.assert_numpy_array_equal(index_a == mi3, np.array([False, False, False]))

+    def test_multitype_list_index_access(self):
+        df = pd.DataFrame(np.random.random((10, 5)), columns=["a"] + [20, 21, 22, 23])
Contributor

Add the issue number as a comment here.

+        with self.assertRaises(IndexError):
+            vals = df[[22, 26, -8]]
+        self.assertEqual(df[21].shape[0], df.shape[0])

class TestCategoricalIndex(Base, tm.TestCase):
_holder = CategoricalIndex