BUG: import of maybe_convert_indices in pandas.core.index.py, #10610 #10872

Closed
wants to merge 9 commits into from
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.17.0.txt
@@ -701,3 +701,4 @@ Bug Fixes
- Bug in ``iloc`` allowing memory outside bounds of a Series to be accessed with negative integers (:issue:`10779`)
- Bug in ``read_msgpack`` where encoding is not respected (:issue:`10580`)
- Bug preventing access to the first index when using ``iloc`` with a list containing the appropriate negative integer (:issue:`10547`, :issue:`10779`)
+ - Bug in ``pd.Index`` when passing a list of indices to a mixed-integer index (:issue:`10610`)
6 changes: 4 additions & 2 deletions pandas/core/index.py
@@ -971,11 +971,13 @@ def _convert_list_indexer(self, keyarr, kind=None):

if self.inferred_type == 'mixed-integer':
indexer = self.get_indexer(keyarr)
+ if not np.all(np.in1d(indexer, self.values)):
Contributor

Why is this line necessary?

Author

Without this line, when passed a list containing invalid index values, the function would return the data associated with the nearest integer index value. In other cases where an invalid index is passed to a DataFrame, an IndexError is raised, so I made sure to check for that condition explicitly here.
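
The kind of explicit check described here can be illustrated with a minimal pure-Python sketch. It is not the pandas code: `lookup_positions`, `checked_positions`, and the `MISSING` sentinel are hypothetical names chosen for this example, and the sentinel value of -1 is an assumption of the sketch.

```python
MISSING = -1  # assumed sentinel meaning "label not found" (hypothetical)


def lookup_positions(labels, keys):
    """Return the position of each key in labels, or MISSING if absent."""
    positions = {label: i for i, label in enumerate(labels)}
    return [positions.get(key, MISSING) for key in keys]


def checked_positions(labels, keys):
    """Like lookup_positions, but raise IndexError if any key is absent."""
    found = lookup_positions(labels, keys)
    if any(pos == MISSING for pos in found):
        raise IndexError("At least one item not found in index.")
    return found


labels = ["a", 20, 21, 22, 23]
print(checked_positions(labels, [22, 20]))  # [3, 1]
```

Calling `checked_positions(labels, [22, 26])` raises IndexError in this sketch, because 26 is not one of the labels.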

Contributor

Your test doesn't hit that line, though; it passes through.
Can you make a test that is explicitly caught there?

+ raise IndexError("At least one item not found in index.")
if (indexer >= 0).all():
return indexer

- from pandas.core.indexing import _maybe_convert_indices
- return _maybe_convert_indices(indexer, len(self))
+ from pandas.core.indexing import maybe_convert_indices
+ return maybe_convert_indices(indexer, len(self))
Contributor

This is already done inside maybe_convert_indices. Why do you think it's needed here?

Author

It is necessary because negative indices passed into maybe_convert_indices have the length of the array added to them before the bounds check is done. Because get_indexer marks labels that are not found in the index with -1, those positions are assigned the last row or column instead of NaNs or an exception being raised. With the line I added, the bounds checking in maybe_convert_indices will now raise an IndexError for out-of-bounds indices.
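
The wraparound described in this comment can be sketched in plain Python. This is only an illustration under stated assumptions: `normalize_indices` is a hypothetical stand-in for the "add the length, then bounds-check" behavior attributed to maybe_convert_indices above, and the list comprehension that produces -1 for a missing label mimics the get_indexer convention mentioned in the comment. Neither is the pandas implementation.

```python
def normalize_indices(indices, n):
    """Add n to negative positions, then bounds-check (hypothetical sketch)."""
    shifted = [i + n if i < 0 else i for i in indices]
    if any(i < 0 or i >= n for i in shifted):
        raise IndexError("indices are out-of-bounds")
    return shifted


labels = ["a", 20, 21, 22, 23]

# Positions for the keys [22, 26]; 26 is absent, so it is reported as -1.
found = [labels.index(k) if k in labels else -1 for k in [22, 26]]
print(found)  # [3, -1]

# The -1 sentinel is shifted to the last position and passes the bounds check.
print(normalize_indices(found, len(labels)))  # [3, 4]
```

In this sketch the missing label 26 silently resolves to position 4, the last column, which is the symptom the comment describes. A position such as -8 still fails the bounds check, since -8 + 5 is negative.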


elif not self.inferred_type == 'integer':
return keyarr
5 changes: 5 additions & 0 deletions pandas/tests/test_index.py
@@ -1723,6 +1723,11 @@ def test_equals_op_multiindex(self):
df.index == index_a
tm.assert_numpy_array_equal(index_a == mi3, np.array([False, False, False]))

+ def test_multitype_list_index_access(self):
+ df = pd.DataFrame(np.random.random((10, 5)), columns=["a"] + [20, 21, 22, 23])
Contributor

Add the issue number as a comment here.

+ with self.assertRaises(IndexError):
+ vals = df[[22, 26, -8]]
+ self.assertEqual(df[21].shape[0], df.shape[0])

class TestCategoricalIndex(Base, tm.TestCase):
_holder = CategoricalIndex