BUG: fix .loc.__setitem__ not raising when using too many indexers #44656

Merged
merged 15 commits into from
Jan 30, 2022

Member

@DriesSchaumont DriesSchaumont commented Nov 28, 2021

    df = DataFrame({"a": [10]})
    msg = "Too many indexers"
    with pytest.raises(IndexingError, match=msg):
        df["a"].loc[0, 0] = 1000
Member

Can you construct the series explicitly?

@jreback jreback added the Error Reporting (Incorrect or improved errors from pandas), Indexing (Related to indexing on series/frames, not to indexes themselves), and Bug labels Nov 28, 2021
@@ -2715,6 +2715,13 @@ def test_loc_getitem_multiindex_tuple_level():
    assert result2 == 6


def test_loc_setitem_indexer_length():
    df = DataFrame({"a": [10]})
Contributor

can you cover all of the tests in the OP (parameterize if possible)

)
def test_iloc_setitem_indexer_length(self, ser, keys):
    # GH#13831
    with pytest.raises(IndexError, match="too many indices for array"):
Member Author

Note the difference in error between ser.iloc[keys] = 1000 and ser.iloc[keys]. IndexError is the result of the fact that we let numpy handle this, while pandas raises IndexingError. I was wondering if this should be made consistent? Us catching and re-raising from numpy seems ugly.

Contributor

cc @jbrockmendel

but @DriesSchaumont I think we would want to catch that case & re-raise appropriately

Member

IIRC a while back we removed some checks (for perf) and let numpy exceptions surface.

Not knowing anything about the specific use case, in cases where we have a choice between raising IndexError vs IndexingError (i.e. either would be a reasonable choice) I'd rather raise IndexError.
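
To make the trade-off in this thread concrete, here is a minimal sketch in plain Python. It is not pandas code: `ToySeries`, `_raw_set`, `set_surfacing`, `set_reraising`, and the local `IndexingError` class are hypothetical names invented for illustration, and only the two error messages are taken from the tests quoted above. It contrasts letting the lower-level `IndexError` surface with catching it and re-raising a library-specific error.

```python
class IndexingError(Exception):
    """Local stand-in for a library-specific indexing error (hypothetical)."""


class ToySeries:
    """Hypothetical 1-D container used only to illustrate the discussion."""

    ndim = 1

    def __init__(self, values):
        self._values = list(values)

    def _raw_set(self, key, value):
        # Plays the role of the backing array: it reports problems
        # with a plain IndexError and its own wording.
        if isinstance(key, tuple):
            if len(key) > self.ndim:
                raise IndexError("too many indices for array")
            key = key[0]
        self._values[key] = value

    def set_surfacing(self, key, value):
        # Option A: do nothing special; the lower-level error propagates.
        self._raw_set(key, value)

    def set_reraising(self, key, value):
        # Option B: translate the lower-level error, but only for the
        # too-many-indexers case; other IndexErrors pass through unchanged.
        try:
            self._raw_set(key, value)
        except IndexError as err:
            if isinstance(key, tuple) and len(key) > self.ndim:
                raise IndexingError("Too many indexers") from err
            raise


def error_name(func, *args):
    """Return the class name of the exception raised by func, or None."""
    try:
        func(*args)
    except Exception as err:
        return type(err).__name__
    return None


ser = ToySeries([10])
surfaced = error_name(ser.set_surfacing, (0, 0), 1000)
reraised = error_name(ser.set_reraising, (0, 0), 1000)
out_of_range = error_name(ser.set_reraising, 5, 1000)
```

In this sketch Option B has to inspect the key a second time so that it does not relabel unrelated errors such as an out-of-range position, which is one way to read the "seems ugly" concern above.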

@jreback jreback added this to the 1.4 milestone Dec 1, 2021
Contributor

jreback commented Dec 1, 2021

@phofl @jbrockmendel good here? (ex the question)

Contributor

jreback commented Dec 5, 2021

cc @phofl @jbrockmendel if any comments

Member

@phofl phofl left a comment

small comments

        ser.loc[keys] = 1000

    with pytest.raises(IndexingError, match=msg):
        ser.loc[keys]
Member

can use indexer_sli fixture to share this test with the iloc one i think

Member Author

Done, also moved the tests.
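
The idea of sharing one test body across accessors can be sketched without pytest or pandas. Everything below is a hypothetical illustration: `ToySeries`, `_Accessor`, `TooManyIndexers`, `INDEXERS_LI`, and `raises_too_many` are invented names, and the real fixture's definition is not shown in this thread. In a pytest suite the list of accessor functions would presumably become the `params` of a fixture.

```python
class TooManyIndexers(Exception):
    """Hypothetical stand-in for the library's indexing error."""


class _Accessor:
    """Toy accessor that rejects tuple keys longer than one element."""

    def __init__(self, values):
        self._values = values

    def __getitem__(self, key):
        if isinstance(key, tuple):
            if len(key) > 1:
                raise TooManyIndexers("Too many indexers")
            key = key[0]
        return self._values[key]


class ToySeries:
    """Toy 1-D container exposing .loc and .iloc (not pandas)."""

    def __init__(self, values):
        self._values = list(values)
        self.loc = _Accessor(self._values)
        self.iloc = _Accessor(self._values)


def loc(obj):
    # Each "indexer" is a function that picks an accessor off the object.
    return obj.loc


def iloc(obj):
    return obj.iloc


# The shared parameter list: one test body, run once per accessor.
INDEXERS_LI = [loc, iloc]


def raises_too_many(indexer, key):
    """Return True if indexing a fresh ToySeries with key is rejected."""
    ser = ToySeries([10])
    try:
        indexer(ser)[key]
    except TooManyIndexers:
        return True
    return False


results = {fn.__name__: raises_too_many(fn, (0, 0)) for fn in INDEXERS_LI}
```

Because the test only ever calls `indexer(ser)[key]`, adding or dropping an accessor means editing the list rather than the test body.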

Contributor

jreback commented Dec 18, 2021

@phofl @jbrockmendel good here?


def test_ser_list_indexer_exceeds_dimensions(indexer_sli):
    # GH#13831
    if indexer_sli == tm.setitem:
Member

could make an indexer_li fixture

Member Author

Added

Contributor

jreback commented Dec 24, 2021

@phofl @jbrockmendel ok here?

Contributor

jreback commented Jan 8, 2022

@DriesSchaumont can you merge master

@DriesSchaumont
Member Author

Should I move this to 1.5?

@@ -809,6 +809,7 @@ Indexing
- Bug in :meth:`DataFrame.loc.__setitem__` changing dtype when indexer was completely ``False`` (:issue:`37550`)
- Bug in :meth:`IntervalIndex.get_indexer_non_unique` returning boolean mask instead of array of integers for a non unique and non monotonic index (:issue:`44084`)
- Bug in :meth:`IntervalIndex.get_indexer_non_unique` not handling targets of ``dtype`` 'object' with NaNs correctly (:issue:`44482`)
- Bug in :meth:`Series.loc.__setitem__` and :meth:`Series.loc.__getitem__` not raising when using multiple keys without using a :class:`MultiIndex` (:issue:`13831`)
Contributor

yeah let's move this to 1.5

Member Author

Done, I moved whatsnew

@jreback jreback modified the milestones: 1.4, 1.5 Jan 10, 2022
Contributor

jreback commented Jan 16, 2022

@DriesSchaumont if you can merge master and address comments

@DriesSchaumont
Member Author

@jreback @jbrockmendel merged master, please let me know which/if questions still need answering.

@jreback jreback merged commit 275b187 into pandas-dev:main Jan 30, 2022
Contributor

jreback commented Jan 30, 2022

thanks @DriesSchaumont very nice!

Labels
Bug, Error Reporting (Incorrect or improved errors from pandas), Indexing (Related to indexing on series/frames, not to indexes themselves)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Series.loc/iloc[x, y] does not raise exception
4 participants