-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Bug in loc not raising KeyError with MultiIndex containing no longer used levels #41358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Not sure if this is relevant to this PR but wanted to raise to the point that For example if you wanted to reindex a MultiIndex and insert a single key, I previously raised an issue where I thought the opposite (basing my opinion solely on the case of consistency with single indexes) but revised later after I realised it could be unworkable. |
Yes, I think I remember this discussion, but this does not really apply here, since "b" is not in the Index at all, the same as
I think the discussion was about something like:
where both elements are in the index, but the combination of both is not? |
@@ -104,3 +104,14 @@ def test_reindex_non_unique(): | |||
msg = "cannot handle a non-unique multi-index!" | |||
with pytest.raises(ValueError, match=msg): | |||
a.reindex(new_idx) | |||
|
|||
|
|||
@pytest.mark.parametrize("values", [[["a"], ["x"]], [[], []]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was user facing in a Series.reindex right? can you add the user facing part of this test as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, but is essentially the same
But my point in this context is that if a user has some value he want to slice over that is not in df.index for a single index he has two options:
Since MultiIndexes cannot be reindexed without ambiguity or without necessarily expanding the number of level combinations to fill memory (in this case adding "c" would return (c,0) (c,1) (c,2) (c,3) which is exponentially bad) then the user is restricted, in the case of MultiIndexes to:
Which means that for each level the user must compare their indexer level args and exclude those that are not present. I think this level of restriction will be more problematic than simply not returning values that don't exist? |
Ah ok, did not remember this. I think this is a discussion we should have, but the pr is as you mentioned above probably not the right place, since this makes behavior more consistent and does not really change anything. Maybe ping on the other issue were the initial discussion was? |
stumbled across a bug in MultiIndex.reindex, which caused the ValueError from the op