API: loc indexing with nested sequence raise on missing #41975

Closed
jbrockmendel opened this issue Jun 12, 2021 · 3 comments
Labels
Bug, Needs Triage (issue that has not been reviewed by a pandas team member)

Comments

@jbrockmendel
Member

I think the current behavior is left over from the long long ago.

import pandas as pd

mi = pd.MultiIndex.from_product([range(3), ["A", "B"]])
ser = pd.Series(len(mi), index=mi)  # every value is len(mi) == 6

>>> ser.loc[1, ["A", "B", "not a key"]]
1  A    6
   B    6
dtype: int64

>>> ser.loc[[99999], ["A", "B"]]
Series([], dtype: int64)

>>> ser.loc[[99999]]
KeyError: '[99999] not in index'

I think we should deprecate silently dropping the missing labels in the first two cases and eventually raise a KeyError, like we would with any other missing label.
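To make the proposed behavior concrete, here is a minimal sketch of a wrapper that raises when any requested second-level label is absent. `loc_strict` is a hypothetical helper written for illustration, not a pandas API, and it assumes a two-level MultiIndex like the one in the example.

```python
import pandas as pd


def loc_strict(ser, first, second_labels):
    """Hypothetical helper: like ser.loc[first, second_labels], but raise
    KeyError if any label in second_labels is absent from level 1."""
    present = set(ser.index.get_level_values(1))
    missing = [label for label in second_labels if label not in present]
    if missing:
        raise KeyError(f"{missing} not in index")
    return ser.loc[first, second_labels]


mi = pd.MultiIndex.from_product([range(3), ["A", "B"]])
ser = pd.Series(len(mi), index=mi)

try:
    loc_strict(ser, 1, ["A", "B", "not a key"])
except KeyError as err:
    print("raised:", err)
```

The check runs against the labels actually observed in the index, so it raises regardless of how the installed pandas version treats the missing label.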

@jbrockmendel added the Bug and Needs Triage labels on Jun 12, 2021
@attack68
Contributor

@Pholf and I have an ongoing discussion re: this here. For what it's worth, I typed up how I believe many of the different cases should intersect, and pholf is rightfully challenging a few of those ideas.

Pertinent to your example, I propose:

ser.loc[1, ["A", "B", "not a key"]] would raise since "not a key" is not part of the level1_elements.

but

ser2 = ser.reindex(["A", "B", "not a key"], level=1)
ser2.loc[1, ["A", "B", "not a key"]]

would not raise since "not a key" is now an unused_level1_element, i.e.

>>> ser2.index.levels  
FrozenList([[0, 1, 2], ['A', 'B', 'not a key']])
>>> ser2
0  A    6
   B    6
1  A    6
   B    6
2  A    6
   B    6
dtype: int64
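The distinction between a level's declared elements and the labels actually observed in the index can be sketched as follows. This assumes the index is built directly with the `pd.MultiIndex(levels=..., codes=...)` constructor rather than via `reindex`, so that the unused element is present by construction; the variable names are illustrative only.

```python
import pandas as pd

# Level 1 declares "not a key", but no code points at it (position 2 is unused).
mi2 = pd.MultiIndex(
    levels=[[0, 1, 2], ["A", "B", "not a key"]],
    codes=[[0, 0, 1, 1, 2, 2], [0, 1, 0, 1, 0, 1]],
)

declared = list(mi2.levels[1])                      # all level-1 elements
observed = list(mi2.get_level_values(1).unique())   # labels that actually occur
unused = [label for label in declared if label not in observed]

print(declared)
print(observed)
print(unused)

# remove_unused_levels() drops elements that no code refers to.
trimmed = mi2.remove_unused_levels()
print(list(trimmed.levels[1]))
```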

This allows for complex vector product or scalar broadcasting of level slices without having to explicitly check that every part of your level slice will be accepted (and therefore being forced to remove any that would not be before using loc, to avoid KeyErrors).
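For comparison, the explicit check that this proposal aims to make unnecessary might look like the sketch below. The names `wanted`, `present`, and `accepted` are illustrative, and the filter uses the labels observed in level 1 rather than the declared level elements.

```python
import pandas as pd

mi = pd.MultiIndex.from_product([range(3), ["A", "B"]])
ser = pd.Series(len(mi), index=mi)

wanted = ["A", "B", "not a key"]

# Keep only the labels that actually occur in level 1 before indexing.
present = set(ser.index.get_level_values(1))
accepted = [label for label in wanted if label in present]

result = ser.loc[1, accepted]
print(accepted)
print(result)
```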

@jbrockmendel
Member Author

@attack68 thanks for the link, I'll work my way through the thread today. I take it this issue is redundant and can be closed?

@attack68
Contributor

> @attack68 thanks for the link, I'll work my way through the thread today. I take it this issue is redundant and can be closed?

Yes, I believe it to be a duplicate, with this and other cases under consideration elsewhere.
