BUG: DataFrame/Series.loc improperly allows lookups of boolean labels/slices #20432
Comments
yeah I agree slice should certainly raise, prob But I think making all of these TypeError is prob reasonable. I wouldn't cast. |
Looks like these raise sensible errors on master now. Could use a test
|
I'm going to give this a shot! 😄 |
take |
Any suggestions on how to go about this? @mroeschke @MarcoGorelli
I noticed some files related to code - pandas/core/indexing.py tests: I still have to understand the existing tests and then write the new ones. I'm pretty new to Python 😅 Let me know if this is too big an issue to pick up. I can pick up something smaller in that case.
PS: I'm still stuck at my machine setup. I tried the VS Code Remote Containers setup but it didn't work out. I think the container was killed for running out of memory, since I have set limits in Docker Desktop (around 1GB RAM). I plan to set up the whole thing locally instead. Is that okay? I hope it doesn't affect any of the other Python-related software and tools I use, given that a virtualenv should keep things sandboxed (I'm assuming; Python newbie here) |
Yup, that's what I've done, I have a |
I have set up the environment. I tried to run the tests, but there were tons, so I stopped partway through. Can someone help me understand what kind of test is needed here? Are we looking for tests that assert for these errors?
In [2]: s = pd.Series(list('abcde'), pd.timedelta_range(0, 4, freq='ns'))

In [3]: s.loc[True]
KeyError: True

In [4]: s.loc[False:True]
TypeError: Unexpected type for 'value': <class 'bool'> |
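One way to read the request, as a rough sketch rather than the test that was eventually merged: check that both a boolean label and a boolean slice are rejected on a non-boolean index. The helper name `lookup_outcome` is made up for illustration, and accepting either `KeyError` or `TypeError` is an assumption based on the discussion above. In the pandas test suite this would more likely be written with `pytest.raises`.

```python
import pandas as pd


def lookup_outcome(obj, key):
    """Return the name of the exception raised by obj.loc[key], or "ok".

    Hypothetical helper for illustration; not part of pandas.
    """
    try:
        obj.loc[key]
    except Exception as err:  # broad on purpose: we only record what happened
        return type(err).__name__
    return "ok"


# Same Series as in the session above: a TimedeltaIndex with nanosecond steps.
s = pd.Series(list("abcde"), pd.timedelta_range(0, 4, freq="ns"))

# A label that really is in the index should still be found.
valid_outcome = lookup_outcome(s, s.index[0])

# Boolean label and boolean slice: the cases this issue is about.
scalar_outcome = lookup_outcome(s, True)
slice_outcome = lookup_outcome(s, slice(False, True))

print(valid_outcome, scalar_outcome, slice_outcome)
```

Recording the exception name instead of asserting one specific type keeps the sketch usable while the choice between `KeyError` and `TypeError` is still being discussed.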
I guess I'll have to first learn a bit more about the existing tests. I'll do that and get back here. I'll look for any suggestions over here, please do post if you have any |
Since I'm not working on this anymore, I'll leave it for others to pick it up. 🙈 |
take |
Seems like this is still not working for
>>> import pandas as pd
>>> df = pd.DataFrame(range(4))
>>> df.index = pd.interval_range(0, periods=4)
>>> df.loc[True]
0 0
Name: (0, 1], dtype: int64
Same as for the regular
df = pd.DataFrame(range(4))
df.index = pd.Index([0, 1, 2, 3], dtype=object)
df.loc[True]
0 1
Name: 1, dtype: int64
Do these cases need separate issues? |
I don't think so, provided the same fix can handle both cases. Could open another issue if you find out one or the other needs some special attention for some reason. |
Code Sample, a copy-pastable example if possible
Basic example of the issue, specific to TimedeltaIndex, xref #20408 (comment)
Indexing with both boolean labels and slices was successful, which doesn't seem right.
I investigated this same behavior across various index types for both Series and DataFrame, and produced the summary below.
Summary
Code to produce summary
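The original summary code is not reproduced in this thread. As a hedged sketch of how such a survey could be put together, the snippet below tries `loc[True]` and `loc[False:True]` on a Series and a DataFrame for a handful of index types and tabulates what happens. The helper `lookup_outcome`, the column names, and the particular set of index types (including the IntervalIndex and object-dtype cases raised in the comments) are assumptions chosen for illustration, and the outcomes will depend on the pandas version in use.

```python
import pandas as pd


def lookup_outcome(obj, key):
    """Return the name of the exception raised by obj.loc[key], or "ok".

    Hypothetical helper for illustration; not part of pandas.
    """
    try:
        obj.loc[key]
    except Exception as err:  # broad on purpose: we only record what happened
        return type(err).__name__
    return "ok"


# Index types to survey; this selection is an assumption, not the original list.
indexes = {
    "RangeIndex": pd.RangeIndex(4),
    "object Index": pd.Index([0, 1, 2, 3], dtype=object),
    "DatetimeIndex": pd.date_range("2018-01-01", periods=4),
    "TimedeltaIndex": pd.timedelta_range(0, periods=4, freq="ns"),
    "IntervalIndex": pd.interval_range(0, periods=4),
}

rows = []
for name, index in indexes.items():
    containers = (
        ("Series", pd.Series(range(4), index=index)),
        ("DataFrame", pd.DataFrame({"a": range(4)}, index=index)),
    )
    for kind, obj in containers:
        rows.append(
            {
                "index": name,
                "object": kind,
                "loc[True]": lookup_outcome(obj, True),
                "loc[False:True]": lookup_outcome(obj, slice(False, True)),
            }
        )

# One row per (index type, container) pair, one column per lookup.
summary = pd.DataFrame(rows)
print(summary.to_string(index=False))
```

Any cell that reads "ok" marks a combination where the boolean lookup was silently accepted, which is the behavior this issue questions.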
Expected Output
I'd generally expect all of these operations to raise a KeyError, with a couple potential exceptions:
object dtype Index?