BUG: df.loc[[x], :] fails if df has zero rows #41170
Comments
Hi, thanks for your report. I think both cases should raise a KeyError not return an empty DataFrame. But we are not as consistent as we would like with MultiIndexes.
We are running through pandas/pandas/core/indexing.py Line 1269 in 88ce933
hmm yeah i think we did remove that (or thought)
Thanks, will check what impact this would have, then we can decide if we want to remove now or with 2.0
AFAICT raise_missing only affects the message that goes with the KeyError, not whether an exception is raised at all. Am I reading that wrong?
Yes you are right. I've only read the docstring, which says
so I figured this would affect the actual error thrown. Should we remove this keyword and always raise with the same message? If we want to wait with this, I would update the docstring to reflect the actual behavior
No real opinion here, will trust your judgement. I'll be happy with just about anything that simplifies the code.
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
Complete Output of Code Sample
Problem description
In both cases, df.loc[df.A...] returns a dataframe that doesn't contain any rows with an index value of "b". Accordingly, in the first case the result of .loc[["b"], :] is an empty dataframe, but in the second case a ValueError is raised. The difference between the cases is that in the first case df.loc[df.A...] returns a dataframe with some rows (though none with index value "b"), while in the second case df.loc[df.A...] returns a dataframe with zero rows. I think that shouldn't make a difference.
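The two cases can be probed with a minimal sketch. It assumes a two-level MultiIndex and an integer column A; the index labels, level names, and values are made up for illustration and are not taken from the report. Because the outcome depends on the pandas version, the hypothetical helper probe only reports what happens instead of asserting a particular result.

```python
import pandas as pd

# Hypothetical data: a two-level MultiIndex with no first-level label "b".
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("c", 1)], names=["key", "n"]
)
df = pd.DataFrame({"A": [10, 20, 30]}, index=index)


def probe(frame):
    """Describe what .loc[["b"], :] does on `frame` without assuming the outcome."""
    try:
        result = frame.loc[["b"], :]
    except Exception as exc:  # deliberately broad: this is only a probe
        return type(exc).__name__
    return f"DataFrame with {len(result)} rows"


some_rows = df.loc[df.A < 30, :]  # two rows, none with index value "b"
zero_rows = df.loc[df.A < 10, :]  # no rows at all

print("some rows:", probe(some_rows))
print("zero rows:", probe(zero_rows))
```

If the two printed lines differ, the lookup treats an empty frame differently from a non-empty frame that simply lacks the label.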
In the original code, .loc[df.A...] and .loc[["b"], :] are not directly combined in one expression, but the first one creates a selection of rows of the dataframe, this selection is processed further, and during this another expression uses the second .loc.
The traceback looks very similar to the one in #40235. Maybe both bugs have a common root cause.
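For downstream code that has to tolerate an empty selection regardless of how the label lookup behaves, one possible workaround is to select by membership test instead of by label list. This is only a sketch under assumptions: the data and the helper name select_first_level are hypothetical, and it presumes the label of interest lives in the first index level.

```python
import pandas as pd

# Hypothetical data, as before: no first-level label "b".
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("c", 1)], names=["key", "n"]
)
df = pd.DataFrame({"A": [10, 20, 30]}, index=index)


def select_first_level(frame, labels):
    """Return the rows whose first index level is in `labels`.

    A boolean mask never raises for missing labels, so an empty or
    label-free frame simply yields an empty result.
    """
    mask = frame.index.get_level_values(0).isin(labels)
    return frame.loc[mask, :]


empty_selection = df.loc[df.A < 10, :]  # zero rows

print(len(select_first_level(empty_selection, ["b"])))  # 0
print(len(select_first_level(df, ["a"])))               # 2
```

The trade-off is that a mistyped label silently produces an empty frame rather than an error, so this fits only where a missing label is an expected condition.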
Expected Output
df.loc[df.A < 10, :].loc[["b"], :] returns an empty dataframe like df.loc[df.A < 30, :].loc[["b"], :] does.
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 2cb9652
python : 3.9.4.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.18362
machine : AMD64
...
pandas : 1.2.4
numpy : 1.20.2
pytz : 2021.1
dateutil : 2.8.1
...