BUG: Series/DataFrame.rank returns empty object on failure #40418
Comments
@rhshadrach I would like to take this one.
Sounds great @iamrajhans. It's not clear to me what the desired behavior here is. I would recommend looking into the behavior of |
@rhshadrach so I checked
but in the case of |
Thanks @iamrajhans, it seems to me there are two separate APIs for reducer vs transformer:
The only exception I see to this (though I may be missing others) is rank, which is a transformer whose API fits in the reducer camp. I don't see any advantage to having two different APIs for reducers and transformers; it would be nice if they had more in common. One step in this direction that I think would be non-controversial is to add a numeric_only argument to the rest of the transformers. Another question is whether returning an empty DataFrame (for both transformers and reducers) is the desired behavior. I would personally expect a TypeError to be raised instead, but I wonder whether there are other views, or whether other parts of pandas depend on an empty object being returned.
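To make the TypeError suggestion concrete, here is a minimal sketch of what that behavior could look like from user code. The wrapper name rank_strict and its error message are hypothetical and not part of pandas; it only relies on DataFrame.select_dtypes and DataFrame.rank, and it is one possible reading of the proposal rather than the agreed design.

```python
import pandas as pd


def rank_strict(df: pd.DataFrame, numeric_only: bool = False) -> pd.DataFrame:
    """Hypothetical wrapper (not a pandas API): rank, but never return an empty frame.

    With numeric_only=True, a frame with no numeric columns raises TypeError
    instead of silently producing an empty result.
    """
    if numeric_only:
        numeric = df.select_dtypes(include="number")
        if numeric.shape[1] == 0:
            # Assumption: TypeError is the desired signal, as suggested in the comment.
            raise TypeError("rank: no numeric columns to rank")
        return numeric.rank()
    # Rank every column as-is; pandas decides how non-numeric data is handled.
    return df.rank(numeric_only=False)


if __name__ == "__main__":
    mixed = pd.DataFrame({"a": ["x", "y"], "b": [2.0, 1.0]})
    print(rank_strict(mixed, numeric_only=True))  # only column "b" is ranked

    try:
        rank_strict(pd.DataFrame({"a": ["x", "y"]}), numeric_only=True)
    except TypeError as err:
        print(f"TypeError: {err}")
```

Raising early keeps the caller from having to inspect the shape of the result to find out whether anything was actually ranked.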
No longer relevant now that |
This was noticed when working on #40288
Example:
The last two lines are an empty DataFrame and empty Series respectively.
The docstring for numeric_only says:
The current behavior with numeric_only=None (the default value) is:
When numeric_only is True and the Series/DataFrame contains no numeric columns, rank then operates on an empty object, returning an empty result.
This is causing issues when rank is used in transform lists and dictionaries. Namely, we'd like to have partial failure for TypeErrors, but rank returns an empty result instead of raising a TypeError. This could be special-cased, but it seems to me that returning an empty object (at least when numeric_only is not True) is undesirable in itself.
Some options I see are: