ENH: No numeric_only argument for pandas.core.groupby.GroupBy.rank() #44438
Comments
I think I understand what you want, but to be sure, could you provide a sample data set and describe what you are trying to accomplish? Perhaps provide two tables (input and output). You are missing one step: you should specify what to do with the grouped values, for example whether to take the max or the min value of each group.
Code below for your case:
Hope this clarifies things a bit. P.S. you can also specify
Basically I want to know how a team ranked for each statistic in a given season.
Then going out I would like the table to look like this:
So the 1947 Braves finished first in W, IP, and WAR. My plan is to make a new column which is the average of all the other rankings and then sort by the average to see who was the most "dominant" in those areas. Would doing
I did find a workaround by doing
ranked = stats.set_index(['yearID', 'teamID'])
ranked = ranked.groupby('yearID').rank(ascending=False).reset_index()
and that yields the same as above.
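For readers who want to try the workaround above, here is a minimal, self-contained sketch. The team IDs and numbers are invented for illustration (only the column names yearID, teamID, W, and IP come from the thread), and the avg_rank column is a hypothetical name for the planned average of the rankings.

```python
import pandas as pd

# Hypothetical sample data: column names follow the thread, values are invented.
stats = pd.DataFrame({
    "yearID": [1947, 1947, 1948, 1948],
    "teamID": ["AAA", "BBB", "AAA", "BBB"],
    "W": [86, 94, 91, 84],
    "IP": [1360.0, 1375.0, 1390.0, 1380.0],
})

# Move the non-statistic columns into the index so only numeric columns get ranked.
ranked = stats.set_index(["yearID", "teamID"])

# Rank within each season; ascending=False gives rank 1 to the largest value.
ranked = ranked.groupby("yearID").rank(ascending=False).reset_index()

# Hypothetical follow-up: average the per-statistic ranks and sort by that average.
ranked["avg_rank"] = ranked[["W", "IP"]].mean(axis=1)
ranked = ranked.sort_values(["yearID", "avg_rank"]).reset_index(drop=True)

print(ranked)
```

Because teamID sits in the index while ranking, it never reaches rank() and does not have to be reassigned afterwards.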
I see. Your workaround looks fine. Could I get assigned to this issue after someone else also confirms it?
Okay, thank you for the help!
pandas.core.groupby.GroupBy.rank does not have a numeric_only argument like DataFrame.rank()
I have a DataFrame with several statistics from baseball teams across different years. I want to rank each team in each statistic, grouped by season. Every column is numeric except for the teamID column, which is an object type containing the name of each team as a string. My code looks something like this:
Since it is GroupBy.rank(), I can't pass the numeric_only argument, and that means I have to reassign ranked['teamID'] to the original column. I also cannot do because that would give everybody a rank of 1.
Is there a reason that numeric_only is included in DataFrame.rank() but not GroupBy.rank()? Could it be added to GroupBy.rank()? Then I could code it like this, which would be easier.
I am just a hobbyist and so I don't know much about the implementation of these methods which means there may be something I am completely ignoring, or another more efficient way to do it. If so I would appreciate some enlightenment about what I am missing. Thanks!
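One possible alternative that avoids both the reassignment and the index round trip is to pick the numeric columns explicitly before ranking. This is only a sketch under assumptions: the data is invented, and num_cols is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Hypothetical sample data: column names follow the issue, values are invented.
stats = pd.DataFrame({
    "yearID": [1947, 1947, 1948, 1948],
    "teamID": ["AAA", "BBB", "AAA", "BBB"],
    "W": [86, 94, 91, 84],
    "IP": [1360.0, 1375.0, 1390.0, 1380.0],
})

# Numeric columns to rank, excluding the grouping key itself.
num_cols = stats.select_dtypes(include="number").columns.drop("yearID")

# Rank only the selected columns within each season.
ranks = stats.groupby("yearID")[list(num_cols)].rank(ascending=False)

# The ranks keep the original row index, so they line up with the ID columns.
ranked = stats[["yearID", "teamID"]].join(ranks)

print(ranked)
```

Selecting the columns on the GroupBy object keeps teamID out of the ranking step, which is roughly the effect a numeric_only option would have.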