DOC: Adding examples to DataFrameGroupBy.rank #38972 #42402
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2641,7 +2641,6 @@ def cumcount(self, ascending: bool = True): | |
|
||
@final | ||
@Substitution(name="groupby") | ||
@Appender(_common_see_also) | ||
def rank( | ||
self, | ||
method: str = "average", | ||
|
@@ -2675,6 +2674,51 @@ def rank( | |
Returns | ||
------- | ||
DataFrame with ranking of values within each group | ||
|
||
See Also | ||
-------- | ||
Series.groupby : Apply a function groupby to a Series. | ||
DataFrame.groupby : Apply a function groupby | ||
to each row or column of a DataFrame. | ||
Series.rank : Apply a function rank to a Series. | ||
DataFrame.rank : Apply a function rank | ||
Based on the screenshot you posted, it looks like this doesn't render as a link, so it's not that useful in its current form. I think it's best to keep the scope small and remove the changes to the See Also (which could then be tackled as part of #42406 if you're interested!).
reverted the change on
||
to each row or column of a DataFrame. | ||
|
||
Examples | ||
-------- | ||
>>> df = pd.DataFrame({'group': ['a', 'a', 'a', 'b', | ||
... 'a', 'b', 'b', 'b', 'b', 'a'], | ||
... 'value': [.2, .4, .2, 0.01, | ||
... .3, .11, .21, .4, .01, 0.2]}) | ||
>>> df | ||
group value | ||
0 a 0.20 | ||
1 a 0.40 | ||
2 a 0.20 | ||
3 b 0.01 | ||
4 a 0.30 | ||
5 b 0.11 | ||
6 b 0.21 | ||
7 b 0.40 | ||
8 b 0.01 | ||
9 a 0.20 | ||
I think it would be easier to see how the different groups are treated if the groups were contiguous, e.g. a, a, a, a, ..., b, b, b, b.
I also think it would be clearer to have fewer distinct values (and maybe use ints instead of floats, with values that make it easy to tell at a glance which is smallest, largest, etc.).
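As a rough sketch of what these two suggestions could look like together, the frame below keeps each group contiguous and uses small integers with a couple of ties. The specific values are an assumption chosen for illustration, not something taken from the PR.

```python
import pandas as pd

# Hypothetical example data: contiguous groups, small ints, a few ties.
df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"],
        "value": [2, 4, 2, 3, 5, 1, 2, 4, 1, 5],
    }
)

# Rank values within each group; ties share the average of their positions.
ranks = df.groupby("group")["value"].rank(method="average")
print(ranks.tolist())
```

With integer values the ties (the two 2s in group a, the two 1s in group b) are easy to spot, and each group's ranks sit in one block of rows.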
||
>>> df['average_rank'] = df.groupby('group')['value'].rank('average') | ||
>>> df['min_rank'] = df.groupby('group')['value'].rank('min') | ||
>>> df['max_rank'] = df.groupby('group')['value'].rank('max') | ||
>>> df['dense_rank'] = df.groupby('group')['value'].rank('dense') | ||
>>> df['first_rank'] = df.groupby('group')['value'].rank('first') | ||
This might be clearer as is, but could be written more concisely along the lines of
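The snippet that accompanied this comment is not reproduced here. One possible concise form, offered only as an assumption about the general idea, is to loop over the method names instead of writing one assignment per method:

```python
import pandas as pd

# Same data as the docstring example in this diff.
df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "a", "b", "b", "b", "b", "a"],
        "value": [0.2, 0.4, 0.2, 0.01, 0.3, 0.11, 0.21, 0.4, 0.01, 0.2],
    }
)

# Add one "<method>_rank" column per tie-breaking method.
for method in ["average", "min", "max", "dense", "first"]:
    df[f"{method}_rank"] = df.groupby("group")["value"].rank(method=method)

print(df)
```

This yields the same five rank columns as the explicit assignments; whether the loop or the spelled-out version reads better in a docstring is a judgment call.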
|
||
>>> df | ||
group value average_rank min_rank max_rank dense_rank first_rank | ||
0 a 0.20 2.0 1.0 3.0 1.0 1.0 | ||
1 a 0.40 5.0 5.0 5.0 3.0 5.0 | ||
2 a 0.20 2.0 1.0 3.0 1.0 2.0 | ||
3 b 0.01 1.5 1.0 2.0 1.0 1.0 | ||
4 a 0.30 4.0 4.0 4.0 2.0 4.0 | ||
5 b 0.11 3.0 3.0 3.0 2.0 3.0 | ||
6 b 0.21 4.0 4.0 4.0 3.0 4.0 | ||
7 b 0.40 5.0 5.0 5.0 4.0 5.0 | ||
8 b 0.01 1.5 1.0 2.0 1.0 2.0 | ||
9 a 0.20 2.0 1.0 3.0 1.0 3.0 | ||
""" | ||
if na_option not in {"keep", "top", "bottom"}: | ||
msg = "na_option must be one of 'keep', 'top', or 'bottom'" | ||
|
I know that other groupby docs include Series/DataFrame .groupby in the See Also, but IMO they're not helpful (especially since they don't link to anything).
Have opened #42406 for this