-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC: Adding examples to DataFrameGroupBy.rank #38972 #42402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
debnathshoham
commented
Jul 6, 2021
•
edited
Loading
edited
- closes pandas.core.groupby.DataFrameGroupBy.rank() dense_rank does not work #38972
- tests added / passed
- Ensure all linting tests pass, see here for how to run them
- whatsnew entry
Hello @debnathshoham! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-07-11 19:41:09 UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pr @debnathshoham! Generally looks like a great improvement, some comments. Can you also post a screenshot of the new built docs to make sure everything looks as expected (see https://pandas.pydata.org/docs/development/contributing_documentation.html#how-to-build-the-pandas-documentation)
pandas/core/groupby/groupby.py
Outdated
|
||
See Also | ||
-------- | ||
Series.groupby : Apply a function groupby to a Series. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that other groupby docs include Series/DataFrame .groupby in the See Also, but IMO they're not helpful (especially since they don't link to anything).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have opened #42406 for this
pandas/core/groupby/groupby.py
Outdated
6 b 0.21 | ||
7 b 0.40 | ||
8 b 0.01 | ||
9 a 0.20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the example would be easier to see how different groups are treated if groups are contiguous, eg a, a, a, a...b, b, b, b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also think it would be clearer to have fewer distinct values (and maybe use ints instead of floats, with values that are easy to tell at a glance what is smallest, largest, etc
pandas/core/groupby/groupby.py
Outdated
>>> df['min_rank'] = df.groupby('group')['value'].rank('min') | ||
>>> df['max_rank'] = df.groupby('group')['value'].rank('max') | ||
>>> df['dense_rank'] = df.groupby('group')['value'].rank('dense') | ||
>>> df['first_rank'] = df.groupby('group')['value'].rank('first') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be clearer as is, but could be written more concisely along the lines of
for method in ["average", ..., "first"]:
df[f"{method}_rank"] = df.groupby("group")["value"].rank(method)
Hi @mzeitlin11 , just made the suggested changes. I have also attached how it looks now. Please let me know if this looks fine. |
please review |
pandas/core/groupby/groupby.py
Outdated
DataFrame.groupby : Apply a function groupby | ||
to each row or column of a DataFrame. | ||
Series.rank : Apply a function rank to a Series. | ||
DataFrame.rank : Apply a function rank |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the screenshot you posted, looks like this doesn't render as a link, so not that useful in current form. I think best to keep scope small and remove changes to the See Also (which could then be tackled as part of #42406 if you're interested!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverted the change on See Also
pandas/core/groupby/groupby.py
Outdated
-------- | ||
>>> df = pd.DataFrame({'group': ['a', 'a', 'a', 'a', | ||
... 'a', 'b', 'b', 'b', 'b', 'b'], | ||
... 'value': [2, 4, 2, 3, 5, 1, 2, 4, 1, 5]}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The formatting of this looks a bit awkward, can you use black
on this (or at least would be clearer if a's were all on same line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated with black
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for later review, |
@debnathshoham of note: The docstring validation is failing
|
thanks @mroeschke . I believe it would be fixed now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @debnathshoham!
Thanks @debnathshoham |