Skip to content

DOC: Adding examples to DataFrameGroupBy.rank #38972 #42402

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jul 12, 2021

Conversation

debnathshoham
Copy link
Member

@debnathshoham debnathshoham commented Jul 6, 2021

@pep8speaks
Copy link

pep8speaks commented Jul 6, 2021

Hello @debnathshoham! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-07-11 19:41:09 UTC

Copy link
Member

@mzeitlin11 mzeitlin11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pr @debnathshoham! Generally looks like a great improvement, some comments. Can you also post a screenshot of the new built docs to make sure everything looks as expected (see https://pandas.pydata.org/docs/development/contributing_documentation.html#how-to-build-the-pandas-documentation)


See Also
--------
Series.groupby : Apply a function groupby to a Series.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that other groupby docs include Series/DataFrame .groupby in the See Also, but IMO they're not helpful (especially since they don't link to anything).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have opened #42406 for this

6 b 0.21
7 b 0.40
8 b 0.01
9 a 0.20
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the example would be easier to see how different groups are treated if groups are contiguous, eg a, a, a, a...b, b, b, b

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also think it would be clearer to have fewer distinct values (and maybe use ints instead of floats, with values that are easy to tell at a glance what is smallest, largest, etc

>>> df['min_rank'] = df.groupby('group')['value'].rank('min')
>>> df['max_rank'] = df.groupby('group')['value'].rank('max')
>>> df['dense_rank'] = df.groupby('group')['value'].rank('dense')
>>> df['first_rank'] = df.groupby('group')['value'].rank('first')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be clearer as is, but could be written more concisely along the lines of

for method in ["average", ..., "first"]:
   df[f"{method}_rank"] = df.groupby("group")["value"].rank(method)

@mzeitlin11 mzeitlin11 added Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Docs Groupby labels Jul 6, 2021
@debnathshoham
Copy link
Member Author

Hi @mzeitlin11 , just made the suggested changes. I have also attached how it looks now. Please let me know if this looks fine.
Screenshot 2021-07-06 at 9 32 34 PM

@debnathshoham
Copy link
Member Author

please review

DataFrame.groupby : Apply a function groupby
to each row or column of a DataFrame.
Series.rank : Apply a function rank to a Series.
DataFrame.rank : Apply a function rank
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the screenshot you posted, looks like this doesn't render as a link, so not that useful in current form. I think best to keep scope small and remove changes to the See Also (which could then be tackled as part of #42406 if you're interested!).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted the change on See Also

--------
>>> df = pd.DataFrame({'group': ['a', 'a', 'a', 'a',
... 'a', 'b', 'b', 'b', 'b', 'b'],
... 'value': [2, 4, 2, 3, 5, 1, 2, 4, 1, 5]})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The formatting of this looks a bit awkward, can you use black on this (or at least would be clearer if a's were all on same line

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated with black

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how it looks now.

Screenshot 2021-07-11 at 12 46 58 PM

@mzeitlin11
Copy link
Member

mzeitlin11 commented Jul 9, 2021

please review

Sorry for later review, pandas is volunteer, so review may sometimes take a while :) Left some small comments, otherwise LGTM!

@mroeschke
Copy link
Member

@debnathshoham of note: The docstring validation is failing

Error: /home/runner/work/pandas/pandas/pandas/core/groupby/groupby.py:2638:GL07:pandas.core.groupby.DataFrameGroupBy.rank:Sections are in the wrong order. Correct order is: Parameters, Returns, See Also, Examples

@debnathshoham
Copy link
Member Author

thanks @mroeschke . I believe it would be fixed now.

@debnathshoham debnathshoham requested a review from mzeitlin11 July 12, 2021 18:04
Copy link
Member

@mzeitlin11 mzeitlin11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @debnathshoham!

@mroeschke mroeschke added this to the 1.4 milestone Jul 12, 2021
@mroeschke mroeschke merged commit 4c90215 into pandas-dev:master Jul 12, 2021
@mroeschke
Copy link
Member

Thanks @debnathshoham

@debnathshoham debnathshoham deleted the groupbyrank-doc branch July 12, 2021 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Docs Groupby
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pandas.core.groupby.DataFrameGroupBy.rank() dense_rank does not work
4 participants