Commit 94c8c94

EdAbati authored and WillAyd committed
DOC: Improvement docstring of DataFrame.rank() (#25328)
1 parent 7120725 commit 94c8c94

1 file changed: +70 -19 lines changed

pandas/core/generic.py

@@ -8334,34 +8334,85 @@ def last(self, offset):
     def rank(self, axis=0, method='average', numeric_only=None,
              na_option='keep', ascending=True, pct=False):
         """
-        Compute numerical data ranks (1 through n) along axis. Equal values are
-        assigned a rank that is the average of the ranks of those values.
+        Compute numerical data ranks (1 through n) along axis.
+
+        By default, equal values are assigned a rank that is the average of the
+        ranks of those values.
 
         Parameters
         ----------
         axis : {0 or 'index', 1 or 'columns'}, default 0
-            index to direct ranking
-        method : {'average', 'min', 'max', 'first', 'dense'}
-            * average: average rank of group
-            * min: lowest rank in group
-            * max: highest rank in group
+            Index to direct ranking.
+        method : {'average', 'min', 'max', 'first', 'dense'}, default 'average'
+            How to rank the group of records that have the same value
+            (i.e. ties):
+
+            * average: average rank of the group
+            * min: lowest rank in the group
+            * max: highest rank in the group
             * first: ranks assigned in order they appear in the array
             * dense: like 'min', but rank always increases by 1 between groups
-        numeric_only : boolean, default None
-            Include only float, int, boolean data. Valid only for DataFrame or
-            Panel objects
-        na_option : {'keep', 'top', 'bottom'}
-            * keep: leave NA values where they are
-            * top: smallest rank if ascending
-            * bottom: smallest rank if descending
-        ascending : boolean, default True
-            False for ranks by high (1) to low (N)
-        pct : boolean, default False
-            Computes percentage rank of data
+        numeric_only : bool, optional
+            For DataFrame objects, rank only numeric columns if set to True.
+        na_option : {'keep', 'top', 'bottom'}, default 'keep'
+            How to rank NaN values:
+
+            * keep: assign NaN rank to NaN values
+            * top: assign smallest rank to NaN values if ascending
+            * bottom: assign highest rank to NaN values if ascending
+        ascending : bool, default True
+            Whether or not the elements should be ranked in ascending order.
+        pct : bool, default False
+            Whether or not to display the returned rankings in percentile
+            form.
 
         Returns
         -------
-        ranks : same type as caller
+        same type as caller
+            Return a Series or DataFrame with data ranks as values.
+
+        See Also
+        --------
+        core.groupby.GroupBy.rank : Rank of values within each group.
+
+        Examples
+        --------
+
+        >>> df = pd.DataFrame(data={'Animal': ['cat', 'penguin', 'dog',
+        ...                                    'spider', 'snake'],
+        ...                         'Number_legs': [4, 2, 4, 8, np.nan]})
+        >>> df
+            Animal  Number_legs
+        0      cat          4.0
+        1  penguin          2.0
+        2      dog          4.0
+        3   spider          8.0
+        4    snake          NaN
+
+        The following example shows how the method behaves with the above
+        parameters:
+
+        * default_rank: this is the default behaviour obtained without using
+          any parameter.
+        * max_rank: setting ``method = 'max'`` the records that have the
+          same values are ranked using the highest rank (e.g.: since 'cat'
+          and 'dog' are both in the 2nd and 3rd position, rank 3 is assigned.)
+        * NA_bottom: choosing ``na_option = 'bottom'``, if there are records
+          with NaN values they are placed at the bottom of the ranking.
+        * pct_rank: when setting ``pct = True``, the ranking is expressed as
+          percentile rank.
+
+        >>> df['default_rank'] = df['Number_legs'].rank()
+        >>> df['max_rank'] = df['Number_legs'].rank(method='max')
+        >>> df['NA_bottom'] = df['Number_legs'].rank(na_option='bottom')
+        >>> df['pct_rank'] = df['Number_legs'].rank(pct=True)
+        >>> df
+            Animal  Number_legs  default_rank  max_rank  NA_bottom  pct_rank
+        0      cat          4.0           2.5       3.0        2.5     0.625
+        1  penguin          2.0           1.0       1.0        1.0     0.250
+        2      dog          4.0           2.5       3.0        2.5     0.625
+        3   spider          8.0           4.0       4.0        4.0     1.000
+        4    snake          NaN           NaN       NaN        5.0       NaN
         """
         axis = self._get_axis_number(axis)
 

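The new docstring example covers the default, ``method='max'``, ``na_option='bottom'`` and ``pct=True``. As a supplementary sketch that is not part of this commit, the snippet below runs the remaining documented options on the same ``Number_legs`` values, so all five tie-breaking methods can be compared side by side. The variable names (``legs``, ``by_method``, ``na_top``, ``descending``, ``pct``) are illustrative only, and it assumes a pandas version in which ``Series.rank`` accepts these keyword arguments.

```python
import numpy as np
import pandas as pd

# Same values as the Number_legs column in the docstring example.
legs = pd.Series([4, 2, 4, 8, np.nan],
                 index=['cat', 'penguin', 'dog', 'spider', 'snake'])

# One column per tie-breaking method; 'cat' and 'dog' are the tied pair.
methods = ['average', 'min', 'max', 'first', 'dense']
by_method = pd.DataFrame({m: legs.rank(method=m) for m in methods})
print(by_method)

# With the default ascending order, na_option='top' gives NaN the
# smallest rank, so every other rank shifts up by one.
na_top = legs.rank(na_option='top')
print(na_top)

# ascending=False ranks the largest value first; NaN stays NaN because
# na_option defaults to 'keep'.
descending = legs.rank(ascending=False)
print(descending)

# pct=True divides each rank by the number of non-NaN values (4 here).
pct = legs.rank(pct=True)
print(pct)
```

In this data the only difference between ``'min'`` and ``'dense'`` shows up for 'spider': ``'min'`` leaves a gap after the tie (rank 4), while ``'dense'`` continues with the next integer (rank 3).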