DOC: Fix Series nsmallest and nlargest docstring/doctests #22731
Merged: TomAugspurger merged 5 commits into pandas-dev:master from Moisan:docstring_nlargest_nsmallest on Sep 18, 2018
Commits
724610f DOC: Fix Series nsmallest and nlargest docstring/doctests
1af1280 Fix a typo in nsmallest doctest
5c881f9 Add quick descriptions in the doctests of Series.nlargest and Series.…
5d6d5ed Update nlargest and nsmallest docstring with backticks
7f311f9 Various changes to nlargest and nsmallest based on datapythonista review
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2743,17 +2743,20 @@ def nlargest(self, n=5, keep='first'): | |
|
||
Parameters | ||
---------- | ||
n : int | ||
Return this many descending sorted values | ||
keep : {'first', 'last'}, default 'first' | ||
Where there are duplicate values: | ||
- ``first`` : take the first occurrence. | ||
- ``last`` : take the last occurrence. | ||
n : int, default 5 | ||
Return this many descending sorted values. | ||
keep : {'first', 'last', 'all'}, default 'first' | ||
When there are duplicate values that cannot all fit in a | ||
Series of `n` elements: | ||
- ``first`` : take the first occurrences based on the index order | ||
- ``last`` : take the last occurrences based on the index order | ||
- ``all`` : keep all occurrences. This can result in a Series of | ||
size larger than `n`. | ||
|
||
Returns | ||
------- | ||
top_n : Series | ||
The n largest values in the Series, in sorted order | ||
Series | ||
The `n` largest values in the Series, sorted in decreasing order. | ||
|
||
Notes | ||
----- | ||
|
@@ -2762,23 +2765,70 @@ def nlargest(self, n=5, keep='first'): | |
|
||
See Also | ||
-------- | ||
Series.nsmallest | ||
Series.nsmallest: Get the `n` smallest elements. | ||
I'd add |
||
Series.sort_values: Sort Series by values. | ||
Series.head: Return the first `n` rows. | ||
|
||
Examples | ||
-------- | ||
>>> s = pd.Series(np.random.randn(10**6)) | ||
>>> s.nlargest(10) # only sorts up to the N requested | ||
219921 4.644710 | ||
82124 4.608745 | ||
421689 4.564644 | ||
425277 4.447014 | ||
718691 4.414137 | ||
43154 4.403520 | ||
283187 4.313922 | ||
595519 4.273635 | ||
503969 4.250236 | ||
121637 4.240952 | ||
dtype: float64 | ||
>>> countries_population = {"Italy": 59000000, "France": 65000000, | ||
... "Malta": 434000, "Maldives": 434000, | ||
... "Brunei": 434000, "Iceland": 337000, | ||
... "Nauru": 11300, "Tuvalu": 11300, | ||
... "Anguilla": 11300, "Monserat": 5200} | ||
>>> s = pd.Series(countries_population) | ||
>>> s | ||
Italy 59000000 | ||
France 65000000 | ||
Malta 434000 | ||
Maldives 434000 | ||
Brunei 434000 | ||
Iceland 337000 | ||
Nauru 11300 | ||
Tuvalu 11300 | ||
Anguilla 11300 | ||
Monserat 5200 | ||
dtype: int64 | ||
|
||
The `n` largest elements where ``n=5`` by default. | ||
|
||
>>> s.nlargest() | ||
France 65000000 | ||
Italy 59000000 | ||
Malta 434000 | ||
Maldives 434000 | ||
Brunei 434000 | ||
dtype: int64 | ||
|
||
The `n` largest elements where ``n=3``. Default `keep` value is 'first' | ||
so Malta will be kept. | ||
|
||
>>> s.nlargest(3) | ||
France 65000000 | ||
Italy 59000000 | ||
Malta 434000 | ||
dtype: int64 | ||
|
||
The `n` largest elements where ``n=3`` and keeping the last duplicates. | ||
Brunei will be kept since it is the last with value 434000 based on | ||
the index order. | ||
|
||
>>> s.nlargest(3, keep='last') | ||
France 65000000 | ||
Italy 59000000 | ||
Brunei 434000 | ||
dtype: int64 | ||
|
||
The `n` largest elements where ``n=3`` with all duplicates kept. Note | ||
that the returned Series has five elements due to the three duplicates. | ||
|
||
>>> s.nlargest(3, keep='all') | ||
France 65000000 | ||
Italy 59000000 | ||
Malta 434000 | ||
Maldives 434000 | ||
Brunei 434000 | ||
dtype: int64 | ||
""" | ||
return algorithms.SelectNSeries(self, n=n, keep=keep).nlargest() | ||
|
||
|
@@ -2788,17 +2838,20 @@ def nsmallest(self, n=5, keep='first'): | |
|
||
Parameters | ||
---------- | ||
n : int | ||
Return this many ascending sorted values | ||
keep : {'first', 'last'}, default 'first' | ||
Where there are duplicate values: | ||
- ``first`` : take the first occurrence. | ||
- ``last`` : take the last occurrence. | ||
n : int, default 5 | ||
Return this many ascending sorted values. | ||
keep : {'first', 'last', 'all'}, default 'first' | ||
When there are duplicate values that cannot all fit in a | ||
Series of `n` elements: | ||
- ``first`` : take the first occurrences based on the index order | ||
- ``last`` : take the last occurrences based on the index order | ||
- ``all`` : keep all occurrences. This can result in a Series of | ||
size larger than `n`. | ||
|
||
Returns | ||
------- | ||
bottom_n : Series | ||
The n smallest values in the Series, in sorted order | ||
Series | ||
The `n` smallest values in the Series, sorted in increasing order. | ||
|
||
Notes | ||
----- | ||
|
@@ -2807,23 +2860,69 @@ def nsmallest(self, n=5, keep='first'): | |
|
||
See Also | ||
-------- | ||
Series.nlargest | ||
Series.nlargest: Get the `n` largest elements. | ||
Series.sort_values: Sort Series by values. | ||
Series.head: Return the first `n` rows. | ||
|
||
Examples | ||
-------- | ||
>>> s = pd.Series(np.random.randn(10**6)) | ||
>>> s.nsmallest(10) # only sorts up to the N requested | ||
288532 -4.954580 | ||
732345 -4.835960 | ||
64803 -4.812550 | ||
446457 -4.609998 | ||
501225 -4.483945 | ||
669476 -4.472935 | ||
973615 -4.401699 | ||
621279 -4.355126 | ||
773916 -4.347355 | ||
359919 -4.331927 | ||
dtype: float64 | ||
>>> countries_population = {"Italy": 59000000, "France": 65000000, | ||
... "Brunei": 434000, "Malta": 434000, | ||
... "Maldives": 434000, "Iceland": 337000, | ||
... "Nauru": 11300, "Tuvalu": 11300, | ||
... "Anguilla": 11300, "Monserat": 5200} | ||
>>> s = pd.Series(countries_population) | ||
>>> s | ||
Italy 59000000 | ||
France 65000000 | ||
Brunei 434000 | ||
Malta 434000 | ||
Maldives 434000 | ||
Iceland 337000 | ||
Nauru 11300 | ||
Tuvalu 11300 | ||
Anguilla 11300 | ||
Monserat 5200 | ||
dtype: int64 | ||
|
||
The `n` smallest elements where ``n=5`` by default. | |
|
||
>>> s.nsmallest() | ||
Monserat 5200 | ||
Nauru 11300 | ||
Tuvalu 11300 | ||
Anguilla 11300 | ||
Iceland 337000 | ||
dtype: int64 | ||
|
||
The `n` smallest elements where ``n=3``. Default `keep` value is | ||
'first' so Nauru and Tuvalu will be kept. | ||
|
||
>>> s.nsmallest(3) | ||
Monserat 5200 | ||
Nauru 11300 | ||
Tuvalu 11300 | ||
dtype: int64 | ||
|
||
The `n` smallest elements where ``n=3`` and keeping the last | ||
duplicates. Anguilla and Tuvalu will be kept since they are the last | ||
with value 11300 based on the index order. | ||
|
||
>>> s.nsmallest(3, keep='last') | ||
Monserat 5200 | ||
Anguilla 11300 | ||
Tuvalu 11300 | ||
dtype: int64 | ||
|
||
The `n` smallest elements where ``n=3`` with all duplicates kept. Note | ||
that the returned Series has four elements due to the three duplicates. | ||
|
||
>>> s.nsmallest(3, keep='all') | ||
Monserat 5200 | ||
Nauru 11300 | ||
Tuvalu 11300 | ||
Anguilla 11300 | ||
dtype: int64 | ||
""" | ||
return algorithms.SelectNSeries(self, n=n, keep=keep).nsmallest() | ||
|
||
|
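The three `keep` options documented in the diff can be compared side by side. The following is a minimal sketch rather than part of the PR: it reuses a shortened version of the docstring's population data, and the variable names are illustrative only.

```python
import pandas as pd

# Shortened version of the docstring's example data; three countries tie at 434000.
population = pd.Series({
    "Italy": 59000000, "France": 65000000,
    "Malta": 434000, "Maldives": 434000, "Brunei": 434000,
    "Iceland": 337000,
})

# Default keep='first': ties are resolved by index order, earliest first.
first = population.nlargest(3)

# keep='last': the latest tied entry in index order wins the remaining slot.
last = population.nlargest(3, keep="last")

# keep='all': every tied entry is returned, so the result can exceed n.
everything = population.nlargest(3, keep="all")

print(list(first.index))       # ['France', 'Italy', 'Malta']
print(list(last.index))        # ['France', 'Italy', 'Brunei']
print(list(everything.index))  # ['France', 'Italy', 'Malta', 'Maldives', 'Brunei']
```

The same pattern applies to `nsmallest`, with ties resolved among the smallest values instead.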
Is the period here on the last bullet required to pass the docstring validation as-is? It shouldn't be necessary, but if that's the intent here, it's just something we should address separately. @datapythonista
I confirm that the validation fails if the last period is not present.
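For illustration, the kind of rule discussed above can be sketched as a check that the final non-empty line of a parameter description ends with a period. This is a hypothetical helper written for this discussion, not the pandas validation script; `last_line_ends_with_period` and `keep_description` are assumed names.

```python
def last_line_ends_with_period(description: str) -> bool:
    """Return True if the final non-empty line of the description ends with '.'.

    Hypothetical check for illustration only; the real docstring
    validation may apply different or additional rules.
    """
    lines = [line.strip() for line in description.splitlines() if line.strip()]
    return bool(lines) and lines[-1].endswith(".")


# The `keep` description from the diff, whose last bullet ends with a period.
keep_description = """\
When there are duplicate values that cannot all fit in a
Series of `n` elements:
- ``first`` : take the first occurrences based on the index order
- ``last`` : take the last occurrences based on the index order
- ``all`` : keep all occurrences. This can result in a Series of
  size larger than `n`.
"""

with_period = last_line_ends_with_period(keep_description)
# Strip the trailing newline and the final period to simulate the failing case.
without_period = last_line_ends_with_period(keep_description.rstrip().rstrip("."))

print(with_period)     # True
print(without_period)  # False
```

Under this simplified rule, only the last line matters, which matches the observation that the intermediate bullets in the diff carry no trailing period.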