DOC: Fix errors in pandas.Series.argmax #32019

Merged: 9 commits, Feb 26, 2020
23 changes: 22 additions & 1 deletion pandas/core/base.py
@@ -929,11 +929,17 @@ def argmax(self, axis=None, skipna=True, *args, **kwargs):
"""
Return an ndarray of the maximum argument indexer.
Member

Is this sentence accurate? It looks like we return an int, not an ndarray.

Member

(This is also relevant for the Returns section.)

Contributor Author

Do you think this is okay?

Return row position of the maximum argument indexer.

And also change for the Returns section:

Returns
-------
int
    row position of the maximum values.

Member

"maximum argument indexer" is awkward. How about "Returns int position of the largest value in the Series"?

Contributor Author

"maximum argument indexer" is awkward. how about "Returns int position of the largest value in the Series"

I think this one is easier to understand.
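
For reference, the return type discussed in this thread can be checked with a minimal sketch. The Series values and variable names here are illustrative, not taken from the PR, and depending on the pandas version the scalar may be a NumPy integer rather than a built-in int.

```python
import numpy as np
import pandas as pd

# Illustrative data (assumption, not from the PR)
s = pd.Series([10, 30, 20])

pos = s.argmax()

# The result is a single integer-like scalar, not an ndarray.
is_scalar_int = isinstance(pos, (int, np.integer))
is_array = isinstance(pos, np.ndarray)

print(pos, is_scalar_int, is_array)
```

This supports wording the summary and the Returns section in terms of an int position rather than an ndarray.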


If multiple values equal the maximum, the first row label with that
Member

If the maximum is achieved in multiple locations, the first such location is returned.

Contributor Author

Do you think we should go with location instead of row position?

Member

"row position" sounds good. Definitely not "row label".
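
The distinction between "row position" and "row label" can be illustrated with a small sketch, assuming a Series with a tied maximum. The data is made up for illustration, and `Series.idxmax` is used only as a contrast because it returns the label instead of the position.

```python
import pandas as pd

# Illustrative data with the maximum (5.0) appearing twice
s = pd.Series([1.0, 5.0, 4.0, 5.0], index=["A", "B", "C", "D"])

pos = s.argmax()    # integer position of the first maximum
label = s.idxmax()  # index label of the first maximum

print(pos, label)
```

With ties, both methods report the first occurrence, but only `argmax` reports it as a position.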

value is returned.

Parameters
----------
axis : {None}
    Dummy argument for consistency with Series.
skipna : bool, default True
    Exclude NA/null values when showing the result.
*args, **kwargs
    Additional arguments and keywords for compatibility with NumPy.

Returns
-------
@@ -942,7 +948,22 @@ def argmax(self, axis=None, skipna=True, *args, **kwargs):

See Also
--------
- numpy.ndarray.argmax
+ numpy.ndarray.argmax : Returns the indices of the maximum values along an axis.
Member

Do we have a pandas argmin that is worth adding here?

Contributor Author

See Also
--------
numpy.ndarray.argmax : Returns the indices of the maximum values along an axis.
pandas.Series.argmin : Return a ndarray of the minimum argument indexer.

Would this be sufficient?

Member

Looks good. I'm not sure if the pandas. prefix is needed; maybe you can check other docstrings and see.

And for the descriptions, it could make sense to say something like "Equivalent method for numpy arrays." and "Same, but returning the minimum". I think that would make it easier for users to understand what those methods do.

Contributor Author

Yes, I've noticed through scripts/validate_docstrings.py that I could safely omit pandas. to match the guidelines.

I agree with your proposed descriptions, as the current descriptions are redundant with each function's own description.
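
As a rough illustration of the relationships the proposed See Also entries describe ("Equivalent method for numpy arrays", "Same, but returning the minimum"), here is a small sketch. The data and variable names are assumptions for illustration only.

```python
import pandas as pd

# Illustrative data (assumption, not from the PR)
s = pd.Series([3.0, 9.0, 1.0])

max_pos = s.argmax()                # position of the largest value
min_pos = s.argmin()                # same idea, for the smallest value
np_max_pos = s.to_numpy().argmax()  # the NumPy array equivalent

print(max_pos, min_pos, np_max_pos)
```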


Examples
--------
>>> s = pd.Series(data=[1, None, 5, 4, 5],
...               index=['A', 'B', 'C', 'D', 'E'])
Member

Do you think you can find some data that looks more "real"? In Series.combine we've got a small Series with animal speeds that I think could be more appropriate to illustrate this: pd.Series({'falcon': 345.0, 'eagle': 200.0, 'duck': 30.0})

Also, it may be worth adding a note saying that the 2 means the largest value is in the third element, since positions are zero-indexed.

Contributor Author

Sure, I'm going to use a cereal calories dataset as an example and add that note.
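
To make the zero-indexing point concrete, here is a minimal sketch using the two Series that appear in this review: the animal speeds from the Series.combine docstring and the draft example from this PR. The variable names are illustrative.

```python
import pandas as pd

# Animal speeds Series quoted in the review comment
speeds = pd.Series({"falcon": 345.0, "eagle": 200.0, "duck": 30.0})
fastest_pos = speeds.argmax()             # zero-based position
fastest_name = speeds.index[fastest_pos]  # look the label up from the position

# Draft example from this PR; the missing value still occupies position 1
draft = pd.Series([1, None, 5, 4, 5], index=["A", "B", "C", "D", "E"])
draft_pos = draft.argmax()           # position 2 is the third element
draft_value = draft.iloc[draft_pos]  # value stored at that position

print(fastest_pos, fastest_name, draft_pos, draft_value)
```

The draft example returns 2 with the default `skipna=True`: the NaN at position 1 is ignored when finding the maximum but still counts toward the position.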

>>> s
A    1.0
B    NaN
C    5.0
D    4.0
E    5.0
dtype: float64

>>> s.argmax()
2
"""
nv.validate_minmax_axis(axis)
nv.validate_argmax_with_skipna(skipna, args, kwargs)