DOC: Enforce Numpy Docstring Validation for pandas.DataFrame.sum #58565

Merged
merged 3 commits into from
May 5, 2024
1 change: 0 additions & 1 deletion ci/code_checks.sh
@@ -76,7 +76,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
-i "pandas.DataFrame.min RT03" \
-i "pandas.DataFrame.plot PR02,SA01" \
-i "pandas.DataFrame.std PR01,RT03,SA01" \
-i "pandas.DataFrame.sum RT03" \
-i "pandas.DataFrame.swaplevel SA01" \
-i "pandas.DataFrame.to_markdown SA01" \
-i "pandas.Grouper PR02" \
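For context on the ignore entry dropped above: as far as I know, RT03 is the numpydoc validation code for "Return value has no description". The sketch below is a simplified stand-in for that kind of check, not numpydoc's actual implementation. The function name `returns_has_description` and both sample docstrings are hypothetical and exist only for illustration.

```python
import inspect


def returns_has_description(docstring: str) -> bool:
    """Hypothetical, simplified RT03-style check (not numpydoc's code)."""
    lines = inspect.cleandoc(docstring).splitlines()
    for i, line in enumerate(lines):
        # A numpydoc section is a title followed by a dashed underline.
        has_underline = i + 1 < len(lines) and set(lines[i + 1].strip()) == {"-"}
        if line.strip() == "Returns" and has_underline:
            body = lines[i + 2:]
            break
    else:
        return False  # no Returns section at all

    # body[0] is the type line; a description is an indented line below it.
    for line in body[1:]:
        if not line.strip():
            break  # a blank line ends the entry
        if line.startswith((" ", "\t")):
            return True
    return False


# Hypothetical docstring with a return type but no description.
without_description = """
    Return the sum.

    Returns
    -------
    Series or scalar
    """

# Same docstring with the description line this PR's docstring includes.
with_description = """
    Return the sum.

    Returns
    -------
    Series or scalar
        Sum over requested axis.
    """

missing = returns_has_description(without_description)
present = returns_has_description(with_description)
```

Under these assumptions, `missing` is `False` and `present` is `True`, which mirrors why a `Returns` entry needs an indented description line before an RT03 ignore can be removed.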
82 changes: 81 additions & 1 deletion pandas/core/frame.py
@@ -11709,7 +11709,6 @@ def max(
return result

@deprecate_nonkeyword_arguments(version="3.0", allowed_args=["self"], name="sum")
@doc(make_doc("sum", ndim=2))
def sum(
self,
axis: Axis | None = 0,
@@ -11718,6 +11717,87 @@ def sum(
min_count: int = 0,
**kwargs,
) -> Series:
"""
Return the sum of the values over the requested axis.

This is equivalent to the method ``numpy.sum``.

Parameters
----------
axis : {index (0), columns (1)}
Axis for the function to be applied on.
For `Series` this parameter is unused and defaults to 0.

.. warning::

The behavior of DataFrame.sum with ``axis=None`` is deprecated;
in a future version this will reduce over both axes and return a scalar.
To retain the old behavior, pass ``axis=0`` (or do not pass axis).

.. versionadded:: 2.0.0

skipna : bool, default True
Exclude NA/null values when computing the result.
numeric_only : bool, default False
Include only float, int, boolean columns. Not implemented for Series.
min_count : int, default 0
The required number of valid values to perform the operation. If fewer than
``min_count`` non-NA values are present the result will be NA.
**kwargs
Additional keyword arguments to be passed to the function.

Returns
-------
Series or scalar
Sum over requested axis.

See Also
--------
Series.sum : Return the sum over Series values.
DataFrame.mean : Return the mean of the values over the requested axis.
DataFrame.median : Return the median of the values over the requested axis.
DataFrame.mode : Get the mode(s) of each element along the requested axis.
DataFrame.std : Return the standard deviation of the values over the
requested axis.

Examples
--------
>>> idx = pd.MultiIndex.from_arrays(
... [["warm", "warm", "cold", "cold"], ["dog", "falcon", "fish", "spider"]],
... names=["blooded", "animal"],
... )
>>> s = pd.Series([4, 2, 0, 8], name="legs", index=idx)
>>> s
blooded animal
warm dog 4
falcon 2
cold fish 0
spider 8
Name: legs, dtype: int64

>>> s.sum()
14

By default, the sum of an empty or all-NA Series is ``0``.

>>> pd.Series([], dtype="float64").sum() # min_count=0 is the default
0.0

This can be controlled with the ``min_count`` parameter. For example, if
you'd like the sum of an empty series to be NaN, pass ``min_count=1``.

>>> pd.Series([], dtype="float64").sum(min_count=1)
nan

Thanks to the ``skipna`` parameter, ``min_count`` handles all-NA and
empty series identically.

>>> pd.Series([np.nan]).sum()
0.0

>>> pd.Series([np.nan]).sum(min_count=1)
nan
"""
result = super().sum(
axis=axis,
skipna=skipna,
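The examples in the new docstring use a `Series`. As a complement, here is a minimal sketch of how the documented parameters (`axis`, `skipna`, `numeric_only`, `min_count`) behave on a `DataFrame`. The frame `df` and its column names are made up for illustration and are not part of the PR.

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen only to exercise each parameter.
df = pd.DataFrame(
    {
        "a": [1.0, 2.0, np.nan],
        "b": [np.nan, np.nan, np.nan],
        "label": ["x", "y", "z"],
    }
)

# Column-wise sum (axis=0 is the default); numeric_only=True skips "label".
col_sums = df.sum(numeric_only=True)  # a -> 3.0, b -> 0.0

# Requiring at least one valid value turns the all-NA column into NaN.
col_sums_min1 = df.sum(numeric_only=True, min_count=1)  # a -> 3.0, b -> NaN

# skipna=False lets any NaN in a column propagate to its result.
col_sums_strict = df.sum(numeric_only=True, skipna=False)  # a -> NaN, b -> NaN

# Row-wise sum over the numeric columns.
row_sums = df[["a", "b"]].sum(axis=1)  # 1.0, 2.0, 0.0
```

All arguments are passed by keyword here, in line with the `deprecate_nonkeyword_arguments` decorator shown in the diff.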