DOC: Enforce Numpy Docstring Validation for pandas.DataFrame.var #58568

Merged
merged 4 commits into from
May 5, 2024
1 change: 0 additions & 1 deletion ci/code_checks.sh
@@ -79,7 +79,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
-i "pandas.DataFrame.sum RT03" \
-i "pandas.DataFrame.swaplevel SA01" \
-i "pandas.DataFrame.to_markdown SA01" \
- -i "pandas.DataFrame.var PR01,RT03,SA01" \
-i "pandas.Grouper PR02" \
-i "pandas.Index PR07" \
-i "pandas.Index.join PR07,RT03,SA01" \
70 changes: 69 additions & 1 deletion pandas/core/frame.py
@@ -12065,7 +12065,6 @@ def var(
) -> Series | Any: ...

@deprecate_nonkeyword_arguments(version="3.0", allowed_args=["self"], name="var")
- @doc(make_doc("var", ndim=2))
def var(
self,
axis: Axis | None = 0,
@@ -12074,6 +12073,75 @@ def var(
numeric_only: bool = False,
**kwargs,
) -> Series | Any:
"""
Return unbiased variance over requested axis.

Normalized by N-1 by default. This can be changed using the ddof argument.

Parameters
----------
axis : {index (0), columns (1)}
For `Series` this parameter is unused and defaults to 0.

.. warning::

The behavior of DataFrame.var with ``axis=None`` is deprecated.
In a future version this will reduce over both axes and return a scalar.
To retain the old behavior, pass ``axis=0`` (or do not pass axis).

skipna : bool, default True
Exclude NA/null values. If an entire row/column is NA, the result
will be NA.
ddof : int, default 1
Delta Degrees of Freedom. The divisor used in calculations is N - ddof,
where N represents the number of elements.
numeric_only : bool, default False
Include only float, int, boolean columns. Not implemented for Series.
**kwargs :
Additional keyword arguments, forwarded unchanged to the parent implementation.

Returns
-------
Series or scalar
Unbiased variance over requested axis.

See Also
--------
numpy.var : Equivalent function in NumPy.
Series.var : Return unbiased variance over Series values.
Series.std : Return standard deviation over Series values.
DataFrame.std : Return standard deviation of the values over
the requested axis.

Examples
--------
>>> df = pd.DataFrame(
...     {
...         "person_id": [0, 1, 2, 3],
...         "age": [21, 25, 62, 43],
...         "height": [1.61, 1.87, 1.49, 2.01],
...     }
... ).set_index("person_id")
>>> df
           age  height
person_id
0           21    1.61
1           25    1.87
2           62    1.49
3           43    2.01

>>> df.var()
age       352.916667
height      0.056367
dtype: float64

Alternatively, ``ddof=0`` can be set to normalize by N instead of N-1:

>>> df.var(ddof=0)
age       264.687500
height      0.042275
dtype: float64
"""
result = super().var(
axis=axis, skipna=skipna, ddof=ddof, numeric_only=numeric_only, **kwargs
)
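The new docstring says the divisor is N - ddof. As a rough illustration of what that means, the sketch below recomputes the variance of the "age" column from the docstring example using only the Python standard library. The `variance` helper is hypothetical, written for this note, and is not pandas' implementation; it only mirrors the formula the docstring states.

```python
import math
import statistics


def variance(values, ddof=1):
    """Hypothetical helper: sum of squared deviations divided by N - ddof."""
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more than ddof observations")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - ddof)


ages = [21, 25, 62, 43]  # the "age" column from the docstring example

sample_var = variance(ages, ddof=1)      # divisor N - 1 = 3
population_var = variance(ages, ddof=0)  # divisor N = 4

# Cross-check against the standard library's own estimators.
assert math.isclose(sample_var, statistics.variance(ages))
assert math.isclose(population_var, statistics.pvariance(ages))

print(round(sample_var, 6))      # 352.916667
print(round(population_var, 6))  # 264.6875
```

Both values agree with the `age` entries shown for `df.var()` and `df.var(ddof=0)` in the docstring's Examples section.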