Commit 21a3b2f

DOC: Remove computation.rst in favor of better docstrings (#46170)
* DOC: Remove computation.rst in favor of better docstrings: * Remove other ref
1 parent 367f8a1 commit 21a3b2f

9 files changed, +74 -225 lines changed

doc/source/user_guide/computation.rst

-212
This file was deleted.

doc/source/user_guide/index.rst

-1
@@ -76,7 +76,6 @@ Guides
 boolean
 visualization
 style
-computation
 groupby
 window
 timeseries

doc/source/user_guide/window.rst

+10 -4
@@ -427,10 +427,16 @@ can even be omitted:
 .. note::

 Missing values are ignored and each entry is computed using the pairwise
-complete observations. Please see the :ref:`covariance section
-<computation.covariance>` for :ref:`caveats
-<computation.covariance.caveats>` associated with this method of
-calculating covariance and correlation matrices.
+complete observations.
+
+Assuming the missing data are missing at random this results in an estimate
+for the covariance matrix which is unbiased. However, for many applications
+this estimate may not be acceptable because the estimated covariance matrix
+is not guaranteed to be positive semi-definite. This could lead to
+estimated correlations having absolute values which are greater than one,
+and/or a non-invertible covariance matrix. See `Estimation of covariance
+matrices <https://en.wikipedia.org/w/index.php?title=Estimation_of_covariance_matrices>`_
+for more details.

 .. ipython:: python
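Reviewer note: the caveat added to the note above can be reproduced with a small sketch. The data below is hypothetical and chosen so that each pair of columns overlaps on a different block of rows; it is not taken from the pandas docs. Because every entry of `DataFrame.cov` is computed from the rows where both columns are present, the resulting matrix here has a negative eigenvalue, and the correlation implied by its entries exceeds one.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Hypothetical data: x/y overlap on rows 0-2, y/z on rows 3-5, x/z on rows 6-8.
df = pd.DataFrame(
    {
        "x": [1, 2, 3, nan, nan, nan, 1, 2, 3],
        "y": [1, 2, 3, 1, 2, 3, nan, nan, nan],
        "z": [nan, nan, nan, 1, 2, 3, 3, 2, 1],
    }
)

# Each entry is estimated from the pairwise complete observations only.
cov = df.cov()

# A positive semi-definite matrix has no negative eigenvalues; this one does.
min_eigenvalue = np.linalg.eigvalsh(cov.to_numpy()).min()

# Correlation implied by mixing entries that were estimated on different rows.
implied_corr_xy = cov.loc["x", "y"] / np.sqrt(cov.loc["x", "x"] * cov.loc["y", "y"])

print(cov)
print("smallest eigenvalue:", min_eigenvalue)
print("implied corr(x, y):", implied_corr_xy)
```

Dropping incomplete rows first (`df.dropna().cov()`) avoids the problem at the cost of discarding data; with this particular frame no complete rows would remain.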

doc/source/whatsnew/v0.6.0.rst

+1 -1
@@ -24,7 +24,7 @@ New features
 - :ref:`Added <groupby.multiindex>` multiple levels to groupby (:issue:`103`)
 - :ref:`Allow <basics.sorting>` multiple columns in ``by`` argument of ``DataFrame.sort_index`` (:issue:`92`, :issue:`362`)
 - :ref:`Added <indexing.basics.get_value>` fast ``get_value`` and ``put_value`` methods to DataFrame (:issue:`360`)
-- :ref:`Added <computation.covariance>` ``cov`` instance methods to Series and DataFrame (:issue:`194`, :issue:`362`)
+- Added ``cov`` instance methods to Series and DataFrame (:issue:`194`, :issue:`362`)
 - :ref:`Added <visualization.barplot>` ``kind='bar'`` option to ``DataFrame.plot`` (:issue:`348`)
 - :ref:`Added <basics.idxmin>` ``idxmin`` and ``idxmax`` to Series and DataFrame (:issue:`286`)
 - :ref:`Added <io.clipboard>` ``read_clipboard`` function to parse DataFrame from clipboard (:issue:`300`)

doc/source/whatsnew/v0.6.1.rst

+2 -2
@@ -7,7 +7,7 @@ Version 0.6.1 (December 13, 2011)
 New features
 ~~~~~~~~~~~~
 - Can append single rows (as Series) to a DataFrame
-- Add Spearman and Kendall rank :ref:`correlation <computation.correlation>`
+- Add Spearman and Kendall rank correlation
 options to Series.corr and DataFrame.corr (:issue:`428`)
 - :ref:`Added <indexing.basics.get_value>` ``get_value`` and ``set_value`` methods to
 Series, DataFrame, and Panel for very low-overhead access (>2x faster in many
@@ -19,7 +19,7 @@ New features
 - Implement new :ref:`SparseArray <sparse.array>` and ``SparseList``
 data structures. SparseSeries now derives from SparseArray (:issue:`463`)
 - :ref:`Better console printing options <basics.console_output>` (:issue:`453`)
-- Implement fast :ref:`data ranking <computation.ranking>` for Series and
+- Implement fast data ranking for Series and
 DataFrame, fast versions of scipy.stats.rankdata (:issue:`428`)
 - Implement ``DataFrame.from_items`` alternate
 constructor (:issue:`444`)

doc/source/whatsnew/v0.8.0.rst

+1 -1
@@ -145,7 +145,7 @@ Other new features
 - Add :ref:`'kde' <visualization.kde>` plot option for density plots
 - Support for converting DataFrame to R data.frame through rpy2
 - Improved support for complex numbers in Series and DataFrame
-- Add :ref:`pct_change <computation.pct_change>` method to all data structures
+- Add ``pct_change`` method to all data structures
 - Add max_colwidth configuration option for DataFrame console output
 - :ref:`Interpolate <missing_data.interpolate>` Series values using index values
 - Can select multiple columns from GroupBy

pandas/core/frame.py

+38 -2
@@ -9592,6 +9592,14 @@ def corr(
 DataFrame or Series.
 Series.corr : Compute the correlation between two Series.

+Notes
+-----
+Pearson, Kendall and Spearman correlation are currently computed using pairwise complete observations.
+
+* `Pearson correlation coefficient <https://en.wikipedia.org/wiki/Pearson_correlation_coefficient>`_
+* `Kendall rank correlation coefficient <https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient>`_
+* `Spearman's rank correlation coefficient <https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient>`_
+
 Examples
 --------
 >>> def histogram_intersection(a, b):
@@ -9603,7 +9611,14 @@ def corr(
 dogs cats
 dogs 1.0 0.3
 cats 0.3 1.0
-"""
+
+>>> df = pd.DataFrame([(1, 1), (2, np.nan), (np.nan, 3), (4, 4)],
+... columns=['dogs', 'cats'])
+>>> df.corr(min_periods=3)
+dogs cats
+dogs 1.0 NaN
+cats NaN 1.0
+""" # noqa:E501
 numeric_df = self._get_numeric_data()
 cols = numeric_df.columns
 idx = cols.copy()
@@ -9797,7 +9812,28 @@ def corrwith(self, other, axis: Axis = 0, drop=False, method="pearson") -> Serie
 See Also
 --------
 DataFrame.corr : Compute pairwise correlation of columns.
-"""
+
+Examples
+--------
+>>> index = ["a", "b", "c", "d", "e"]
+>>> columns = ["one", "two", "three", "four"]
+>>> df1 = pd.DataFrame(np.arange(20).reshape(5, 4), index=index, columns=columns)
+>>> df2 = pd.DataFrame(np.arange(16).reshape(4, 4), index=index[:4], columns=columns)
+>>> df1.corrwith(df2)
+one 1.0
+two 1.0
+three 1.0
+four 1.0
+dtype: float64
+
+>>> df2.corrwith(df1, axis=1)
+a 1.0
+b 1.0
+c 1.0
+d 1.0
+e NaN
+dtype: float64
+""" # noqa:E501
 axis = self._get_axis_number(axis)
 this = self._get_numeric_data()
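Reviewer note: the new Notes section lists the three supported methods without relating them. As a rough illustration (the frame below reuses the values from the docstring's `histogram_intersection` example, but the comparison itself is an addition, not part of this commit), Spearman correlation can be checked against Pearson correlation applied to average ranks:

```python
import numpy as np
import pandas as pd

# Values borrowed from the docstring example; column names as in the docstring.
df = pd.DataFrame({"dogs": [0.2, 0.0, 0.6, 0.2], "cats": [0.3, 0.6, 0.0, 0.1]})

# Spearman correlation as computed by DataFrame.corr.
spearman = df.corr(method="spearman")

# Pearson correlation of the ranked data (ties get the average rank by default).
pearson_on_ranks = df.rank().corr(method="pearson")

print(spearman)
print(pearson_on_ranks)
```

The two matrices should agree up to floating-point error when the data has no missing values.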

pandas/core/generic.py

+13 -1
@@ -8522,6 +8522,18 @@ def rank(
 3 spider 8.0
 4 snake NaN

+Ties are assigned the mean of the ranks (by default) for the group.
+
+>>> s = pd.Series(range(5), index=list("abcde"))
+>>> s["d"] = s["b"]
+>>> s.rank()
+a 1.0
+b 2.5
+c 4.0
+d 2.5
+e 5.0
+dtype: float64
+
 The following example shows how the method behaves with the above
 parameters:

@@ -10251,7 +10263,7 @@ def pct_change(
 periods : int, default 1
 Periods to shift for forming percent change.
 fill_method : str, default 'pad'
-How to handle NAs before computing percent changes.
+How to handle NAs **before** computing percent changes.
 limit : int, default None
 The number of consecutive NAs to fill before stopping.
 freq : DateOffset, timedelta, or str, optional
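Reviewer note: the emphasis on **before** in the `fill_method` description is easier to see with a small sketch. The series below is hypothetical. To stay independent of the default `fill_method`, which may differ between pandas versions, the fill is done explicitly with `ffill()` and `fill_method=None` is passed in both calls.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a gap in the middle.
s = pd.Series([100.0, np.nan, 110.0])

# No filling: every change touching the gap is NaN.
unfilled = s.pct_change(fill_method=None)

# Forward-fill first, then compute the change on the filled values.
filled_first = s.ffill().pct_change(fill_method=None)

print(unfilled)
print(filled_first)
```

With the gap filled first, the middle entry becomes a 0% change and the last entry is measured against 100 rather than against the missing value.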

pandas/core/series.py

+9 -1
@@ -2566,6 +2566,14 @@ def corr(self, other, method="pearson", min_periods=None) -> float:
 DataFrame.corrwith : Compute pairwise correlation with another
 DataFrame or Series.

+Notes
+-----
+Pearson, Kendall and Spearman correlation are currently computed using pairwise complete observations.
+
+* `Pearson correlation coefficient <https://en.wikipedia.org/wiki/Pearson_correlation_coefficient>`_
+* `Kendall rank correlation coefficient <https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient>`_
+* `Spearman's rank correlation coefficient <https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient>`_
+
 Examples
 --------
 >>> def histogram_intersection(a, b):
@@ -2575,7 +2583,7 @@ def corr(self, other, method="pearson", min_periods=None) -> float:
 >>> s2 = pd.Series([.3, .6, .0, .1])
 >>> s1.corr(s2, method=histogram_intersection)
 0.3
-"""
+""" # noqa:E501
 this, other = self.align(other, join="inner", copy=False)
 if len(this) == 0:
 return np.nan
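Reviewer note: "pairwise complete observations" in the new Notes section means that only positions where both series have a value contribute to the result. A minimal sketch with hypothetical data, comparing `Series.corr` against `numpy.corrcoef` on the manually filtered values:

```python
import numpy as np
import pandas as pd

# Hypothetical series; each has one missing value at a different position.
s1 = pd.Series([1.0, 2.0, np.nan, 4.0, 5.0])
s2 = pd.Series([2.0, np.nan, 6.0, 8.0, 11.0])

# Series.corr drops positions where either value is missing.
pairwise = s1.corr(s2)

# The same calculation done by hand on the rows where both are present.
both = s1.notna() & s2.notna()
manual = np.corrcoef(s1[both], s2[both])[0, 1]

print(pairwise, manual)
```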
