
Commit 904af03

Daniel Saxton authored and jreback committed
PR #22761: add cookbook entry for callable correlation method (#22761)
1 parent 66c2e5f commit 904af03

1 file changed: +36 -0 lines changed

doc/source/cookbook.rst

+36
@@ -1223,6 +1223,42 @@ Computation
`Numerical integration (sample-based) of a time series
<http://nbviewer.ipython.org/5720498>`__

Correlation
***********

The `method` argument of `DataFrame.corr` accepts a callable in addition to the named correlation types. Here we compute the `distance correlation <https://en.wikipedia.org/wiki/Distance_correlation>`__ matrix for a `DataFrame` object.

.. ipython:: python

   def distcorr(x, y):
       n = len(x)
       a = np.zeros(shape=(n, n))
       b = np.zeros(shape=(n, n))

       # Fill the upper triangles with pairwise absolute differences.
       for i in range(n):
           for j in range(i + 1, n):
               a[i, j] = abs(x[i] - x[j])
               b[i, j] = abs(y[i] - y[j])

       # Mirror the upper triangles to get symmetric distance matrices.
       a += a.T
       b += b.T

       # Column means, repeated across rows, for double centering.
       a_bar = np.vstack([np.nanmean(a, axis=0)] * n)
       b_bar = np.vstack([np.nanmean(b, axis=0)] * n)

       # Double-centered distance matrices.
       A = a - a_bar - a_bar.T + np.full(shape=(n, n), fill_value=a_bar.mean())
       B = b - b_bar - b_bar.T + np.full(shape=(n, n), fill_value=b_bar.mean())

       # Distance covariance and square roots of the distance variances.
       cov_ab = np.sqrt(np.nansum(A * B)) / n
       std_a = np.sqrt(np.sqrt(np.nansum(A**2)) / n)
       std_b = np.sqrt(np.sqrt(np.nansum(B**2)) / n)

       return cov_ab / std_a / std_b

   df = pd.DataFrame(np.random.normal(size=(100, 3)))

   df.corr(method=distcorr)
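
The nested loop above is quadratic in pure Python, so it can be slow for long columns. As a rough sketch only, and not part of this commit, the same statistic can be computed with NumPy broadcasting. The name `distcorr_vectorized` is hypothetical, and the sketch assumes the inputs contain no missing values (it uses plain `mean` and `sum` rather than the `nan`-aware variants).

```python
import numpy as np
import pandas as pd


def distcorr_vectorized(x, y):
    """Illustrative, loop-free variant of distcorr (hypothetical name)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)

    # Pairwise absolute differences via broadcasting, shape (n, n).
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])

    # Double centering: subtract row and column means, add the grand mean.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()

    # Clamp at zero to guard against tiny negative sums from rounding.
    dcov = np.sqrt(max((A * B).sum(), 0.0)) / n
    dvar_x = np.sqrt((A ** 2).sum()) / n
    dvar_y = np.sqrt((B ** 2).sum()) / n

    return dcov / np.sqrt(dvar_x * dvar_y)


df = pd.DataFrame(np.random.normal(size=(100, 3)))
result = df.corr(method=distcorr_vectorized)
```

The trade-off is memory: broadcasting materializes two full `n x n` arrays at once, which is fine for a few thousand rows but may not be for much longer columns.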

Timedeltas
----------
