REG: fix regression in df.corrwith on tied data when method is spearman #49032

Closed
wants to merge 12 commits into from
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.5.1.rst
@@ -86,6 +86,7 @@ Fixed regressions
- Fixed regression in :meth:`DataFrame.apply` when passing non-zero ``axis`` via keyword argument (:issue:`48656`)
- Fixed regression in :meth:`Series.groupby` and :meth:`DataFrame.groupby` when the grouper is a nullable data type (e.g. :class:`Int64`) or a PyArrow-backed string array, contains null values, and ``dropna=False`` (:issue:`48794`)
- Fixed regression in :class:`ExcelWriter` where the ``book`` attribute could no longer be set; however setting this attribute is now deprecated and this ability will be removed in a future version of pandas (:issue:`48780`)
+- Fixed regression in :meth:`DataFrame.corrwith` when computing correlation on tied data with ``method="spearman"`` (:issue:`48826`)

.. ---------------------------------------------------------------------------

5 changes: 3 additions & 2 deletions pandas/core/frame.py
@@ -10605,11 +10605,12 @@ def corrwith(
0, 1
]
else:
+from scipy.stats import rankdata
for i, r in enumerate(ndf):
nonnull_mask = ~np.isnan(r) & ~np.isnan(k)
corrs[cols[i]] = np.corrcoef(
-r[nonnull_mask].argsort().argsort(),
-k[nonnull_mask].argsort().argsort(),
+rankdata(r[nonnull_mask]),
+rankdata(k[nonnull_mask]),
)[0, 1]
return Series(corrs)
else:
9 changes: 9 additions & 0 deletions pandas/tests/frame/methods/test_cov_corr.py
@@ -399,6 +399,15 @@ def test_corrwith_spearman(self):
expected = Series(np.ones(len(result)))
tm.assert_series_equal(result, expected)

+@td.skip_if_no_scipy
+def test_corrwith_spearman_with_tied_data(self):
+# GH#48826
+df = DataFrame({"A": [2, np.nan, 8, 9], "B": [0, 1, 1, 0]})
+s = Series([0, 1, 1, 0])
+result = df.corrwith(s, method="spearman")
+expected = Series([0.0, 1.0], index=["A", "B"])
+tm.assert_series_equal(result, expected)
+
@td.skip_if_no_scipy
def test_corrwith_kendall(self):
# GH#21925
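For readers following the change to `corrwith`, here is a minimal sketch of why `argsort().argsort()` gives the wrong Spearman coefficient on tied data. It reuses column `"A"` and the series from the new test after the NaN row is masked out. The helpers `ordinal_ranks` and `average_ranks` are hypothetical names written for this illustration, not pandas code. `average_ranks` is a NumPy-only stand-in for what `scipy.stats.rankdata` does with its default `"average"` method, and the stable sort is an assumption made here so the tie order is deterministic.

```python
import numpy as np


def ordinal_ranks(values):
    # Old approach: double argsort. Tied values get distinct, arbitrary ranks.
    # kind="stable" is an assumption made here for a deterministic tie order.
    values = np.asarray(values, dtype=float)
    return values.argsort(kind="stable").argsort(kind="stable")


def average_ranks(values):
    # Hypothetical stand-in for scipy.stats.rankdata(values) with the default
    # "average" method: tied values share the mean of the ranks they span.
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    sorted_vals = values[order]
    ranks = np.empty(len(values), dtype=float)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and sorted_vals[j + 1] == sorted_vals[i]:
            j += 1
        # Positions i..j (0-based) hold equal values; 1-based average rank.
        ranks[order[i : j + 1]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


# Column "A" and the series from the test, with the NaN row dropped.
r = np.array([2.0, 8.0, 9.0])
k = np.array([0.0, 1.0, 0.0])  # contains a tie

ordinal_corr = np.corrcoef(ordinal_ranks(r), ordinal_ranks(k))[0, 1]
average_corr = np.corrcoef(average_ranks(r), average_ranks(k))[0, 1]

print(ordinal_corr)  # about 0.5: the tie was broken arbitrarily
print(average_corr)  # about 0.0: matches the expected value in the test
```

With ordinal ranks the two zeros in `k` are ranked 0 and 1 purely by position, which manufactures a correlation that is not in the data. Averaging the tied ranks removes that artifact, which is what the test's expected value of `0.0` for column `"A"` reflects.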