Commit 061a51f

BUG: Inconsistent result for corr with constant columns
1 parent efb068f commit 061a51f

3 files changed, +13 -2 lines changed


doc/source/whatsnew/v1.2.0.rst

Lines changed: 1 addition & 0 deletions
@@ -404,6 +404,7 @@ Numeric
 - Bug in :class:`IntervalArray` comparisons with :class:`Series` not returning :class:`Series` (:issue:`36908`)
 - Bug in :class:`DataFrame` allowing arithmetic operations with list of array-likes with undefined results. Behavior changed to raising ``ValueError`` (:issue:`36702`)
 - Bug in :meth:`DataFrame.std`` with ``timedelta64`` dtype and ``skipna=False`` (:issue:`37392`)
+- Bug in :meth:`DataFrame.corr` returned inconsistent results for constant columns (:issue:`37448`)
 
 Conversion
 ^^^^^^^^^^
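
One likely source of the inconsistency named in this entry is rounding drift when a constant is accumulated with plain floating-point addition: the running total, and therefore the mean, need not equal the value a reader would expect. The sketch below is an illustration only, not pandas code; `values` and `naive_total` are hypothetical names, and the explicit loop is used on purpose to mimic a simple accumulation loop.

```python
import math

# Illustration only: ten copies of the same constant.
values = [0.1] * 10

# Plain left-to-right accumulation, one addition per element.
naive_total = 0.0
for v in values:
    naive_total += v

# math.fsum tracks the lost low-order bits and returns a correctly rounded sum.
exact_total = math.fsum(values)

# The two totals differ in the last bit, so quantities derived from the
# naive total can carry a tiny residual instead of being exact.
print(naive_total == exact_total)
```

Whether such a residual appears depends on the constant and on the number of rows, which is consistent with the result varying by column length.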

pandas/_libs/algos.pyx

Lines changed: 2 additions & 2 deletions
@@ -300,7 +300,6 @@ def nancorr(const float64_t[:, :] mat, bint cov=False, minp=None):
 
 # now the cov numerator
 sumx = 0
-
 for i in range(N):
     if mask[i, xi] and mask[i, yi]:
         vx = mat[i, xi] - meanx
@@ -312,7 +311,8 @@ def nancorr(const float64_t[:, :] mat, bint cov=False, minp=None):
 
 divisor = (nobs - 1.0) if cov else sqrt(sumxx * sumyy)
 
-if divisor != 0:
+# numerical issues for constant columns
+if divisor > 1e-15:
     result[xi, yi] = result[yi, xi] = sumx / divisor
 else:
     result[xi, yi] = result[yi, xi] = NaN
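
The effect of replacing the exact `divisor != 0` check with `divisor > 1e-15` can be sketched in plain Python. This is a simplified stand-in under stated assumptions, not the Cython routine: it handles two columns only, ignores the NaN mask, `minp`, and the `cov` branch, and the names `pearson` and `tol` are hypothetical.

```python
import math

def pearson(xs, ys, tol=1e-15):
    """Naive two-pass Pearson correlation (hypothetical helper, not pandas API)."""
    n = len(xs)

    # First pass: column means via plain accumulation.
    sum_x = sum_y = 0.0
    for x, y in zip(xs, ys):
        sum_x += x
        sum_y += y
    mean_x = sum_x / n
    mean_y = sum_y / n

    # Second pass: covariance numerator and the two sums of squares.
    num = sum_xx = sum_yy = 0.0
    for x, y in zip(xs, ys):
        vx = x - mean_x
        vy = y - mean_y
        num += vx * vy
        sum_xx += vx * vx
        sum_yy += vy * vy

    divisor = math.sqrt(sum_xx * sum_yy)

    # A divisor at or below tol is treated as zero, so rounding residue
    # from constant columns yields NaN instead of an arbitrary ratio.
    if divisor > tol:
        return num / divisor
    return math.nan

# Constant columns, mirroring the values used in the accompanying test.
for length in [2, 20, 200, 2000, 20000]:
    print(length, pearson([0.4] * length, [0.1] * length))
```

With an exact comparison, a rounding residue in the sums of squares may make the divisor tiny but nonzero, and the ratio of two such residues is meaningless. The threshold trades that for a small blind spot: columns whose true spread is below the tolerance are also reported as NaN.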

pandas/tests/frame/methods/test_cov_corr.py

Lines changed: 10 additions & 0 deletions
@@ -208,6 +208,16 @@ def test_corr_item_cache(self):
         assert df["A"] is ser
         assert df.values[0, 0] == 99
 
+    @pytest.mark.parametrize("length", [2, 20, 200, 2000, 20000])
+    def test_corr_for_constant_columns(self, length):
+        # GH: 37448
+        df = DataFrame(length * [[0.4, 0.1]], columns=["A", "B"])
+        result = df.corr()
+        expected = pd.DataFrame(
+            {"A": [np.nan, np.nan], "B": [np.nan, np.nan]}, index=["A", "B"]
+        )
+        tm.assert_frame_equal(result, expected)
+
 
 class TestDataFrameCorrWith:
     def test_corrwith(self, datetime_frame):
