Commit 462b21d

Fix bug in combine_first with string dtype and NA only in one level of index (#37568)
1 parent 07c9384 commit 462b21d

3 files changed: +14 -2 lines changed


doc/source/whatsnew/v1.2.0.rst

+1 -1
@@ -531,7 +531,7 @@ Reshaping
 - Bug in :meth:`Series.transform` would give incorrect results or raise when the argument ``func`` was dictionary (:issue:`35811`)
 - Bug in :meth:`DataFrame.pivot` did not preserve :class:`MultiIndex` level names for columns when rows and columns both multiindexed (:issue:`36360`)
 - Bug in :func:`join` returned a non deterministic level-order for the resulting :class:`MultiIndex` (:issue:`36910`)
--
+- Bug in :meth:`DataFrame.combine_first()` caused wrong alignment with dtype ``string`` and one level of ``MultiIndex`` containing only ``NA`` (:issue:`37591`)

 Sparse
 ^^^^^^

pandas/core/arrays/categorical.py

+1 -1
@@ -322,7 +322,7 @@ def __init__(
             # sanitize_array coerces np.nan to a string under certain versions
             # of numpy
             values = maybe_infer_to_datetimelike(values, convert_dates=True)
-            if not isinstance(values, np.ndarray):
+            if not isinstance(values, (np.ndarray, ExtensionArray)):
                 values = com.convert_to_list_like(values)

                 # By convention, empty lists result in object dtype:

pandas/tests/frame/methods/test_combine_first.py

+12 -0
@@ -353,3 +353,15 @@ def test_combine_first_with_asymmetric_other(self, val):
         exp = DataFrame({"isBool": [True], "isNum": [val]})

         tm.assert_frame_equal(res, exp)
+
+    def test_combine_first_string_dtype_only_na(self):
+        # GH: 37519
+        df = DataFrame({"a": ["962", "85"], "b": [pd.NA] * 2}, dtype="string")
+        df2 = DataFrame({"a": ["85"], "b": [pd.NA]}, dtype="string")
+        df.set_index(["a", "b"], inplace=True)
+        df2.set_index(["a", "b"], inplace=True)
+        result = df.combine_first(df2)
+        expected = DataFrame(
+            {"a": ["962", "85"], "b": [pd.NA] * 2}, dtype="string"
+        ).set_index(["a", "b"])
+        tm.assert_frame_equal(result, expected)
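
For readers who want to try the scenario outside the test suite, the following is a minimal standalone sketch of what the new test exercises. It assumes a pandas release that already includes this fix. The names `left` and `right` are illustrative and do not appear in the commit, and the chained `set_index` call is a stylistic substitute for the `inplace=True` calls used in the test.

```python
import pandas as pd

# Two frames with nullable "string" dtype. Both columns become index
# levels, and level "b" contains nothing but pd.NA.
left = pd.DataFrame(
    {"a": ["962", "85"], "b": [pd.NA] * 2}, dtype="string"
).set_index(["a", "b"])
right = pd.DataFrame(
    {"a": ["85"], "b": [pd.NA]}, dtype="string"
).set_index(["a", "b"])

# combine_first aligns both frames on the union of their MultiIndexes.
# The key ("85", NA) is present in both, so it should appear only once.
result = left.combine_first(right)

print(result.index)
```

The test added in this commit expects the result to match `left` exactly, because `right` contributes no index keys that `left` does not already have.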
