BUG: transpose casts mixed dtypes to object #43340

Closed
wants to merge 13 commits into from
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.4.0.rst
@@ -488,6 +488,7 @@ Reshaping
- Improved error message when creating a :class:`DataFrame` column from a multi-dimensional :class:`numpy.ndarray` (:issue:`42463`)
- :func:`concat` creating :class:`MultiIndex` with duplicate level entries when concatenating a :class:`DataFrame` with duplicates in :class:`Index` and multiple keys (:issue:`42651`)
- Bug in :meth:`pandas.cut` on :class:`Series` with duplicate indices (:issue:`42185`) and non-exact :meth:`pandas.CategoricalIndex` (:issue:`42425`)
- Bug in :meth:`DataFrame.transpose`, where mixed dtypes were cast to ``object`` (:issue:`43337`)
Contributor

This says the opposite of what you are doing. Make it clearer to a reader.

- Bug in :meth:`DataFrame.append` failing to retain dtypes when appended columns do not match (:issue:`43392`)
- Bug in :func:`concat` of ``bool`` and ``boolean`` dtypes resulting in ``object`` dtype instead of ``boolean`` dtype (:issue:`42800`)
- Bug in :func:`crosstab` when inputs are are categorical Series, there are categories that are not present in one or both of the Series, and ``margins=True``. Previously the margin value for missing categories was ``NaN``. It is now correctly reported as 0 (:issue:`43505`)
10 changes: 7 additions & 3 deletions pandas/core/frame.py
@@ -3400,9 +3400,13 @@ def transpose(self, *args, copy: bool = False) -> DataFrame:

else:
new_arr = self.values.T
if copy:
new_arr = new_arr.copy()
result = self._constructor(new_arr, index=self.columns, columns=self.index)
common_dtype = find_common_type(dtypes) if len(dtypes) > 0 else None
result = self._constructor(
Contributor

Pass copy to the constructor (and are you testing this?).

new_arr,
index=self.columns,
columns=self.index,
dtype=common_dtype,
)

return result.__finalize__(self, method="transpose")
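
A minimal standalone sketch of the approach in this hunk, assuming the intent is to compute one common dtype from the column dtypes and hand it to the constructor. `transpose_with_common_dtype` is a hypothetical helper written for illustration, not pandas code; `find_common_type` is imported from the internal module `pandas.core.dtypes.cast`, which is not public API and may change. Passing `copy` through to the constructor follows the review suggestion above.

```python
import pandas as pd
from pandas.core.dtypes.cast import find_common_type  # internal helper, not public API


def transpose_with_common_dtype(df: pd.DataFrame, copy: bool = False) -> pd.DataFrame:
    """Hypothetical helper mirroring the diff: transpose, then cast to a common dtype."""
    dtypes = list(df.dtypes)
    # find_common_type raises on an empty list, hence the length guard.
    common_dtype = find_common_type(dtypes) if len(dtypes) > 0 else None
    # For mixed dtypes, df.values is an object ndarray; the explicit dtype
    # lets the constructor cast each transposed column back.
    return pd.DataFrame(
        df.values.T,
        index=df.columns,
        columns=df.index,
        dtype=common_dtype,
        copy=copy,
    )


df = pd.DataFrame({"a": [1], "b": [2]}).astype({"b": "Int64"})
result = transpose_with_common_dtype(df)
print(result.dtypes)
```

When no common dtype exists other than `object` (for example an integer column next to an object column), `find_common_type` returns `object` and the result is unchanged from the current behaviour.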

9 changes: 9 additions & 0 deletions pandas/tests/frame/methods/test_transpose.py
@@ -37,6 +37,7 @@ def test_transpose_tzaware_2col_mixed_tz(self):
df4 = DataFrame({"A": dti, "B": dti2})
assert (df4.dtypes == [dti.dtype, dti2.dtype]).all()
assert (df4.T.dtypes == object).all()
print(df4._can_fast_transpose, df4.T._can_fast_transpose)
tm.assert_frame_equal(df4.T.T, df4)

@pytest.mark.parametrize("tz", [None, "America/New_York"])
@@ -57,6 +58,7 @@ def test_transpose_object_to_tzaware_mixed_tz(self):
df2 = DataFrame([dti, dti2])
assert (df2.dtypes == object).all()
res2 = df2.T
print("\n", res2.dtypes, [dti.dtype, dti2.dtype])
Contributor

remove prints

Member Author

Removed the prints.
These are the only two tests failing (though not locally; I was trying to figure out why).

assert (res2.dtypes == [dti.dtype, dti2.dtype]).all()

def test_transpose_uint64(self, uint64_frame):
@@ -103,3 +105,10 @@ def test_transpose_get_view_dt64tzget_view(self):

rtrip = result._mgr.blocks[0].values
assert np.shares_memory(arr._data, rtrip._data)

def test_transpose_mixed_dtypes(self):
Contributor

Test the copy keyword. Test an example that keeps object dtype (e.g. (1, 1.5) and ('a', 1) for (a, b)), and test with simple dtypes as well (e.g. int64).

# GH#43337
df = DataFrame({"a": [1], "b": [2]}).astype({"b": "Int64"})
result = df.T
expected = DataFrame([1, 2], index=["a", "b"], dtype="Int64")
tm.assert_frame_equal(result, expected)
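
The review asks for three more cases: the `copy` keyword, a frame that should stay `object`, and a simple dtype such as `int64`. They might be sketched as below. This is an illustration using only public pandas API and plain asserts, not the tests from this PR; the function names are hypothetical, and the cases are chosen so they do not depend on whether the mixed-dtype change is applied.

```python
import pandas as pd


def test_transpose_simple_dtype_is_preserved():
    # A homogeneous int64 frame should stay int64 and round-trip cleanly.
    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, dtype="int64")
    result = df.T
    assert all(dt == "int64" for dt in result.dtypes)
    pd.testing.assert_frame_equal(result.T, df)


def test_transpose_keeps_object_when_no_common_dtype():
    # Column "a" is float64 and column "b" mixes str and int, so the only
    # common dtype is object.
    df = pd.DataFrame({"a": [1, 1.5], "b": ["a", 1]})
    result = df.T
    assert all(dt == object for dt in result.dtypes)


def test_transpose_copy_is_independent():
    # Writing into the transposed copy must not change the original frame.
    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, dtype="int64")
    result = df.transpose(copy=True)
    result.iloc[0, 0] = 99
    assert df.iloc[0, 0] == 1


test_transpose_simple_dtype_is_preserved()
test_transpose_keeps_object_when_no_common_dtype()
test_transpose_copy_is_independent()
```

In the pandas test suite these would normally live on the existing test class and use `tm.assert_frame_equal`; plain functions are used here only to keep the sketch self-contained.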