BUG: casting dt64/td64 in DataFrame.reindex #39759

Merged 2 commits on Feb 12, 2021
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.3.0.rst
@@ -335,6 +335,7 @@ Indexing
- Bug in :meth:`DataFrame.__setitem__` raising ``ValueError`` when setting multiple values to duplicate columns (:issue:`15695`)
- Bug in :meth:`DataFrame.loc`, :meth:`Series.loc`, :meth:`DataFrame.__getitem__` and :meth:`Series.__getitem__` returning incorrect elements for non-monotonic :class:`DatetimeIndex` for string slices (:issue:`33146`)
- Bug in :meth:`DataFrame.reindex` and :meth:`Series.reindex` with timezone aware indexes raising ``TypeError`` for ``method="ffill"`` and ``method="bfill"`` and specified ``tolerance`` (:issue:`38566`)
- Bug in :meth:`DataFrame.reindex` with ``datetime64[ns]`` or ``timedelta64[ns]`` incorrectly casting to integers when the ``fill_value`` requires casting to object dtype (:issue:`39755`)
- Bug in :meth:`DataFrame.__setitem__` raising ``ValueError`` with empty :class:`DataFrame` and specified columns for string indexer and non empty :class:`DataFrame` to set (:issue:`38831`)
- Bug in :meth:`DataFrame.loc.__setitem__` raising ValueError when expanding unique column for :class:`DataFrame` with duplicate columns (:issue:`38521`)
- Bug in :meth:`DataFrame.iloc.__setitem__` and :meth:`DataFrame.loc.__setitem__` with mixed dtypes when setting with a dictionary value (:issue:`38335`)
3 changes: 3 additions & 0 deletions pandas/core/algorithms.py
@@ -1388,6 +1388,9 @@ def wrapper(arr, indexer, out, fill_value=np.nan):

def _convert_wrapper(f, conv_dtype):
    def wrapper(arr, indexer, out, fill_value=np.nan):
        if conv_dtype == object:
            # GH#39755 avoid casting dt64/td64 to integers
            arr = ensure_wrapped_if_datetimelike(arr)
        arr = arr.astype(conv_dtype)
        f(arr, indexer, out, fill_value=fill_value)
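
For readers unfamiliar with why the wrapping step matters, the following is a minimal sketch of the underlying behaviour, not code from this PR. It assumes that the public `pd.array` constructor is an acceptable stand-in for the private `ensure_wrapped_if_datetimelike` helper used in the patch; the variable names are illustrative only. A raw NumPy `datetime64[ns]` or `timedelta64[ns]` array cast to `object` yields plain integers (nanosecond counts), whereas the pandas-wrapped array yields `Timestamp` / `Timedelta` scalars.

```python
import numpy as np
import pandas as pd

# Raw NumPy arrays with nanosecond resolution.
raw_dt = np.array(["2016-01-01"], dtype="M8[ns]")
raw_td = np.array([1], dtype="m8[ns]")

# Casting the raw ndarray to object loses the datetime-like type:
# NumPy hands back nanosecond counts as Python ints.
raw_dt_obj = raw_dt.astype(object)
raw_td_obj = raw_td.astype(object)

# Assumption: pd.array is used here in place of the private helper
# ensure_wrapped_if_datetimelike. It wraps the ndarray in a pandas
# datetime-like extension array, whose astype(object) boxes each
# element as a pandas scalar instead of an int.
wrapped_dt_obj = pd.array(raw_dt).astype(object)
wrapped_td_obj = pd.array(raw_td).astype(object)

print(type(raw_dt_obj[0]).__name__, type(wrapped_dt_obj[0]).__name__)
print(type(raw_td_obj[0]).__name__, type(wrapped_td_obj[0]).__name__)
```

The patch applies the wrapping only when `conv_dtype` is `object`, so the other conversion paths are left unchanged.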

35 changes: 35 additions & 0 deletions pandas/tests/frame/methods/test_reindex.py
@@ -933,3 +933,38 @@ def test_reindex_empty(self, src_idx, cat_idx):
        result = df.reindex(columns=cat_idx)
        expected = DataFrame(index=["K"], columns=cat_idx, dtype="f8")
        tm.assert_frame_equal(result, expected)

    @pytest.mark.parametrize("dtype", ["m8[ns]", "M8[ns]"])
    def test_reindex_datetimelike_to_object(self, dtype):
        # GH#39755 dont cast dt64/td64 to ints
        mi = MultiIndex.from_product([list("ABCDE"), range(2)])

        dti = date_range("2016-01-01", periods=10)
        fv = np.timedelta64("NaT", "ns")
        if dtype == "m8[ns]":
            dti = dti - dti[0]
            fv = np.datetime64("NaT", "ns")

        ser = Series(dti, index=mi)
        ser[::3] = pd.NaT

        df = ser.unstack()

        index = df.index.append(Index([1]))
        columns = df.columns.append(Index(["foo"]))

        res = df.reindex(index=index, columns=columns, fill_value=fv)

        expected = DataFrame(
            {
                0: df[0].tolist() + [fv],
                1: df[1].tolist() + [fv],
                "foo": np.array(["NaT"] * 6, dtype=fv.dtype),
            },
            index=index,
        )
        assert (res.dtypes[[0, 1]] == object).all()
        assert res.iloc[0, 0] is pd.NaT
        assert res.iloc[-1, 0] is fv
        assert res.iloc[-1, 1] is fv
        tm.assert_frame_equal(res, expected)