BUG: appending a Timedelta to Series incorrectly casts to integer #27303

Merged 7 commits on Jul 10, 2019
Changes from 2 commits
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.25.0.rst
@@ -1050,7 +1050,7 @@ Indexing
 - Bug which produced ``AttributeError`` on partial matching :class:`Timestamp` in a :class:`MultiIndex` (:issue:`26944`)
 - Bug in :class:`Categorical` and :class:`CategoricalIndex` with :class:`Interval` values when using the ``in`` operator (``__contains``) with objects that are not comparable to the values in the ``Interval`` (:issue:`23705`)
 - Bug in :meth:`DataFrame.loc` and :meth:`DataFrame.iloc` on a :class:`DataFrame` with a single timezone-aware datetime64[ns] column incorrectly returning a scalar instead of a :class:`Series` (:issue:`27110`)
--
+- Bug in setting a new value in a :class:`Series` with a :class:`Timedelta` object incorrectly casting the value to an integer (:issue:`22717`)

 Missing
 ^^^^^^^
21 changes: 17 additions & 4 deletions pandas/core/indexing.py
@@ -19,6 +19,7 @@
     is_scalar,
     is_sequence,
     is_sparse,
+    is_timedelta64_dtype,
 )
 from pandas.core.dtypes.generic import ABCDataFrame, ABCSeries
 from pandas.core.dtypes.missing import _infer_fill_value, isna
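The newly imported helper drives the extra branch in `_setitem_with_indexer`. As a rough illustration of that check, the sketch below uses the public `pandas.api.types.is_timedelta64_dtype` instead of the internal import shown in the diff. The names `obj`, `new_values`, and `needs_object_path` are made up for the example and only stand in for the objects the real method works with.

```python
import pandas as pd
from pandas.api.types import is_timedelta64_dtype

# Hypothetical stand-ins for `self.obj` and `new_values` in the diff.
obj = pd.Series(["x"], dtype=object)
new_values = pd.Series([pd.Timedelta("9 days")]).to_numpy()

# The special case applies only when a timedelta64 value is being
# appended to a Series that is not itself timedelta64.
needs_object_path = is_timedelta64_dtype(new_values) and not is_timedelta64_dtype(obj)
print(needs_object_path)
```

When both sides are already timedelta64, the condition is false and the ordinary concatenation path is taken.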
@@ -429,11 +430,23 @@ def _setitem_with_indexer(self, indexer, value):
 # this preserves dtype of the value
 new_values = Series([value])._values
 if len(self.obj._values):
-    try:
-        new_values = np.concatenate([self.obj._values, new_values])
-    except TypeError:
+    if is_timedelta64_dtype(
+        new_values
+    ) and not is_timedelta64_dtype(self.obj):
+        # GH#22717 np.concatenate incorrect casts
+        # timedelta64 to integer
         as_obj = self.obj.astype(object)
-        new_values = np.concatenate([as_obj, new_values])
+        new_values = np.concatenate(
+            [as_obj, np.array([value], dtype=object)]
+        )
+    else:
+        try:
+            new_values = np.concatenate(
+                [self.obj._values, new_values]
+            )
+        except TypeError:
+            as_obj = self.obj.astype(object)
Member:

Was this hitting the except clause before? Not super familiar with this code but seems like some duplication we could ideally clean up / refactor

Member Author:

It was not raising in the relevant case, and the np.concatenate would cast the timedelta64 to int64.

> seems like some duplication we could ideally clean up / refactor

Yes, this method in particular is in dire need of a refactor. ATM I'm focused on Block._try_coerce_arg and Block._can_hold_element (see other open PRs) and I think/hope once some inconsistencies in those are ironed out, some of the special-casing here may be unnecessary.

+            new_values = np.concatenate([as_obj, new_values])
 self.obj._data = self.obj._constructor(
     new_values, index=new_index, name=self.obj.name
 )._data
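The cast described in the review thread can be reproduced with NumPy alone. The sketch below is only an illustration under simplified assumptions: it uses a bare object array and a `np.timedelta64` scalar in place of the Series values and the `pd.Timedelta` handled by the real method, and the variable names are invented for the example.

```python
import numpy as np

existing = np.array(["x"], dtype=object)
appended = np.array([np.timedelta64(9, "D")], dtype="timedelta64[ns]")

# No TypeError is raised here: the timedelta64[ns] array is converted to
# object dtype element by element, and nanosecond-resolution values come
# out as plain Python integers.
naive = np.concatenate([existing, appended])
print(type(naive[1]), naive[1])

# Wrapping the scalar in an object array first, as the new branch in the
# diff does with `value`, stores the scalar itself and avoids the cast.
scalar = appended[0]
safe = np.concatenate([existing, np.array([scalar], dtype=object)])
print(type(safe[1]), safe[1])
```

Because the concatenation succeeds silently, the pre-existing `except TypeError` fallback never had a chance to run in this case, which is why the fix needs an explicit dtype check.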
9 changes: 9 additions & 0 deletions pandas/tests/series/indexing/test_indexing.py
@@ -653,6 +653,15 @@ def test_timedelta_assignment():
     expected.loc[[1, 2, 3]] = pd.Timedelta(np.timedelta64(20, "m"))
     tm.assert_series_equal(s, expected)
Contributor:

should really put this in a test_timedelta.py (and take the existing tests out of test_indexing).

Member Author:

after the current batch of PRs I'm planning on doing a review of the indexing tests. There are multiple dimensions along which we can sort/parametrize, any of which would be reasonable, but my guess is we are not being consistent about it.


+    # GH#22717 inserting a Timedelta should _not_ cast to int64
Contributor:

new test pls & parameterize over timedelta & np.timedelta64 as well

+    ser = pd.Series(["x"])
+    ser["td"] = pd.Timedelta("9 days")
+    assert isinstance(ser["td"], pd.Timedelta)
Member:

Lost previous comment but can you use tm.assert_series_equal here? Also move to a separate test (test_timedelta_assignment_to_object?)

Member Author:

will do

Contributor:

can you compare vs the expected series instead


+    ser = pd.Series(["x"])
+    ser.loc["td"] = pd.Timedelta("9 days")
+    assert isinstance(ser["td"], pd.Timedelta)


 def test_underlying_data_conversion():
     # GH 4080
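The reviewers ask for a separate test that covers more than one timedelta type. One possible shape for such a check is sketched below. It is not the test that was merged: the helper name `check_append_keeps_timedelta` and the `CANDIDATES` list are hypothetical, and a plain loop is used so the snippet runs without a test runner. Under pytest the same cases would usually be supplied through `pytest.mark.parametrize`.

```python
import datetime

import numpy as np
import pandas as pd

# Three spellings of the same nine-day duration (illustrative choice).
CANDIDATES = [
    pd.Timedelta("9 days"),
    np.timedelta64(9, "D"),
    datetime.timedelta(days=9),
]


def check_append_keeps_timedelta(td):
    """Append `td` under a new label and return the stored element."""
    ser = pd.Series(["x"], dtype=object)
    # "td" is not in the index yet, so this is setting with enlargement.
    ser.loc["td"] = td
    stored = ser["td"]
    # The bug in GH#22717 left an integer nanosecond count here.
    assert not isinstance(stored, (int, np.integer))
    assert pd.Timedelta(stored) == pd.Timedelta("9 days")
    return stored


results = [check_append_keeps_timedelta(td) for td in CANDIDATES]
```

Comparing the whole Series against an expected one with `tm.assert_series_equal`, as suggested in the review, would additionally pin down the resulting index and dtype.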