Skip to content

BUG: DatetimeIndex + arraylike of DateOffsets #18849

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Dec 29, 2017
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.22.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -354,3 +354,4 @@ Other

- Improved error message when attempting to use a Python keyword as an identifier in a ``numexpr`` backed query (:issue:`18221`)
- Bug in :class:`Timestamp` where comparison with an array of ``Timestamp`` objects would result in a ``RecursionError`` (:issue:`15183`)
- Bug in :class:`DatetimeIndex` where adding or subtracting an array-like of ``DateOffset`` objects either raised (``np.array``, ``pd.Index``) or broadcast incorrectly (``pd.Series``) (:issue:`18224`)
40 changes: 34 additions & 6 deletions pandas/core/indexes/datetimelike.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@
from pandas.core.dtypes.common import (
is_integer, is_float,
is_bool_dtype, _ensure_int64,
is_scalar, is_dtype_equal,
is_scalar, is_dtype_equal, is_offsetlike,
is_list_like, is_timedelta64_dtype)
from pandas.core.dtypes.generic import (
ABCIndex, ABCSeries,
Expand Down Expand Up @@ -650,6 +650,7 @@ def _add_datetimelike_methods(cls):
def __add__(self, other):
from pandas.core.index import Index
from pandas.core.indexes.timedeltas import TimedeltaIndex
from pandas.core.indexes.datetimes import DatetimeIndex
from pandas.tseries.offsets import DateOffset
if is_timedelta64_dtype(other):
return self._add_delta(other)
Expand All @@ -662,7 +663,21 @@ def __add__(self, other):
return self._add_delta(other)
elif is_integer(other):
return self.shift(other)
elif isinstance(other, (Index, datetime, np.datetime64)):
elif isinstance(other, (datetime, np.datetime64)):
return self._add_datelike(other)
elif (isinstance(self, DatetimeIndex) and is_offsetlike(other) and
not isinstance(other, ABCSeries)):
# Array of DateOffset objects
if len(other) == 1:
return self + other[0]
else:
from pandas.errors import PerformanceWarning
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so you are adding code here that already exists in datetimes.py:_add_offset ? is there a reason you are not dispatching to that (which may need to handle array-likes in addition to scalars). Further I am not in love with explicit type checking for DTI here. again why are you not handling that at a higher level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_add_offset is for a single offset which does not have an implementation of apply_index. This is for an array of offsets. It's a case handled by Series but not by DatetimeIndex.

I can make a _add_offset_array and dispatch to that.

warnings.warn("Adding/subtracting array of DateOffsets to "
"{} not vectorized".format(type(self)),
PerformanceWarning)
return self.astype('O') + np.array(other)
# FIXME: This works for __add__ but loses dtype in __sub__
elif isinstance(other, Index):
return self._add_datelike(other)
else: # pragma: no cover
return NotImplemented
Expand All @@ -683,10 +698,6 @@ def __sub__(self, other):
return self._add_delta(-other)
elif isinstance(other, DatetimeIndex):
return self._sub_datelike(other)
elif isinstance(other, Index):
raise TypeError("cannot subtract {typ1} and {typ2}"
.format(typ1=type(self).__name__,
typ2=type(other).__name__))
elif isinstance(other, (DateOffset, timedelta)):
return self._add_delta(-other)
elif is_integer(other):
Expand All @@ -695,6 +706,23 @@ def __sub__(self, other):
return self._sub_datelike(other)
elif isinstance(other, Period):
return self._sub_period(other)
elif (isinstance(self, DatetimeIndex) and is_offsetlike(other) and
not isinstance(other, ABCSeries)):
# Array of DateOffset objects
if len(other) == 1:
return self - other[0]
else:
from pandas.errors import PerformanceWarning
warnings.warn("Adding/subtracting array of DateOffsets to "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see that you added this yourself. Still, feel a little uneasy about this.

"{} not vectorized".format(type(self)),
PerformanceWarning)
res_values = self.astype('O').values - np.array(other)
return self.__class__(res_values, freq='infer')
elif isinstance(other, Index):
raise TypeError("cannot subtract {typ1} and {typ2}"
.format(typ1=type(self).__name__,
typ2=type(other).__name__))

else: # pragma: no cover
return NotImplemented
cls.__sub__ = __sub__
Expand Down
40 changes: 40 additions & 0 deletions pandas/tests/indexes/datetimes/test_arithmetic.py
Original file line number Diff line number Diff line change
Expand Up @@ -363,6 +363,46 @@ def test_datetimeindex_sub_timestamp_overflow(self):
with pytest.raises(OverflowError):
dtimin - variant

@pytest.mark.parametrize('box', [np.array, pd.Index])
def test_dti_add_offset_array(self, tz, box):
dti = pd.date_range('2017-01-01', periods=2, tz=tz)
# TODO: check that `name` propogates correctly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reference issue number below all of your new tests.

other = box([pd.offsets.MonthEnd(), pd.offsets.Day(n=2)])
res = dti + other
expected = DatetimeIndex([dti[n] + other[n] for n in range(len(dti))],
name=dti.name, freq='infer')
tm.assert_index_equal(res, expected)

res2 = other + dti
tm.assert_index_equal(res2, expected)

@pytest.mark.parametrize('box', [np.array, pd.Index])
def test_dti_sub_offset_array(self, tz, box):
dti = pd.date_range('2017-01-01', periods=2, tz=tz)
other = box([pd.offsets.MonthEnd(), pd.offsets.Day(n=2)])
res = dti - other
expected = DatetimeIndex([dti[n] - other[n] for n in range(len(dti))],
name=dti.name, freq='infer')
tm.assert_index_equal(res, expected)

def test_dti_with_offset_series(self, tz):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you parametrize this with name (None, same name, other name) to make sure they are propagated correctly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Looks like Series.op(Index) was always taking on the name of the Series.

dti = pd.date_range('2017-01-01', periods=2, tz=tz)
other = pd.Series([pd.offsets.MonthEnd(), pd.offsets.Day(n=2)],
name='foo')

expected_add = pd.Series([dti[n] + other[n] for n in range(len(dti))],
name='foo')
res = dti + other
tm.assert_series_equal(res, expected_add)
res2 = other + dti
tm.assert_series_equal(res2, expected_add)

expected_sub = pd.Series([dti[n] - other[n] for n in range(len(dti))],
name='foo')

res3 = dti - other
tm.assert_series_equal(res3, expected_sub)


# GH 10699
@pytest.mark.parametrize('klass,assert_func', zip([Series, DatetimeIndex],
Expand Down