Skip to content

API/DEPR: Remove +/- as setops for DatetimeIndex/PeriodIndex (GH9630) #14164

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 22 additions & 3 deletions doc/source/whatsnew/v0.19.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -932,14 +932,16 @@ New Behavior:
Index ``+`` / ``-`` no longer used for set operations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Addition and subtraction of the base Index type (not the numeric subclasses)
Addition and subtraction of the base Index type and of DatetimeIndex
(not the numeric index types)
previously performed set operations (set union and difference). This
behaviour was already deprecated since 0.15.0 (in favor using the specific
``.union()`` and ``.difference()`` methods), and is now disabled. When
possible, ``+`` and ``-`` are now used for element-wise operations, for
example for concatenating strings (:issue:`8227`, :issue:`14127`).
example for concatenating strings or subtracting datetimes
(:issue:`8227`, :issue:`14127`).

Previous Behavior:
Previous behavior:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to standardize here, I see @jorisvandenbossche and @sinhrks changing these!

IIRC I think we do more of Previous Behavior (e.g. capitalized), and american spelling (no Behaviour). But should just pick one :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, will do that in #14176

We indeed need to just make a choice between American or British :-)
(is capitalizing all words also an American habit? :-))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 2 capital words sets it off s bit more
using American spelling

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while proofreading the whatsnew, I was thinking of putting the 'New/Old Behavior' in bold, that also sets it off a bit more

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even better


.. code-block:: ipython

Expand All @@ -962,6 +964,23 @@ For example, the behaviour of adding two integer Indexes:

is unchanged. The base ``Index`` is now made consistent with this behaviour.

Further, because of this change, it is now possible to subtract two
DatetimeIndex objects resulting in a TimedeltaIndex:

Previous behavior:

.. code-block:: ipython

In [1]: pd.DatetimeIndex(['2016-01-01', '2016-01-02']) - pd.DatetimeIndex(['2016-01-02', '2016-01-03'])
FutureWarning: using '-' to provide set differences with datetimelike Indexes is deprecated, use .difference()
Out[1]: DatetimeIndex(['2016-01-01'], dtype='datetime64[ns]', freq=None)

New behavior:

.. ipython:: python

pd.DatetimeIndex(['2016-01-01', '2016-01-02']) - pd.DatetimeIndex(['2016-01-02', '2016-01-03'])


.. _whatsnew_0190.api.difference:

Expand Down
12 changes: 4 additions & 8 deletions pandas/tseries/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -628,10 +628,9 @@ def __add__(self, other):
raise TypeError("cannot add TimedeltaIndex and {typ}"
.format(typ=type(other)))
elif isinstance(other, Index):
warnings.warn("using '+' to provide set union with "
"datetimelike Indexes is deprecated, "
"use .union()", FutureWarning, stacklevel=2)
return self.union(other)
raise TypeError("cannot add {typ1} and {typ2}"
.format(typ1=type(self).__name__,
typ2=type(other).__name__))
elif isinstance(other, (DateOffset, timedelta, np.timedelta64,
tslib.Timedelta)):
return self._add_delta(other)
Expand All @@ -656,10 +655,7 @@ def __sub__(self, other):
.format(typ=type(other)))
return self._add_delta(-other)
elif isinstance(other, Index):
warnings.warn("using '-' to provide set differences with "
"datetimelike Indexes is deprecated, "
"use .difference()", FutureWarning, stacklevel=2)
return self.difference(other)
return self._sub_datelike(other)
elif isinstance(other, (DateOffset, timedelta, np.timedelta64,
tslib.Timedelta)):
return self._add_delta(-other)
Expand Down
40 changes: 30 additions & 10 deletions pandas/tseries/index.py
Original file line number Diff line number Diff line change
Expand Up @@ -731,19 +731,39 @@ def _add_datelike(self, other):
def _sub_datelike(self, other):
# subtract a datetime from myself, yielding a TimedeltaIndex
from pandas import TimedeltaIndex
other = Timestamp(other)
if other is tslib.NaT:
result = self._nat_new(box=False)
# require tz compat
elif not self._has_same_tz(other):
raise TypeError("Timestamp subtraction must have the same "
"timezones or no timezones")
if isinstance(other, DatetimeIndex):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this condition move to base?

Copy link
Member Author

@jorisvandenbossche jorisvandenbossche Sep 7, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the condition to base, but left it here as well (to follow the same pattern as for the other ops: one method in base (eg _add_delta) which in the subclass handles both scalars as index). OK?

# require tz compat
if not self._has_same_tz(other):
raise TypeError("DatetimeIndex subtraction must have the same "
"timezones or no timezones")
result = self._sub_datelike_dti(other)
elif isinstance(other, (tslib.Timestamp, datetime)):
other = Timestamp(other)
if other is tslib.NaT:
result = self._nat_new(box=False)
# require tz compat
elif not self._has_same_tz(other):
raise TypeError("Timestamp subtraction must have the same "
"timezones or no timezones")
else:
i8 = self.asi8
result = i8 - other.value
result = self._maybe_mask_results(result, fill_value=tslib.iNaT)
else:
i8 = self.asi8
result = i8 - other.value
result = self._maybe_mask_results(result, fill_value=tslib.iNaT)
raise TypeError("cannot subtract DatetimeIndex and {typ}"
.format(typ=type(other).__name__))
return TimedeltaIndex(result, name=self.name, copy=False)

def _sub_datelike_dti(self, other):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine. but just want to check that our naming convention is for these types of routines is consistent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for adding a Timedelta-like vs TimedeltaIndex, there are the method _add_delta_td and _add_delta_tdi which are called from _add_delta, so copied that naming convention

"""subtraction of two DatetimeIndexes"""
self_i8 = self.asi8
other_i8 = other.asi8
new_values = self_i8 - other_i8
if self.hasnans or other.hasnans:
mask = (self._isnan) | (other._isnan)
new_values[mask] = tslib.iNaT
return new_values.view('i8')

def _maybe_update_attributes(self, attrs):
""" Update Index attributes (e.g. freq) depending on op """
freq = attrs.get('freq', None)
Expand Down
149 changes: 88 additions & 61 deletions pandas/tseries/tests/test_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -360,7 +360,7 @@ def test_resolution(self):
tz=tz)
self.assertEqual(idx.resolution, expected)

def test_add_iadd(self):
def test_union(self):
for tz in self.tz:
# union
rng1 = pd.date_range('1/1/2000', freq='D', periods=5, tz=tz)
Expand All @@ -378,17 +378,34 @@ def test_add_iadd(self):
for rng, other, expected in [(rng1, other1, expected1),
(rng2, other2, expected2),
(rng3, other3, expected3)]:
# GH9094
with tm.assert_produces_warning(FutureWarning):
result_add = rng + other
result_union = rng.union(other)

tm.assert_index_equal(result_add, expected)
result_union = rng.union(other)
tm.assert_index_equal(result_union, expected)
# GH9094
with tm.assert_produces_warning(FutureWarning):

def test_add_iadd(self):
for tz in self.tz:
rng1 = pd.date_range('1/1/2000', freq='D', periods=5, tz=tz)
other1 = pd.date_range('1/6/2000', freq='D', periods=5, tz=tz)
expected1 = pd.date_range('1/1/2000', freq='D', periods=10, tz=tz)

rng2 = pd.date_range('1/1/2000', freq='D', periods=5, tz=tz)
other2 = pd.date_range('1/4/2000', freq='D', periods=5, tz=tz)
expected2 = pd.date_range('1/1/2000', freq='D', periods=8, tz=tz)

rng3 = pd.date_range('1/1/2000', freq='D', periods=5, tz=tz)
other3 = pd.DatetimeIndex([], tz=tz)
expected3 = pd.date_range('1/1/2000', freq='D', periods=5, tz=tz)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test subtracting different lengths (ValueError) IIRC; this maybe already tested in another section though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mean this for below (dti-dti)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a test for that (and added code to give this a nicer error message)

for rng, other, expected in [(rng1, other1, expected1),
(rng2, other2, expected2),
(rng3, other3, expected3)]:
# previously performed setop (deprecated in 0.16.0), now
# raises TypeError (GH..)
with tm.assertRaises(TypeError):
rng + other

with tm.assertRaises(TypeError):
rng += other
tm.assert_index_equal(rng, expected)

# offset
offsets = [pd.offsets.Hour(2), timedelta(hours=2),
Expand Down Expand Up @@ -421,7 +438,26 @@ def test_add_iadd(self):
with tm.assertRaisesRegexp(TypeError, msg):
Timestamp('2011-01-01') + idx

def test_sub_isub(self):
def test_add_dti_dti(self):
# previously performed setop (deprecated in 0.16.0), now raises
# TypeError (GH..)

dti = date_range('20130101', periods=3)
dti_tz = date_range('20130101', periods=3).tz_localize('US/Eastern')

with tm.assertRaises(TypeError):
dti + dti

with tm.assertRaises(TypeError):
dti_tz + dti_tz

with tm.assertRaises(TypeError):
dti_tz + dti

with tm.assertRaises(TypeError):
dti + dti_tz

def test_diff(self):
for tz in self.tz:
# diff
rng1 = pd.date_range('1/1/2000', freq='D', periods=5, tz=tz)
Expand All @@ -439,9 +475,22 @@ def test_sub_isub(self):
for rng, other, expected in [(rng1, other1, expected1),
(rng2, other2, expected2),
(rng3, other3, expected3)]:
result_union = rng.difference(other)
result_diff = rng.difference(other)
tm.assert_index_equal(result_diff, expected)

tm.assert_index_equal(result_union, expected)
def test_sub_isub(self):
for tz in self.tz:
rng1 = pd.date_range('1/1/2000', freq='D', periods=5, tz=tz)
other1 = pd.date_range('1/6/2000', freq='D', periods=5, tz=tz)
expected1 = pd.date_range('1/1/2000', freq='D', periods=5, tz=tz)

rng2 = pd.date_range('1/1/2000', freq='D', periods=5, tz=tz)
other2 = pd.date_range('1/4/2000', freq='D', periods=5, tz=tz)
expected2 = pd.date_range('1/1/2000', freq='D', periods=3, tz=tz)

rng3 = pd.date_range('1/1/2000', freq='D', periods=5, tz=tz)
other3 = pd.DatetimeIndex([], tz=tz)
expected3 = pd.date_range('1/1/2000', freq='D', periods=5, tz=tz)

# offset
offsets = [pd.offsets.Hour(2), timedelta(hours=2),
Expand All @@ -466,6 +515,30 @@ def test_sub_isub(self):
rng -= 1
tm.assert_index_equal(rng, expected)

def test_sub_dti_dti(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we already have dti-scalar (both w and w/o tz) covered?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that is handled in the test_sub_isub test above

# previously performed setop (deprecated in 0.16.0), now changed to
# return subtraction -> TimeDeltaIndex (GH ...)

dti = date_range('20130101', periods=3)
dti_tz = date_range('20130101', periods=3).tz_localize('US/Eastern')
dti_tz2 = date_range('20130101', periods=3).tz_localize('UTC')
expected = TimedeltaIndex([0, 0, 0])

result = dti - dti
tm.assert_index_equal(result, expected)

result = dti_tz - dti_tz
tm.assert_index_equal(result, expected)

with tm.assertRaises(TypeError):
dti_tz - dti

with tm.assertRaises(TypeError):
dti - dti_tz

with tm.assertRaises(TypeError):
dti_tz - dti_tz2

def test_sub_period(self):
# GH 13078
# not supported, check TypeError
Expand Down Expand Up @@ -1239,50 +1312,6 @@ def _check(result, expected):
['20121231', '20130101', '20130102'], tz='US/Eastern')
tm.assert_index_equal(result, expected)

def test_dti_dti_deprecated_ops(self):

# deprecated in 0.16.0 (GH9094)
# change to return subtraction -> TimeDeltaIndex in 0.17.0
# shoudl move to the appropriate sections above

dti = date_range('20130101', periods=3)
dti_tz = date_range('20130101', periods=3).tz_localize('US/Eastern')

with tm.assert_produces_warning(FutureWarning):
result = dti - dti
expected = Index([])
tm.assert_index_equal(result, expected)

with tm.assert_produces_warning(FutureWarning):
result = dti + dti
expected = dti
tm.assert_index_equal(result, expected)

with tm.assert_produces_warning(FutureWarning):
result = dti_tz - dti_tz
expected = Index([])
tm.assert_index_equal(result, expected)

with tm.assert_produces_warning(FutureWarning):
result = dti_tz + dti_tz
expected = dti_tz
tm.assert_index_equal(result, expected)

with tm.assert_produces_warning(FutureWarning):
result = dti_tz - dti
expected = dti_tz
tm.assert_index_equal(result, expected)

with tm.assert_produces_warning(FutureWarning):
result = dti - dti_tz
expected = dti
tm.assert_index_equal(result, expected)

with tm.assert_produces_warning(FutureWarning):
self.assertRaises(TypeError, lambda: dti_tz + dti)
with tm.assert_produces_warning(FutureWarning):
self.assertRaises(TypeError, lambda: dti + dti_tz)

def test_dti_tdi_numeric_ops(self):

# These are normally union/diff set-like ops
Expand Down Expand Up @@ -2053,19 +2082,17 @@ def test_add_iadd(self):
(rng7, other7, expected7)]:

# GH9094
with tm.assert_produces_warning(FutureWarning):
result_add = rng + other
with tm.assertRaises(TypeError):
rng + other

result_union = rng.union(other)

tm.assert_index_equal(result_add, expected)
tm.assert_index_equal(result_union, expected)

# GH 6527
# GH9094
with tm.assert_produces_warning(FutureWarning):
with tm.assertRaises(TypeError):
rng += other
tm.assert_index_equal(rng, expected)

# offset
# DateOffset
Expand Down