-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: fix tz-aware DatetimeIndex + TimedeltaIndex (#17558) #17583
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,6 +13,7 @@ | |
is_integer, is_float, | ||
is_bool_dtype, _ensure_int64, | ||
is_scalar, is_dtype_equal, | ||
is_timedelta64_dtype, is_integer_dtype, | ||
is_list_like) | ||
from pandas.core.dtypes.generic import ( | ||
ABCIndex, ABCSeries, | ||
|
@@ -651,6 +652,15 @@ def __add__(self, other): | |
raise TypeError("cannot add {typ1} and {typ2}" | ||
.format(typ1=type(self).__name__, | ||
typ2=type(other).__name__)) | ||
elif isinstance(other, np.ndarray): | ||
if is_timedelta64_dtype(other): | ||
return self._add_delta(TimedeltaIndex(other)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you can only accept timedelta64 here, integer is not valid either fro datetimelikes in add There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As noted in another comment above, the integer case is being used by internals for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. no we don't allow this user facting; this is ONLY an internal operation There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, but I don't quite understand what you're getting at -- are you saying there's still something to be done here? I just added the integer branch to prevent aforementioned internally used operation from breaking. (since it returns |
||
elif is_integer_dtype(other): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think instead of this, we want to specifically define add in PeriodIndex which handles this case and calls super otherwise. |
||
return NotImplemented | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we have a test that hits this branch? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes but indirectly, without this branch it was breaking a couple of other tests that added a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Direct test is certainly appreciated. Go for it! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay, I looked into the cases getting triggered with |
||
else: | ||
raise TypeError("cannot add {typ1} and np.ndarray[{typ2}]" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same question as with the |
||
.format(typ1=type(self).__name__, | ||
typ2=other.dtype)) | ||
elif isinstance(other, (DateOffset, timedelta, np.timedelta64, | ||
Timedelta)): | ||
return self._add_delta(other) | ||
|
@@ -731,7 +741,7 @@ def _add_delta_tdi(self, other): | |
if self.hasnans or other.hasnans: | ||
mask = (self._isnan) | (other._isnan) | ||
new_values[mask] = iNaT | ||
return new_values.view(self.dtype) | ||
return new_values.view('i8') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. leave this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This fixes adding tz-aware There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes use is_datetime64tz_dtype(self.dtype) is the correct idiom. but you'll have to demonstrate this bug specifically There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Posted stack trace that hits this line above. So I can change this to if is_datetime64tz_dtype(self.dtype):
return new_values.view('i8')
return new_values.view(self.dtype) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is the not the right fix. tz-aware need to be turned into local i8, added, then turned back into the original timezone via localization. normally we do add i8 directly, BUT you cannot do this with tz-aware, e.g. the days are shifted otherwise (imagine adding days), or don't work when you cross DST. pandas.tseries.offsets already does all of this. I haven't stepped thru this code lately, but need to hit a similar path to that. |
||
|
||
def isin(self, values): | ||
""" | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -460,6 +460,20 @@ def test_add_dti_dti(self): | |
with pytest.raises(TypeError): | ||
dti + dti_tz | ||
|
||
def test_add_dti_ndarray(self): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. add a test which raises for numpy arrays of incorrect type (e.g. not m8[ns]); |
||
# GH 17558 | ||
# Check that tz-aware DatetimeIndex + np.array(dtype="timedelta64") | ||
# and DatetimeIndex + TimedeltaIndex work as expected | ||
dti = pd.DatetimeIndex([pd.Timestamp("2017/01/01")]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you can move this entire test to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we should prob move more of these add tests there as well, but can do that in another PR |
||
dti = dti.tz_localize('US/Eastern') | ||
expected = pd.DatetimeIndex([pd.Timestamp("2017/01/01 01:00")]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit : add newline above this one. |
||
expected = expected.tz_localize('US/Eastern') | ||
|
||
td_np = np.array([np.timedelta64(1, 'h')], dtype="timedelta64[ns]") | ||
tm.assert_index_equal(dti + td_np, expected) | ||
tm.assert_index_equal(dti + td_np[0], expected) | ||
tm.assert_index_equal(dti + TimedeltaIndex(td_np), expected) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we just for-loop these checks instead? Also, let's be a little more explicit and write it this way: actual = dti + ...
tm.assert_index_equal(actual, expected) |
||
|
||
def test_difference(self): | ||
for tz in self.tz: | ||
# diff | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not correct; the bug is only with a numpy array of
dtype=timedelta64[ns]
(timedelta64 is not a valid dtype for pandas)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "or" was a bit confusing, but there are two separate bugs:
DatetimeIndex
withTimedeltaIndex
DatetimeIndex
with numpy array which is timedelta64-dtype-likeThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how so? where does DTI + TDI tz-aware not work? your issues is w.r.t. np.array of timedeltas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mentioned it briefly in the issue ticket, although I neglected to post the full stack trace: