-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
implement Timedelta mod, divmod, rmod, rdivmod, fix and test scalar methods #19365
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
4688064
baf4d6b
a2b1ac7
8224871
9558dc9
f838cc9
cd421db
ed99f50
1ace838
0de3bd4
acff328
73fd6dc
4cbb2e1
08fa8fd
03b7e17
120d61f
c1fbdc9
b2995c9
c22dc47
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -492,7 +492,14 @@ def _binary_op_method_timedeltalike(op, name): | |
if other.dtype.kind not in ['m', 'M']: | ||
# raise rathering than letting numpy return wrong answer | ||
return NotImplemented | ||
return op(self.to_timedelta64(), other) | ||
result = op(self.to_timedelta64(), other) | ||
if other.ndim == 0: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wouldn't be cleaner to do There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that would be about a wash because we also need to handle the case where other is a 0-dim datetime64 array. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Isn't that already catched by the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. And from this it seems it is indeed already working correctly:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yah there's some ambiguity there. The 0-dim case to watch out for is (under master, fixed in the PR):
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I could see a reasonable argument that the zero-dim array op should return a zero-dim array (i.e. not a Timedelta like the PR changes it to), but given that it returns a scalar, I think the consistent thing to do is always return a Timestamp/Timedelta.
As long as we a) aren't making spaghetti and b) are testing these new corner cases, I don't see much downside to fixing these. If there were some kind of LOC budget I'd agree with you that this would be a pretty low priority. (and if it will get this merged I'll revert this part, since this is blocking fixes to Index bugs) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Can you explain why this part would be reverted again, but is needed to be first merged? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Poor use of pronouns on my part. "This part" refers to the zero-dimensional arrays, and is not the bug that originally motivated this PR, but was found in the process of writing tests for this PR. "this is blocking" referred to the divmod/mod and 1-d array ops, which I need to have here before I can fix e.g. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't see any test you added that would catch this 0-dim array case. Would it be ok to just leave that out of this PR, and only fix the numpy timedelta64 scalar case? We can discuss how many lines of code are added and whether that is worth it, but checking the result type and the potentially converting the result, does add to the code complexity. I would rather add a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm amenable to this suggestion, will update in a bit. FYI |
||
if other.dtype.kind == 'm': | ||
return Timedelta(result) | ||
if other.dtype.kind == 'M': | ||
from ..tslib import Timestamp | ||
return Timestamp(result) | ||
return result | ||
|
||
elif not _validate_ops_compat(other): | ||
return NotImplemented | ||
|
@@ -1046,7 +1053,10 @@ class Timedelta(_Timedelta): | |
def __mul__(self, other): | ||
if hasattr(other, 'dtype'): | ||
# ndarray-like | ||
return other * self.to_timedelta64() | ||
result = other * self.to_timedelta64() | ||
if other.ndim == 0: | ||
return Timedelta(result) | ||
return result | ||
|
||
elif other is NaT: | ||
return NaT | ||
|
@@ -1061,7 +1071,10 @@ class Timedelta(_Timedelta): | |
|
||
def __truediv__(self, other): | ||
if hasattr(other, 'dtype'): | ||
return self.to_timedelta64() / other | ||
result = self.to_timedelta64() / other | ||
if other.ndim == 0 and result.dtype.kind == 'm': | ||
return Timedelta(result) | ||
return result | ||
|
||
elif is_integer_object(other) or is_float_object(other): | ||
# integers or floats | ||
|
@@ -1077,7 +1090,10 @@ class Timedelta(_Timedelta): | |
|
||
def __rtruediv__(self, other): | ||
if hasattr(other, 'dtype'): | ||
return other / self.to_timedelta64() | ||
result = other / self.to_timedelta64() | ||
if other.ndim == 0 and result.dtype.kind == 'm': | ||
return Timedelta(result) | ||
return result | ||
|
||
elif not _validate_ops_compat(other): | ||
return NotImplemented | ||
|
@@ -1096,6 +1112,9 @@ class Timedelta(_Timedelta): | |
# just defer | ||
if hasattr(other, '_typ'): | ||
# Series, DataFrame, ... | ||
if other._typ == 'dateoffset' and hasattr(other, 'delta'): | ||
# Tick offset | ||
return self // other.delta | ||
return NotImplemented | ||
|
||
if hasattr(other, 'dtype'): | ||
|
@@ -1128,6 +1147,9 @@ class Timedelta(_Timedelta): | |
# just defer | ||
if hasattr(other, '_typ'): | ||
# Series, DataFrame, ... | ||
if other._typ == 'dateoffset' and hasattr(other, 'delta'): | ||
# Tick offset | ||
return other.delta // self | ||
return NotImplemented | ||
|
||
if hasattr(other, 'dtype'): | ||
|
@@ -1149,6 +1171,23 @@ class Timedelta(_Timedelta): | |
return np.nan | ||
return other.value // self.value | ||
|
||
def __mod__(self, other): | ||
# Naive implementation, room for optimization | ||
return self.__divmod__(other)[1] | ||
|
||
def __rmod__(self, other): | ||
# Naive implementation, room for optimization | ||
return self.__rdivmod__(other)[1] | ||
|
||
def __divmod__(self, other): | ||
# Naive implementation, room for optimization | ||
div = self // other | ||
return div, self - div * other | ||
|
||
def __rdivmod__(self, other): | ||
div = other // self | ||
return div, other - div * self | ||
|
||
|
||
cdef _floordiv(int64_t value, right): | ||
return value // right | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,7 +2,7 @@ | |
import pytest | ||
|
||
import numpy as np | ||
from datetime import timedelta | ||
from datetime import datetime, timedelta | ||
|
||
import pandas as pd | ||
import pandas.util.testing as tm | ||
|
@@ -128,6 +128,59 @@ def test_unary_ops(self): | |
assert abs(-td) == td | ||
assert abs(-td) == Timedelta('10d') | ||
|
||
def test_mul(self): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Reference your PR number under all of these added tests. |
||
# GH#19365 | ||
td = Timedelta(minutes=3) | ||
|
||
result = td * 2 | ||
assert result == Timedelta(minutes=6) | ||
|
||
result = td * np.int64(1) | ||
assert isinstance(result, Timedelta) | ||
assert result == td | ||
|
||
result = td * 1.5 | ||
assert result == Timedelta(minutes=4, seconds=30) | ||
|
||
result = td * np.array([3, 4], dtype='int64') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understand why this is, but I don't really like it. It would be much more friendly to return a TDI, maybe too magical though There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yah I considered that too, came to the same conclusion. It also would be troublesome if the other array wasn't 1D. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. is it easy to just return a TDI? this is very unituitive There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. so you haven't answered my question here. this needs to return a TDI or raise TypeError. return a m8 array is simply not useful. |
||
expected = np.array([9, 12], dtype='m8[m]').astype('m8[ns]') | ||
tm.assert_numpy_array_equal(result, expected) | ||
|
||
with pytest.raises(TypeError): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What's the error message? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. add a comment on why this is prohibiited, any other ops not allowed? |
||
td * pd.Timestamp(2016, 1, 2) | ||
|
||
def test_add_datetimelike(self): | ||
# GH#19365 | ||
td = Timedelta(10, unit='d') | ||
|
||
result = td + datetime(2016, 1, 1) | ||
assert result == pd.Timestamp(2016, 1, 11) | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I KNOW we have tests for this, is there a reason you are duplicating? (rather than moving those to here)? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. test_ops_scalar. I have not problem adding tests, but please don't duplicate. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
ATM there is NO test for |
||
result = td + pd.Timestamp('2018-01-12 18:09') | ||
assert result == pd.Timestamp('2018-01-22 18:09') | ||
|
||
result = td + np.datetime64('2018-01-12') | ||
assert result == pd.Timestamp('2018-01-22') | ||
|
||
def test_add_timedeltalike(self): | ||
td = Timedelta(10, unit='d') | ||
|
||
result = td + Timedelta(days=10) | ||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(days=20) | ||
|
||
result = td + timedelta(days=9) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. can you test for radd as well here |
||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(days=19) | ||
|
||
result = td + pd.offsets.Hour(6) | ||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(days=10, hours=6) | ||
|
||
result = td + np.timedelta64(-4, 'D') | ||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(days=6) | ||
|
||
def test_binary_ops_nat(self): | ||
td = Timedelta(10, unit='d') | ||
|
||
|
@@ -138,6 +191,10 @@ def test_binary_ops_nat(self): | |
assert (td // pd.NaT) is np.nan | ||
assert (td // np.timedelta64('NaT')) is np.nan | ||
|
||
# GH#19365 | ||
assert td - np.timedelta64('NaT') is pd.NaT | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. test of the reverse? |
||
assert td + np.timedelta64('NaT') is pd.NaT | ||
|
||
def test_binary_ops_integers(self): | ||
td = Timedelta(10, unit='d') | ||
|
||
|
@@ -173,6 +230,10 @@ def test_floordiv(self): | |
assert -td // scalar.to_pytimedelta() == -2 | ||
assert (2 * td) // scalar.to_timedelta64() == 2 | ||
|
||
# GH#19365 | ||
assert td // pd.offsets.Hour(1) == 3 | ||
assert td // pd.offsets.Minute(2) == 92 | ||
|
||
assert td // np.nan is pd.NaT | ||
assert np.isnan(td // pd.NaT) | ||
assert np.isnan(td // np.timedelta64('NaT')) | ||
|
@@ -218,6 +279,8 @@ def test_rfloordiv(self): | |
assert (-td).__rfloordiv__(scalar.to_pytimedelta()) == -2 | ||
assert (2 * td).__rfloordiv__(scalar.to_timedelta64()) == 0 | ||
|
||
assert pd.offsets.Hour(1) // Timedelta(minutes=25) == 2 | ||
|
||
assert np.isnan(td.__rfloordiv__(pd.NaT)) | ||
assert np.isnan(td.__rfloordiv__(np.timedelta64('NaT'))) | ||
|
||
|
@@ -255,6 +318,125 @@ def test_rfloordiv(self): | |
with pytest.raises(TypeError): | ||
ser // td | ||
|
||
def test_mod(self): | ||
# GH#19365 | ||
td = Timedelta(hours=37) | ||
|
||
# Timedelta-like others | ||
result = td % Timedelta(hours=6) | ||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(hours=1) | ||
|
||
result = td % timedelta(minutes=60) | ||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(0) | ||
|
||
result = td % pd.offsets.Hour(5) | ||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(hours=2) | ||
|
||
result = td % np.timedelta64(2, 'h') | ||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(hours=1) | ||
|
||
result = td % NaT | ||
assert result is NaT | ||
|
||
result = td % np.timedelta64('NaT') | ||
assert result is NaT | ||
|
||
# Numeric Others | ||
result = td % 2 | ||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(0) | ||
|
||
result = td % 1e12 | ||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(minutes=3, seconds=20) | ||
|
||
result = td % int(1e12) | ||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(minutes=3, seconds=20) | ||
|
||
# Invalid Others | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. can you make the error conditions separate and test all of the reverse failures e.g. |
||
with pytest.raises(TypeError): | ||
td % pd.Timestamp('2018-01-22') | ||
|
||
# Array-like others | ||
result = td % np.array([6, 5], dtype='timedelta64[h]') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. see my comment above There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same comment |
||
expected = np.array([1, 2], dtype='timedelta64[h]').astype('m8[ns]') | ||
tm.assert_numpy_array_equal(result, expected) | ||
|
||
result = td % pd.TimedeltaIndex(['6H', '5H']) | ||
expected = pd.TimedeltaIndex(['1H', '2H']) | ||
tm.assert_index_equal(result, expected) | ||
|
||
result = td % np.array([2, int(1e12)], dtype='i8') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. here as well |
||
expected = np.array([0, Timedelta(minutes=3, seconds=20).value], | ||
dtype='m8[ns]') | ||
tm.assert_numpy_array_equal(result, expected) | ||
|
||
def test_rmod(self): | ||
# GH#19365 | ||
td = Timedelta(minutes=3) | ||
|
||
result = timedelta(minutes=4) % td | ||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(minutes=1) | ||
|
||
result = np.timedelta64(5, 'm') % td | ||
assert isinstance(result, Timedelta) | ||
assert result == Timedelta(minutes=2) | ||
|
||
result = np.array([5, 6], dtype='m8[m]') % td | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. see my comment above. is there a reason you are not testing TDI? (generally whenever you are testing array-like)? The array-like numerics should be with the TDI tests (if they are not in this module). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This ties in to your comments about duplication. In general when we have A.op(B) and B.rop(A) we can either put this test in test_A, test_B, or both. My preference is to order these the same way as NotImplemented behavior goes. Timedelta doesn't know anything about TDI, so test_timedelta doesn't either. (By analogy, Series doesn't know anything about DataFrame...) |
||
expected = np.array([2, 0], dtype='m8[m]').astype('m8[ns]') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. same here |
||
tm.assert_numpy_array_equal(result, expected) | ||
|
||
def test_divmod(self): | ||
# GH#19365 | ||
td = Timedelta(days=2, hours=6) | ||
|
||
result = divmod(td, timedelta(days=1)) | ||
assert result[0] == 2 | ||
assert isinstance(result[1], Timedelta) | ||
assert result[1] == Timedelta(hours=6) | ||
|
||
result = divmod(td, pd.offsets.Hour(-4)) | ||
assert result[0] == -14 | ||
assert isinstance(result[1], Timedelta) | ||
assert result[1] == Timedelta(hours=-2) | ||
|
||
result = divmod(td, 54) | ||
assert result[0] == Timedelta(hours=1) | ||
assert isinstance(result[1], Timedelta) | ||
assert result[1] == Timedelta(0) | ||
|
||
result = divmod(td, 53 * 3600 * 1e9) | ||
assert result[0] == Timedelta(1, unit='ns') | ||
assert isinstance(result[1], Timedelta) | ||
assert result[1] == Timedelta(hours=1) | ||
|
||
assert result | ||
result = divmod(td, np.nan) | ||
assert result[0] is pd.NaT | ||
assert result[1] is pd.NaT | ||
|
||
result = divmod(td, pd.NaT) | ||
assert np.isnan(result[0]) | ||
assert result[1] is pd.NaT | ||
|
||
def test_rdivmod(self): | ||
# GH#19365 | ||
result = divmod(timedelta(days=2, hours=6), Timedelta(days=1)) | ||
assert result[0] == 2 | ||
assert isinstance(result[1], Timedelta) | ||
assert result[1] == Timedelta(hours=6) | ||
|
||
result = divmod(pd.offsets.Hour(54), Timedelta(hours=-4)) | ||
assert result[0] == -14 | ||
assert isinstance(result[1], Timedelta) | ||
assert result[1] == Timedelta(hours=-2) | ||
|
||
|
||
class TestTimedeltaComparison(object): | ||
def test_comparison_object_array(self): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make this in to a sub-section and give a couple of examples. also add a small example in the timedelta docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put it in api changes section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. For the ipython code blocks, do I copy/paste the traceback?