Skip to content

Fix Timedelta.__floordiv__, __rfloordiv__ #18961

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Jan 7, 2018
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.23.0.txt
Original file line number Diff line number Diff line change
Expand Up @@ -385,6 +385,7 @@ Numeric
- Bug in :func:`Series.__sub__` subtracting a non-nanosecond ``np.datetime64`` object from a ``Series`` gave incorrect results (:issue:`7996`)
- Bug in :class:`DatetimeIndex`, :class:`TimedeltaIndex` addition and subtraction of zero-dimensional integer arrays gave incorrect results (:issue:`19012`)
- Bug in :func:`Series.__add__` adding Series with dtype ``timedelta64[ns]`` to a timezone-aware ``DatetimeIndex`` incorrectly dropped timezone information (:issue:`13905`)
- Bug in :func:`Timedelta.__floordiv__` and :func:`Timedelta.__rfloordiv__` dividing by many incompatible numpy objects was incorrectly allowed (:issue:`18846`)
-

Categorical
Expand Down
95 changes: 85 additions & 10 deletions pandas/_libs/tslibs/timedeltas.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -1031,13 +1031,28 @@ class Timedelta(_Timedelta):
__rdiv__ = __rtruediv__

def __floordiv__(self, other):
# numpy does not implement floordiv for timedelta64 dtype, so we cannot
# just defer
if hasattr(other, '_typ'):
# Series, DataFrame, ...
return NotImplemented

if hasattr(other, 'dtype'):
# work with i8
other = other.astype('m8[ns]').astype('i8')
return self.value // other
if other.dtype.kind == 'm':
# also timedelta-like
return _broadcast_floordiv_td64(self.value, other, _floordiv)
elif other.dtype.kind in ['i', 'u', 'f']:
if np.ndim(other) == 0:
return Timedelta(self.value // other)
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't need the else

also pls use other.ndim

return self.to_timedelta64() // other

else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just raise you don't need an else

raise TypeError('Invalid dtype {dtype} for '
'{op}'.format(dtype=other.dtype,
op='__floordiv__'))

elif is_integer_object(other):
# integers only
elif is_integer_object(other) or is_float_object(other):
return Timedelta(self.value // other, unit='ns')

elif not _validate_ops_compat(other):
Expand All @@ -1049,20 +1064,80 @@ class Timedelta(_Timedelta):
return self.value // other.value

def __rfloordiv__(self, other):
# numpy does not implement floordiv for timedelta64 dtype, so we cannot
# just defer
if hasattr(other, '_typ'):
# Series, DataFrame, ...
return NotImplemented

if hasattr(other, 'dtype'):
# work with i8
other = other.astype('m8[ns]').astype('i8')
return other // self.value
if other.dtype.kind == 'm':
# also timedelta-like
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iirc for division we defer to. TDI to handle his logic
does this code path follow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iirc for division we defer to. TDI to handle his logic

I would find that design choice surprising.

    def __truediv__(self, other):
        if hasattr(other, 'dtype'):
            return self.to_timedelta64() / other

        elif is_integer_object(other) or is_float_object(other):
            # integers or floats
            return Timedelta(self.value / other, unit='ns')

        elif not _validate_ops_compat(other):
            return NotImplemented

        other = Timedelta(other)
        if other is NaT:
            return np.nan
        return self.value / float(other.value)

I think the relevant difference here is for cases where other is an array this returns NotImplemented, but that isn't a viable option for floordiv because numpy doesn't implement floordiv for timedelta64.

return _broadcast_floordiv_td64(self.value, other, _rfloordiv)
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

raise TypeError('Invalid dtype {dtype} for '
'{op}'.format(dtype=other.dtype,
op='__floordiv__'))

if isinstance(other, float) and np.isnan(other):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh?
why are you not just checking isna

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because np.nan (which can't be detected with other is np.nan) gets treated differently from pd.NaT.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why? nan is equiv of nat here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why? nan is equiv of nat here

That's slightly ambiguous, but I lean towards "no it isn't". Timedelta(nan) does return NaT, but Timedelta / float is meaningful on its own, so we should treat nan as a float for the purposes of this operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checknull(other) and is_float_object(other) is more clear here (and add a comment)

return NotImplemented
elif not _validate_ops_compat(other):
return NotImplemented

other = Timedelta(other)
if other is NaT:
return NaT
return np.nan
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this what we do for division?
is it not tested now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this what we do for division?

Yes.

The status quo is... weird.

    def __floordiv__(self, other):
        if hasattr(other, 'dtype'):
            # work with i8
            other = other.astype('m8[ns]').astype('i8')
            return self.value // other

        elif is_integer_object(other):
            # integers only
            return Timedelta(self.value // other, unit='ns')

        elif not _validate_ops_compat(other):
            return NotImplemented

        other = Timedelta(other)
        if other is NaT:
            return np.nan
        return self.value // other.value

    def __rfloordiv__(self, other):
        if hasattr(other, 'dtype'):
            # work with i8
            other = other.astype('m8[ns]').astype('i8')
            return other // self.value

        elif not _validate_ops_compat(other):
            return NotImplemented

        other = Timedelta(other)
        if other is NaT:
            return NaT
        return other.value // self.value

is it not tested now?

Not that I'm aware of.

return other.value // self.value


cdef _floordiv(int64_t value, right):
return value // right


cdef _rfloordiv(int64_t value, right):
# analogous to referencing operator.div, but there is no operator.rfloordiv
return right // value


cdef _broadcast_floordiv_td64(int64_t value, object other,
object (*operation)(int64_t value,
object right)):
"""Boilerplate code shared by Timedelta.__floordiv__ and
Timedelta.__rfloordiv__ because np.timedelta64 does not implement these.

Parameters
----------
value : int64_t; `self.value` from a Timedelta object
other : object
operation : function, either _floordiv or _rfloordiv

Returns
-------
result : varies based on `other`
"""
# assumes other.dtype.kind == 'm', i.e. other is timedelta-like
cdef:
int ndim = getattr(other, 'ndim', -1)

# We need to watch out for np.timedelta64('NaT').
mask = other.view('i8') == NPY_NAT
# equiv np.isnat, which does not exist in some supported np versions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this comment, we DO NOT use np.isnat


if ndim == 0:
if mask:
return np.nan
else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no else

return operation(value, other.astype('m8[ns]').astype('i8'))

else:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no else

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will change, but curious why you have such a strong preference here. I tend to like making these elses explicit (you've no doubt noticed).

res = operation(value, other.astype('m8[ns]').astype('i8'))

if mask.any():
res = res.astype('f8')
res[mask] = np.nan
return res


# resolution in ns
Timedelta.min = Timedelta(np.iinfo(np.int64).min +1)
Timedelta.min = Timedelta(np.iinfo(np.int64).min + 1)
Timedelta.max = Timedelta(np.iinfo(np.int64).max)
88 changes: 88 additions & 0 deletions pandas/tests/scalar/test_timedelta.py
Original file line number Diff line number Diff line change
Expand Up @@ -136,6 +136,7 @@ def test_binary_ops_nat(self):
assert (td * pd.NaT) is pd.NaT
assert (td / pd.NaT) is np.nan
assert (td // pd.NaT) is np.nan
assert (td // np.timedelta64('NaT')) is np.nan

def test_binary_ops_integers(self):
td = Timedelta(10, unit='d')
Expand All @@ -162,6 +163,93 @@ def test_binary_ops_with_timedelta(self):
# invalid multiply with another timedelta
pytest.raises(TypeError, lambda: td * td)

def test_floordiv(self):
# GH#18846
td = Timedelta(hours=3, minutes=4)
scalar = Timedelta(hours=3, minutes=3)

# scalar others
assert td // scalar == 1
assert -td // scalar.to_pytimedelta() == -2
assert (2 * td) // scalar.to_timedelta64() == 2

assert td // np.nan is pd.NaT
assert np.isnan(td // pd.NaT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should prob be in test_nat with other nat tests

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestTimedeltaArithmetic doesn't have a dedicated test/section for nat tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pandas/tests/scalar/test_nat is where the nat tests go (i don't mind them here), but should have some (even if slightly duplicative)

assert np.isnan(td // np.timedelta64('NaT'))

with pytest.raises(TypeError):
td // np.datetime64('2016-01-01', dtype='datetime64[us]')

expected = Timedelta(hours=1, minutes=32)
assert td // 2 == expected
assert td // 2.0 == expected
assert td // np.float64(2.0) == expected
assert td // np.int32(2.0) == expected
assert td // np.uint8(2.0) == expected

# Array-like others
assert td // np.array(scalar.to_timedelta64()) == 1

res = (3 * td) // np.array([scalar.to_timedelta64()])
expected = np.array([3], dtype=np.int64)
tm.assert_numpy_array_equal(res, expected)

res = (10 * td) // np.array([scalar.to_timedelta64(),
np.timedelta64('NaT')])
expected = np.array([10, np.nan])
tm.assert_numpy_array_equal(res, expected)

ser = pd.Series([1], dtype=np.int64)
res = td // ser
assert res.dtype.kind == 'm'

def test_rfloordiv(self):
# GH#18846
td = Timedelta(hours=3, minutes=3)
scalar = Timedelta(hours=3, minutes=4)

# scalar others
assert td.__rfloordiv__(scalar) == 1
assert (-td).__rfloordiv__(scalar.to_pytimedelta()) == -2
assert (2 * td).__rfloordiv__(scalar.to_timedelta64()) == 0

assert np.isnan(td.__rfloordiv__(pd.NaT))
assert np.isnan(td.__rfloordiv__(np.timedelta64('NaT')))

dt64 = np.datetime64('2016-01-01', dtype='datetime64[us]')
with pytest.raises(TypeError):
td.__rfloordiv__(dt64)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding a bit more comments here enumerating the cases would be useful


assert td.__rfloordiv__(np.nan) is NotImplemented
assert td.__rfloordiv__(3.5) is NotImplemented
assert td.__rfloordiv__(2) is NotImplemented

with pytest.raises(TypeError):
td.__rfloordiv__(np.float64(2.0))
with pytest.raises(TypeError):
td.__rfloordiv__(np.int32(2.0))
with pytest.raises(TypeError):
td.__rfloordiv__(np.uint8(9))

# Array-like others
assert td.__rfloordiv__(np.array(scalar.to_timedelta64())) == 1

res = td.__rfloordiv__(np.array([(3 * scalar).to_timedelta64()]))
expected = np.array([3], dtype=np.int64)
tm.assert_numpy_array_equal(res, expected)

arr = np.array([(10 * scalar).to_timedelta64(),
np.timedelta64('NaT')])
res = td.__rfloordiv__(arr)
expected = np.array([10, np.nan])
tm.assert_numpy_array_equal(res, expected)

ser = pd.Series([1], dtype=np.int64)
res = td.__rfloordiv__(ser)
assert res is NotImplemented
with pytest.raises(TypeError):
ser // td


class TestTimedeltaComparison(object):
def test_comparison_object_array(self):
Expand Down