implement Timedelta mod, divmod, rmod, rdivmod, fix and test scalar methods #19365

Closed
wants to merge 19 commits into from
Changes from 9 commits
12 changes: 12 additions & 0 deletions doc/source/timedeltas.rst
@@ -283,6 +283,18 @@ Rounded division (floor-division) of a ``timedelta64[ns]`` Series by a scalar
    td // pd.Timedelta(days=3, hours=4)
    pd.Timedelta(days=3, hours=4) // td

Contributor

add a ref tag here

Member Author

You mean a "(:issue:19365)"?

Contributor

No, a section reference.

Member Author

Do you mean something like ".. _timedeltas.divmod:" after line 297?

+The mod (%) and divmod operations are defined for ``Timedelta`` when operating with another timedelta-like or with a numeric argument.
+
+.. ipython:: python
+
+   pd.Timedelta(hours=37) % datetime.timedelta(hours=2)
+
+   # divmod against a timedelta-like returns a pair (int, Timedelta)
+   divmod(datetime.timedelta(hours=2), pd.Timedelta(minutes=11))
Contributor

give a comment or 2 here, a dense block of code is not very friendly

Contributor

just give 2 examples; no need to have every case covered

Contributor

don't show the array case, I find that unfriendly


+   # divmod against a numeric returns a pair (Timedelta, Timedelta)
+   pd.Timedelta(hours=25) % 86400000000000
Contributor

missing the divmod here
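
For readers without pandas installed, the timedelta-like half of the documented behavior can be checked against the standard library. This is only an analogy: it assumes ``Timedelta`` agrees with ``datetime.timedelta`` for these inputs, and it does not cover the numeric-argument case, which ``datetime.timedelta`` does not support for ``%`` or ``divmod``.

```python
import datetime

# 37 hours modulo 2 hours leaves a 1-hour remainder
rem = datetime.timedelta(hours=37) % datetime.timedelta(hours=2)

# divmod against another timedelta returns (int quotient, timedelta remainder)
quot, rest = divmod(datetime.timedelta(hours=2), datetime.timedelta(minutes=11))

print(rem)         # 1:00:00
print(quot, rest)  # 10 0:10:00
```

The second result mirrors the ``(int, Timedelta)`` pair described in the doc comment: 120 minutes holds ten whole 11-minute spans with 10 minutes left over.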


Attributes
----------

21 changes: 21 additions & 0 deletions doc/source/whatsnew/v0.23.0.txt
@@ -117,6 +117,25 @@ resetting indexes. See the :ref:`Sorting by Indexes and Values
    # Sort by 'second' (index) and 'A' (column)
    df_multi.sort_values(by=['second', 'A'])

+.. _whatsnew_0230.enhancements.timedelta_mod:
+
+Timedelta mod method
+^^^^^^^^^^^^^^^^^^^^
+
+``mod`` (%) and ``divmod`` operations are now defined on ``Timedelta`` objects when operating with either timedelta-like or numeric arguments. (:issue:`19365`)
Contributor

you can put in a reference to the docs in the timedelta section.

Member Author

I'm unclear on what this means.


+.. ipython:: python
+
+   td = pd.Timedelta(hours=37)
Contributor

show the td, then in another ipython block show operations on it.

+   td
+
+Current Behavior
Contributor

you don't need the Current Behavior here (as there isn't any previous)


+.. ipython:: python
Contributor

this is repetitive


+   td % pd.Timedelta(hours=2)
+   divmod(td, np.array([2, 3], dtype='timedelta64[h]'))

.. _whatsnew_0230.enhancements.ran_inf:

``.rank()`` handles ``inf`` values when ``NaN`` are present
@@ -438,6 +457,7 @@ Other API Changes
 - Set operations (union, difference...) on :class:`IntervalIndex` with incompatible index types will now raise a ``TypeError`` rather than a ``ValueError`` (:issue:`19329`)
 - :class:`DateOffset` objects render more simply, e.g. "<DateOffset: days=1>" instead of "<DateOffset: kwds={'days': 1}>" (:issue:`19403`)
 - :func:`pandas.merge` provides a more informative error message when trying to merge on timezone-aware and timezone-naive columns (:issue:`15800`)
+- :func:`Timedelta.__mod__`, :func:`Timedelta.__divmod__` now accept timedelta-like and numeric arguments instead of raising ``TypeError`` (:issue:`19365`)
Contributor

you are covering this above, don't repeat

Member Author

As in remove this line entirely b/c the "Timedelta mod method" section above exists?


.. _whatsnew_0230.deprecations:

@@ -552,6 +572,7 @@ Datetimelike
- Bug in comparison of :class:`DatetimeIndex` against ``None`` or ``datetime.date`` objects raising ``TypeError`` for ``==`` and ``!=`` comparisons instead of all-``False`` and all-``True``, respectively (:issue:`19301`)
-


Timezones
^^^^^^^^^

47 changes: 43 additions & 4 deletions pandas/_libs/tslibs/timedeltas.pyx
@@ -492,7 +492,14 @@ def _binary_op_method_timedeltalike(op, name):
             if other.dtype.kind not in ['m', 'M']:
                 # raise rather than letting numpy return wrong answer
                 return NotImplemented
-            return op(self.to_timedelta64(), other)
+            result = op(self.to_timedelta64(), other)
+            if other.ndim == 0:
Member

Wouldn't it be cleaner to do elif hasattr(other, 'dtype') and other.ndim > 0, and then let the scalar numpy ones take the route of Timedelta below? (The other = Timedelta(other) some lines below will work for a np.timedelta64.)

Member Author

I think that would be about a wash because we also need to handle the case where other is a 0-dim datetime64 array.

Member

Isn't that already caught by the is_datetime64_object(other) above? (Not sure what the method exactly does, but from the name I would expect that.)

Member

And from this it seems it is indeed already working correctly:

In [163]: pd.Timedelta(1) + np.datetime64(1000, 'ns')
Out[163]: Timestamp('1970-01-01 00:00:00.000001001')

Member Author

Yah there's some ambiguity there. The 0-dim case to watch out for is (under master, fixed in the PR):

>>> pd.Timedelta(1) + np.array('2016-01-02 03:04:05', dtype='datetime64[ns]')
numpy.datetime64('2016-01-02T03:04:05.000000001')

Member Author

> Ah, but not sure that should be fixed?

I could see a reasonable argument that the zero-dim array op should return a zero-dim array (i.e. not a Timedelta like the PR changes it to), but given that it returns a scalar, I think the consistent thing to do is always return a Timestamp/Timedelta.

> not sure we should add that much extra lines to just deal with numpy scalars vs numpy 0-dim arrays

As long as we a) aren't making spaghetti and b) are testing these new corner cases, I don't see much downside to fixing these. If there were some kind of LOC budget I'd agree with you that this would be a pretty low priority. (and if it will get this merged I'll revert this part, since this is blocking fixes to Index bugs)

Member

> (and if it will get this merged I'll revert this part, since this is blocking fixes to Index bugs)

Can you explain why this part would be reverted again, but is needed to be first merged?

Member Author

Poor use of pronouns on my part. "This part" refers to the zero-dimensional arrays, and is not the bug that originally motivated this PR, but was found in the process of writing tests for this PR. "this is blocking" referred to the divmod/mod and 1-d array ops, which I need to have here before I can fix e.g. pd.Index([0, 1, 2]) * pd.Timedelta(days=1)

Member

I don't see any test you added that would catch this 0-dim array case. Would it be ok to just leave that out of this PR, and only fix the numpy timedelta64 scalar case?

We can discuss how many lines of code are added and whether that is worth it, but checking the result type and then potentially converting the result does add to the code complexity. I would rather add an is_timedelta64_object(other) check to explicitly use the normal timedelta path for those.

Member Author

I'm amenable to this suggestion, will update in a bit.

FYI is_timedelta64_object(obj) is equivalent to isinstance(obj, np.timedelta64). It will not catch arrays with timedelta64-dtype.

+                if other.dtype.kind == 'm':
+                    return Timedelta(result)
+                if other.dtype.kind == 'M':
+                    from ..tslib import Timestamp
+                    return Timestamp(result)
+            return result

         elif not _validate_ops_compat(other):
             return NotImplemented
@@ -1046,7 +1053,10 @@ class Timedelta(_Timedelta):
     def __mul__(self, other):
         if hasattr(other, 'dtype'):
             # ndarray-like
-            return other * self.to_timedelta64()
+            result = other * self.to_timedelta64()
+            if other.ndim == 0:
+                return Timedelta(result)
+            return result

         elif other is NaT:
             return NaT
@@ -1061,7 +1071,10 @@ class Timedelta(_Timedelta):

     def __truediv__(self, other):
         if hasattr(other, 'dtype'):
-            return self.to_timedelta64() / other
+            result = self.to_timedelta64() / other
+            if other.ndim == 0 and result.dtype.kind == 'm':
+                return Timedelta(result)
+            return result

         elif is_integer_object(other) or is_float_object(other):
             # integers or floats
@@ -1077,7 +1090,10 @@ class Timedelta(_Timedelta):

     def __rtruediv__(self, other):
         if hasattr(other, 'dtype'):
-            return other / self.to_timedelta64()
+            result = other / self.to_timedelta64()
+            if other.ndim == 0 and result.dtype.kind == 'm':
+                return Timedelta(result)
+            return result

         elif not _validate_ops_compat(other):
             return NotImplemented
@@ -1096,6 +1112,9 @@ class Timedelta(_Timedelta):
         # just defer
         if hasattr(other, '_typ'):
             # Series, DataFrame, ...
+            if other._typ == 'dateoffset' and hasattr(other, 'delta'):
+                # Tick offset
+                return self // other.delta
             return NotImplemented

         if hasattr(other, 'dtype'):
@@ -1128,6 +1147,9 @@ class Timedelta(_Timedelta):
         # just defer
         if hasattr(other, '_typ'):
             # Series, DataFrame, ...
+            if other._typ == 'dateoffset' and hasattr(other, 'delta'):
+                # Tick offset
+                return other.delta // self
             return NotImplemented

         if hasattr(other, 'dtype'):
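
The two floor-division hunks above rely on two Python mechanisms: unwrapping an offset-like operand through its ``delta`` attribute, and returning ``NotImplemented`` so that Python falls back to the other operand's reflected method. A minimal stdlib sketch of both follows. ``Duration``, ``FixedOffset`` and ``Column`` are hypothetical stand-ins invented for illustration, not pandas classes, and the ``_typ`` check from the diff is simplified to a plain ``hasattr`` test.

```python
class Duration:
    """Hypothetical scalar duration in whole seconds (stand-in for Timedelta)."""

    def __init__(self, seconds):
        self.seconds = seconds

    def __floordiv__(self, other):
        if hasattr(other, "delta"):
            # Offset-like wrapper: unwrap to its fixed Duration and retry
            return self // other.delta
        if isinstance(other, Duration):
            return self.seconds // other.seconds
        # Unknown type: defer so Python tries other.__rfloordiv__
        return NotImplemented


class FixedOffset:
    """Hypothetical Tick-like object exposing its length as .delta."""

    def __init__(self, delta):
        self.delta = delta


class Column:
    """Hypothetical container that implements the reflected operation itself."""

    def __init__(self, values):
        self.values = values

    def __rfloordiv__(self, other):
        return [other // v for v in self.values]


print(Duration(10) // FixedOffset(Duration(3)))            # 3
print(Duration(10) // Column([Duration(2), Duration(5)]))  # [5, 2]
```

In the second call ``Duration.__floordiv__`` returns ``NotImplemented``, so the result comes from ``Column.__rfloordiv__``. This is the same deferral the diff keeps for Series and DataFrame operands.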
@@ -1149,6 +1171,23 @@ class Timedelta(_Timedelta):
             return np.nan
         return other.value // self.value

+    def __mod__(self, other):
+        # Naive implementation, room for optimization
+        return self.__divmod__(other)[1]
+
+    def __rmod__(self, other):
+        # Naive implementation, room for optimization
+        return self.__rdivmod__(other)[1]
+
+    def __divmod__(self, other):
+        # Naive implementation, room for optimization
+        div = self // other
+        return div, self - div * other
+
+    def __rdivmod__(self, other):
+        div = other // self
+        return div, other - div * self
+

 cdef _floordiv(int64_t value, right):
     return value // right
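
The new ``__divmod__`` builds the remainder from floor division as ``self - div * other``, which is why the type of the quotient depends on the operand. The sketch below reproduces that pattern with a hypothetical ``Span`` class holding integer nanoseconds. It is an illustration under that assumption, not the pandas implementation, and it covers only the forward operators.

```python
class Span:
    """Hypothetical stand-in for Timedelta: a duration as integer nanoseconds."""

    def __init__(self, ns):
        self.ns = int(ns)

    def __eq__(self, other):
        return isinstance(other, Span) and self.ns == other.ns

    def __repr__(self):
        return f"Span({self.ns})"

    def __sub__(self, other):
        if not isinstance(other, Span):
            return NotImplemented
        return Span(self.ns - other.ns)

    def __mul__(self, other):
        if isinstance(other, int):
            return Span(self.ns * other)
        return NotImplemented

    __rmul__ = __mul__  # lets int * Span work as well

    def __floordiv__(self, other):
        if isinstance(other, Span):
            return self.ns // other.ns     # Span // Span -> int
        if isinstance(other, int):
            return Span(self.ns // other)  # Span // int -> Span
        return NotImplemented

    def __divmod__(self, other):
        # Same shape as the diff: quotient first, remainder derived from it
        div = self // other
        return div, self - div * other

    def __mod__(self, other):
        return self.__divmod__(other)[1]


print(divmod(Span(37), Span(2)))  # (18, Span(1))
print(divmod(Span(25), 24))       # (Span(1), Span(1))
print(Span(37) % Span(2))         # Span(1)
```

Dividing by another ``Span`` yields an ``(int, Span)`` pair, while dividing by an integer yields ``(Span, Span)``, matching the two return shapes described in the documentation hunk.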