-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Implement mul, floordiv, mod, divmod, and reversed directly in TimedeltaArray #23885
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @jbrockmendel! Thanks for submitting the PR.
|
Codecov Report
@@ Coverage Diff @@
## master #23885 +/- ##
===========================================
- Coverage 92.31% 42.36% -49.95%
===========================================
Files 161 161
Lines 51562 51669 +107
===========================================
- Hits 47599 21891 -25708
- Misses 3963 29778 +25815
Continue to review full report at Codecov.
|
@@ -777,6 +777,12 @@ def wrap_arithmetic_op(self, other, result): | |||
if result is NotImplemented: | |||
return NotImplemented | |||
|
|||
if isinstance(result, tuple): | |||
# divmod, rdivmod | |||
assert len(result) == 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this bubble up in any way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean the whole isinstance
block or the len-2 assertion? The former is necessary, the latter is just protecting against me being a dummy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Referring just to the len-2
. While I appreciate dummy-protection, I would still want to know if that assert
would surface for end-users, that's all.
tm.assert_equal(result, expected) | ||
|
||
with pytest.raises(TypeError): | ||
2 % tdarr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
tm.assert_equal(result, expected) | ||
|
||
if box_with_array is pd.DataFrame: | ||
pytest.xfail("DataFrame does not have __divmod__ or __rdivmod__") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have two distinct tests here as delineated by this xfail
check. In the interest of modularity, IMO this test should be broken up into two.
Similar comment goes for your other tests where there is an xfail
like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good
Broader question here: given the amount of moving parts in different PR's, to what extent is your implementations in this PR tested (or will be tested)? I ask because I see a lot of branching logic, but I'm uncertain as to how many of the branches are covered. |
I've tried to keep these non-overlapping. This PR should definitely wait until after #23829 and #23789. While there is always room for improvement in the tests, the existing tests in test_timedelta64 cover these methods pretty well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same concerns as on other PR
can you rebase |
rebased+green |
return result | ||
|
||
elif is_object_dtype(other): | ||
result = [other[n] // self[n] for n in range(len(self))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any nice way to remove some of this duplication (floordir / div), maybe with some helper functions.
def __rmod__(self, other): | ||
# Note: This is a naive implementation, can likely be optimized | ||
if isinstance(other, (ABCSeries, ABCDataFrame, ABCIndexClass)): | ||
return NotImplemented |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this share with __mod__
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. For div and divmod either timedeltas or numeric are valid. For reversed ops only timedeltas are valid.
return res1, res2 | ||
|
||
def __rdivmod__(self, other): | ||
# Note: This is a naive implementation, can likely be optimized |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this share with __divmod__
?
other = np.array(other) | ||
if len(other) != len(self) and not is_timedelta64_dtype(other): | ||
# Exclude timedelta64 here so we correctly raise TypeError | ||
# for that instead of ValueError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lots of this impl matches how this is done in Timedelta. too bad can't easily share.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After DTA/TDA are done I might look at this more seriously.
.format(typ=dtype, cls=type(self).__name__)) | ||
|
||
def __rfloordiv__(self, other): | ||
if isinstance(other, (ABCSeries, ABCDataFrame, ABCIndexClass)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this share with floordiv?
if isinstance(other, ABCSeries): | ||
# GH#19042 | ||
def __mul__(self, other): | ||
other = lib.item_from_zerodim(other) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so you don't do this everywhere?
maybe add a function that you call to avoid repeating code
def _prepare_other(other):
other = lib.item_from_zerodim(other)
if is_list_like(other) and not hasattr(other, "dtype"):
# list, tuple
other = np.array(other)
return other
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened #23853 for exactly this reason. It merits a dedicated PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, but now you are missing some code here on several of the operators (from_zero_dim), so maybe better to fix now.
raise ValueError("Cannot multiply with unequal lengths") | ||
|
||
if is_object_dtype(other): | ||
# this multiplication will succeed only if all elements of other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be useful to put this in a method that can be shared across multiple ops
other = type(self)(other) | ||
|
||
# numpy timedelta64 does not natively support floordiv, so operate | ||
# on the i8 values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks an awful lot like _maybe_mask_missing (in Index)
thanks @jbrockmendel same comment as on other PR.s let's really really try to limit code adding. |
Sounds good. This was the last PR I had for TimedeltaArray. The only one I have left for DatetimeArray is to implement DatetimeArray._from_sequence. Following that, all I have on my plate is a) to_datetime/array_to_datetime improvements, b) parametrizing+de-duplicating arithmetic tests, and c) DTA/TDA/PA reductions. All three can wait until after 0.24.0. |
git diff upstream/master -u -- "*.py" | flake8 --diff
Companion to #23829, will need rebasing after that goes through. Some overlap with #23789; this will have a smaller diff after being rebased on top of that.