Skip to content

Implement mul, floordiv, mod, divmod, and reversed directly in TimedeltaArray #23885

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Dec 3, 2018

Conversation

jbrockmendel
Copy link
Member

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Companion to #23829, will need rebasing after that goes through. Some overlap with #23789; this will have a smaller diff after being rebased on top of that.

@pep8speaks
Copy link

Hello @jbrockmendel! Thanks for submitting the PR.

@codecov
Copy link

codecov bot commented Nov 24, 2018

Codecov Report

Merging #23885 into master will decrease coverage by 49.94%.
The diff coverage is 17.94%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #23885       +/-   ##
===========================================
- Coverage   92.31%   42.36%   -49.95%     
===========================================
  Files         161      161               
  Lines       51562    51669      +107     
===========================================
- Hits        47599    21891    -25708     
- Misses       3963    29778    +25815
Flag Coverage Δ
#multiple ?
#single 42.36% <17.94%> (-0.08%) ⬇️
Impacted Files Coverage Δ
pandas/core/ops.py 55.2% <ø> (-38.94%) ⬇️
pandas/core/indexes/datetimelike.py 53.35% <33.33%> (-44.07%) ⬇️
pandas/core/arrays/timedeltas.py 38.96% <5.55%> (-57.04%) ⬇️
pandas/core/indexes/base.py 55.91% <71.42%> (-40.42%) ⬇️
pandas/core/indexes/timedeltas.py 40.84% <75%> (-49.28%) ⬇️
pandas/io/formats/latex.py 0% <0%> (-100%) ⬇️
pandas/core/categorical.py 0% <0%> (-100%) ⬇️
pandas/io/sas/sas_constants.py 0% <0%> (-100%) ⬇️
pandas/tseries/plotting.py 0% <0%> (-100%) ⬇️
pandas/tseries/converter.py 0% <0%> (-100%) ⬇️
... and 124 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 022f458...987eecd. Read the comment docs.

@gfyoung gfyoung added Refactor Internal refactoring of code Numeric Operations Arithmetic, Comparison, and Logical operations Timedelta Timedelta data type labels Nov 25, 2018
@@ -777,6 +777,12 @@ def wrap_arithmetic_op(self, other, result):
if result is NotImplemented:
return NotImplemented

if isinstance(result, tuple):
# divmod, rdivmod
assert len(result) == 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this bubble up in any way?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the whole isinstance block or the len-2 assertion? The former is necessary, the latter is just protecting against me being a dummy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Referring just to the len-2. While I appreciate dummy-protection, I would still want to know if that assert would surface for end-users, that's all.

tm.assert_equal(result, expected)

with pytest.raises(TypeError):
2 % tdarr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check error message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

tm.assert_equal(result, expected)

if box_with_array is pd.DataFrame:
pytest.xfail("DataFrame does not have __divmod__ or __rdivmod__")
Copy link
Member

@gfyoung gfyoung Nov 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have two distinct tests here as delineated by this xfail check. In the interest of modularity, IMO this test should be broken up into two.

Similar comment goes for your other tests where there is an xfail like this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

@gfyoung
Copy link
Member

gfyoung commented Nov 25, 2018

Broader question here: given the amount of moving parts in different PR's, to what extent is your implementations in this PR tested (or will be tested)? I ask because I see a lot of branching logic, but I'm uncertain as to how many of the branches are covered.

@jbrockmendel
Copy link
Member Author

Broader question here: given the amount of moving parts in different PR's, to what extent is your implementations in this PR tested (or will be tested)? I ask because I see a lot of branching logic, but I'm uncertain as to how many of the branches are covered.

I've tried to keep these non-overlapping. This PR should definitely wait until after #23829 and #23789.

While there is always room for improvement in the tests, the existing tests in test_timedelta64 cover these methods pretty well.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same concerns as on other PR

@jreback
Copy link
Contributor

jreback commented Nov 29, 2018

can you rebase

@jbrockmendel
Copy link
Member Author

rebased+green

return result

elif is_object_dtype(other):
result = [other[n] // self[n] for n in range(len(self))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any nice way to remove some of this duplication (floordir / div), maybe with some helper functions.

def __rmod__(self, other):
# Note: This is a naive implementation, can likely be optimized
if isinstance(other, (ABCSeries, ABCDataFrame, ABCIndexClass)):
return NotImplemented
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this share with __mod__?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. For div and divmod either timedeltas or numeric are valid. For reversed ops only timedeltas are valid.

return res1, res2

def __rdivmod__(self, other):
# Note: This is a naive implementation, can likely be optimized
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this share with __divmod__?

other = np.array(other)
if len(other) != len(self) and not is_timedelta64_dtype(other):
# Exclude timedelta64 here so we correctly raise TypeError
# for that instead of ValueError
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lots of this impl matches how this is done in Timedelta. too bad can't easily share.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After DTA/TDA are done I might look at this more seriously.

.format(typ=dtype, cls=type(self).__name__))

def __rfloordiv__(self, other):
if isinstance(other, (ABCSeries, ABCDataFrame, ABCIndexClass)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this share with floordiv?

if isinstance(other, ABCSeries):
# GH#19042
def __mul__(self, other):
other = lib.item_from_zerodim(other)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so you don't do this everywhere?

maybe add a function that you call to avoid repeating code

def _prepare_other(other):
   other = lib.item_from_zerodim(other)
   if is_list_like(other) and not hasattr(other, "dtype"):
            # list, tuple
            other = np.array(other)
   return other

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened #23853 for exactly this reason. It merits a dedicated PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, but now you are missing some code here on several of the operators (from_zero_dim), so maybe better to fix now.

raise ValueError("Cannot multiply with unequal lengths")

if is_object_dtype(other):
# this multiplication will succeed only if all elements of other
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be useful to put this in a method that can be shared across multiple ops

other = type(self)(other)

# numpy timedelta64 does not natively support floordiv, so operate
# on the i8 values
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks an awful lot like _maybe_mask_missing (in Index)

@jreback jreback added this to the 0.24.0 milestone Dec 3, 2018
@jreback jreback merged commit 21568c5 into pandas-dev:master Dec 3, 2018
@jreback
Copy link
Contributor

jreback commented Dec 3, 2018

thanks @jbrockmendel

same comment as on other PR.s let's really really try to limit code adding.

@jbrockmendel jbrockmendel deleted the htests branch December 3, 2018 15:19
@jbrockmendel
Copy link
Member Author

let's really really try to limit code adding.

Sounds good. This was the last PR I had for TimedeltaArray. The only one I have left for DatetimeArray is to implement DatetimeArray._from_sequence. Following that, all I have on my plate is a) to_datetime/array_to_datetime improvements, b) parametrizing+de-duplicating arithmetic tests, and c) DTA/TDA/PA reductions. All three can wait until after 0.24.0.

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Numeric Operations Arithmetic, Comparison, and Logical operations Refactor Internal refactoring of code Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants