Skip to content

Fix name setting in DTI/TDI __add__ and __sub__ #19744

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Feb 21, 2018

Conversation

jbrockmendel
Copy link
Member

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@codecov
Copy link

codecov bot commented Feb 17, 2018

Codecov Report

Merging #19744 into master will increase coverage by <.01%.
The diff coverage is 96.61%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #19744      +/-   ##
==========================================
+ Coverage   91.61%   91.61%   +<.01%     
==========================================
  Files         150      150              
  Lines       48887    48880       -7     
==========================================
- Hits        44786    44783       -3     
+ Misses       4101     4097       -4
Flag Coverage Δ
#multiple 89.99% <96.61%> (ø) ⬆️
#single 41.81% <35.59%> (+0.01%) ⬆️
Impacted Files Coverage Δ
pandas/core/common.py 93.48% <ø> (+0.06%) ⬆️
pandas/core/indexes/period.py 93.04% <100%> (ø) ⬆️
pandas/core/indexes/datetimes.py 95.56% <100%> (+0.24%) ⬆️
pandas/core/indexes/timedeltas.py 90.63% <100%> (+0.06%) ⬆️
pandas/core/series.py 94.46% <100%> (ø) ⬆️
pandas/core/ops.py 96.65% <95.23%> (-0.1%) ⬇️
pandas/core/indexes/datetimelike.py 97.1% <96%> (+0.04%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 740ad9a...249b7ba. Read the comment docs.

@@ -358,17 +358,13 @@ def _maybe_update_attributes(self, attrs):
def _add_delta(self, delta):
if isinstance(delta, (Tick, timedelta, np.timedelta64)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should prob add a doc-string to these also making the guarantee that the name should not be set

@@ -726,6 +736,11 @@ def __sub__(self, other):
else: # pragma: no cover
return NotImplemented

if result is not NotImplemented:
res_name = ops._get_series_op_result_name(self, other)
result = result.rename(name=res_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this copies again, just set the name

@@ -726,6 +736,11 @@ def __sub__(self, other):
else: # pragma: no cover
return NotImplemented

if result is not NotImplemented:
res_name = ops._get_series_op_result_name(self, other)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe rename this to

get_op_result_name


if isinstance(delta, (Tick, timedelta, np.timedelta64)):
new_values = self._add_delta_td(delta)
elif is_timedelta64_dtype(delta):
if not isinstance(delta, TimedeltaIndex):
delta = TimedeltaIndex(delta)
else:
# update name when delta is Index
name = com._maybe_match_name(self, delta)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move _maybe_match_name to ops and put next to get_op_result_name (and should remove it in favor of get_op_result_name, but that might be slightly tricky).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. While I'm at it I'm going to move get_op_result_name with the utility functions near the top of the file instead of in the arithmetic-specific spot it is now.

# GH#18824
dti = pd.date_range('2017-01-01', periods=2, tz=tz)
other = box([pd.offsets.MonthEnd(), pd.offsets.Day(n=2)])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are you removing all of the boxing? do these not work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

box is parametrized over [np.array, pd.Index], but the pd.Index case is being separated out into its own test that also checks names.

@jreback jreback added Compat pandas objects compatability with Numpy or Python functions Datetime Datetime data dtype Timedelta Timedelta data type labels Feb 18, 2018
return name


def _maybe_match_name(a, b):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this as a standalone? (IOW can you replace the current usage with get_op_result_name)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not can you doc-string this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of the uses can be replaced without affecting any existing behavior. I'll see if I can get them all in a way that doesn't necessitate new tests. If not I'll kill it off in the next pass.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, all non-test usages can be replaced. I've done that and then also added a docstring. Getting rid of it and updating the tests to test get_op_result_name directly will wait for another round with narrower scope.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok looks good. next pass let's try to remove this (I think you did, just need to transfer the tests)

@@ -1814,7 +1814,7 @@ def combine_first(self, other):
this = self.reindex(new_index, copy=False)
other = other.reindex(new_index, copy=False)
# TODO: do we need name?
name = com._maybe_match_name(self, other) # noqa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyidea why the noqa is here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd speculate that it's because name is defined and then for some reason not used, so flake8 should complain.

@@ -726,6 +736,11 @@ def __sub__(self, other):
else: # pragma: no cover
return NotImplemented

if result is not NotImplemented:
res_name = ops.get_op_result_name(self, other)
result.rename(name=res_name, inplace=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simply set .name (and same above)

return self + other[0]
else:
warnings.warn("Adding/subtracting array of DateOffsets to "
"{} not vectorized".format(type(self)),
PerformanceWarning)
return self.astype('O') + np.array(other)
# TODO: pass freq='infer' like we do in _sub_offset_array?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the purpose of these comments (esp here, after the return)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a) to make sure it gets seen during review, b) as a note to myself or the next pass

Series([1], name='x'), Series(
[2], name='x'))
assert (matched == 'x')

matched = com._maybe_match_name(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

next pass, move these to test_ops.py (new)

@jbrockmendel
Copy link
Member Author

Once this goes through the remaining cases on todo list are: array[offsets] for PeriodIndex, array[int], and array[timedelta64], and last some object-dtype corner cases.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple items for next pass. merging.

return name


def _maybe_match_name(a, b):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok looks good. next pass let's try to remove this (I think you did, just need to transfer the tests)

@@ -1814,7 +1814,7 @@ def combine_first(self, other):
this = self.reindex(new_index, copy=False)
other = other.reindex(new_index, copy=False)
# TODO: do we need name?
name = com._maybe_match_name(self, other) # noqa
name = ops.get_op_result_name(self, other) # noqa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clearly this is not used, so on next pass can you remove

@jreback jreback added this to the 0.23.0 milestone Feb 21, 2018
@jreback jreback merged commit 80241e6 into pandas-dev:master Feb 21, 2018
harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018
@jbrockmendel jbrockmendel deleted the dtlike branch June 22, 2018 03:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions Datetime Datetime data dtype Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants