Skip to content

#31793 Add support for subtracting datetime from Timestamp #37329

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 28, 2021

Conversation

AnjoMan
Copy link
Contributor

@AnjoMan AnjoMan commented Oct 22, 2020

Based on the discussion in the attached issue, i've made the following changes:

  1. add support for subtracting when the two items have different timezones
  2. raise an error if one (but not both) timezone is naive

The root cause of 31793 was that the tzinfo objects were from different libraries -- Timestamp makes a pytz.UTC whereas a datetime object might just have datetime.timezone.utc. These objects are not necessarily equal, even if they represent the same timezone.

In addition, this change adds support for time-deltas where the timezone of each date is not the same. This follows the behaviour of datetime more closely.

I reference the python datetime module documentation which defines when a datetime is naive: when its tzinfo is None or a call to item.tzinfo.utcoffset(item) returns None.

@AnjoMan
Copy link
Contributor Author

AnjoMan commented Oct 22, 2020

If this is a welcome change, please add the hacktoberfest-accepted to this issue so it can go towards my event contribution

@AnjoMan
Copy link
Contributor Author

AnjoMan commented Oct 22, 2020

Ok, looking at the failed tests, it l;ooks like i should be relocating the tests i added so they would appear under tests/arithmetic/test_timedetlta64.py and tests/scalar/timestamp/test_arithmetic.py.

Copy link
Member

@arw2019 arw2019 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @AnjoMan for the PR!

Some comments

@arw2019
Copy link
Member

arw2019 commented Oct 22, 2020

Also heads up about linting - needs to be fixed for the CI checks to pass

@jbrockmendel
Copy link
Member

When changing this for Timestamp, we'll need to make the same change for DatetimeArray

@AnjoMan
Copy link
Contributor Author

AnjoMan commented Oct 22, 2020

When changing this for Timestamp, we'll need to make the same change for DatetimeArray

@jbrockmendel thanks for the heads-up, I'll take a look and try to replicate this functionality there as well.

Just so we're clear, you are expecting to find the same bug when doing something like

date_array_1 = DatetimeArray(...)
date_array_2 = DatetimeArray(...)

date_array_1 - date_array_2

and one of the arrays has datetime objects instead of Timestamp objects?

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I -1 on adding different timezone subtraction operations. We do not allow this anywhere else. I think being explicit is a really good thing here.

@jreback jreback added the Timezones Timezone data dtype label Oct 23, 2020
@AnjoMan AnjoMan force-pushed the 31793-timestamp-subtraction-bug branch from 5712eab to 298431f Compare October 23, 2020 03:38
@pep8speaks
Copy link

pep8speaks commented Oct 23, 2020

Hello @AnjoMan! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-12-28 15:51:50 UTC

@AnjoMan AnjoMan force-pushed the 31793-timestamp-subtraction-bug branch from 298431f to 6f11284 Compare October 23, 2020 03:52
@jbrockmendel
Copy link
Member

Just so we're clear, you are expecting to find the same bug when doing something like

date_array_1 = DatetimeArray(...)
date_array_2 = DatetimeArray(...)
date_array_1 - date_array_2

and one of the arrays has datetime objects instead of Timestamp objects?

datetime objects vs Timestamp objects isn't really applicable here. Think of DatetimeArray as being a vectorized Timestamp. Just like with Timestamp, when subtracting dta - dta2 we need to have not ((dta.tzinfo is None) ^ (dta2.tzinfo is None)). ATM we also require that the tzinfos match, which this PR would cease to require.

@jorisvandenbossche
Copy link
Member

[@jreback] I am -1 on adding different timezone subtraction operations. We do not allow this anywhere else. I think being explicit is a really good thing here.

Apart from the fact that the stdlib also supports this as @jbrockmendel mentioned, I also want to point out we actually do support differing timezones in several operations, eg comparison ops:

In [5]: t1 = pd.Timestamp("2012-01-01").tz_localize("UTC").tz_convert("Europe/Brussels")

In [6]: t2 = pd.Timestamp("2012-01-01").tz_localize("UTC").tz_convert("US/Eastern")

In [7]: t1 == t2
Out[7]: True

In [8]: t1 >= t2
Out[8]: True

@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Nov 25, 2020
Copy link
Member

@arw2019 arw2019 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In light of this discussions in #37605 does this PR go ahead? AFAICT yes but maybe I misunderstood @jbrockmendel @jorisvandenbossche @jreback @mroeschke

If yes @AnjoMan you will need to merge master and add a whatsnew note. Not sure if this will get in in time for 1.2 - if not we'll have to wait until we tag the next major release (1.3 or 2.0)

@jbrockmendel
Copy link
Member

In light of this discussions in #37605 does this PR go ahead? AFAICT yes but maybe I misunderstood

@jorisvandenbossche and I are +1 on this, @jreback was -1 though there's some evidence of miscommunication, and @mroeschke was -1 though I'm hoping he'll reconsider when looking at this context.

@arw2019 @AnjoMan I encourage you both to weigh in on #37605.

@mroeschke
Copy link
Member

I'm more -0 just in principle, but since we already support some mixed timezone aware operations, it's not a hill I will die on.

@arw2019
Copy link
Member

arw2019 commented Nov 27, 2020

In light of this discussions in #37605 does this PR go ahead? AFAICT yes but maybe I misunderstood

@jorisvandenbossche and I are +1 on this, @jreback was -1 though there's some evidence of miscommunication, and @mroeschke was -1 though I'm hoping he'll reconsider when looking at this context.

@arw2019 @AnjoMan I encourage you both to weigh in on #37605.

I'm +1 on adding this

@jreback
Copy link
Contributor

jreback commented Feb 11, 2021

@AnjoMan if you can rebase we can take another look here

@AnjoMan AnjoMan force-pushed the 31793-timestamp-subtraction-bug branch 2 times, most recently from 1d3fa48 to e02cdce Compare February 11, 2021 02:25
@AnjoMan
Copy link
Contributor Author

AnjoMan commented Feb 11, 2021

Ok, i've rebased on the current dev branch and added a "whatsnew" entry.

I noticed one of the CI checks fails with a bad import -- it doesn't seem too related to what I've changed

image

@arw2019
Copy link
Member

arw2019 commented Feb 11, 2021

Ok, i've rebased on the current dev branch and added a "whatsnew" entry.

I noticed one of the CI checks fails with a bad import -- it doesn't seem too related to what I've changed

image

will be fixed by #39744

@AnjoMan AnjoMan force-pushed the 31793-timestamp-subtraction-bug branch from e02cdce to 7f48b34 Compare February 22, 2021 22:16
@AnjoMan AnjoMan force-pushed the 31793-timestamp-subtraction-bug branch from 27abdcc to c6de9d2 Compare October 20, 2021 00:09
@AnjoMan AnjoMan force-pushed the 31793-timestamp-subtraction-bug branch from c6de9d2 to 26e40c7 Compare October 28, 2021 02:33
@AnjoMan
Copy link
Contributor Author

AnjoMan commented Oct 28, 2021

Ok, i've updated the PR to reflect new feedback from @jbrockmendel and moved the 'whatsnew' entry to 1.4.0

Copy link
Member

@jbrockmendel jbrockmendel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM cc @jreback

@AnjoMan AnjoMan force-pushed the 31793-timestamp-subtraction-bug branch from 26e40c7 to c195387 Compare November 1, 2021 23:56
@jbrockmendel
Copy link
Member

@AnjoMan can you merge master one more time

@AnjoMan AnjoMan force-pushed the 31793-timestamp-subtraction-bug branch from c195387 to 944dad6 Compare December 7, 2021 00:50
@AnjoMan
Copy link
Contributor Author

AnjoMan commented Dec 7, 2021

@jbrockmendel i've updated the branch against pandas master

@jbrockmendel
Copy link
Member

gentle ping @jreback

@AnjoMan AnjoMan force-pushed the 31793-timestamp-subtraction-bug branch from 944dad6 to 498efcc Compare December 21, 2021 04:29
@jreback jreback added this to the 1.4 milestone Dec 28, 2021
@jreback jreback added Bug Dtype Conversions Unexpected or buggy dtype conversions labels Dec 28, 2021
@jreback
Copy link
Contributor

jreback commented Dec 28, 2021

ok on board here. @AnjoMan one more rebase if you would and @mroeschke if any comments.

@AnjoMan AnjoMan force-pushed the 31793-timestamp-subtraction-bug branch from 6e72b91 to ceb2426 Compare December 28, 2021 15:51
@AnjoMan
Copy link
Contributor Author

AnjoMan commented Dec 28, 2021

i've rebased on top of c3d3357

@mroeschke mroeschke merged commit 5ae855d into pandas-dev:master Dec 28, 2021
@mroeschke
Copy link
Member

Thanks for hanging in there @AnjoMan. Great work!

@jbrockmendel
Copy link
Member

nice!

@AnjoMan
Copy link
Contributor Author

AnjoMan commented Dec 28, 2021

thanks @jbrockmendel @jreback @mroeschke for all the feedback and help getting this together!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Dtype Conversions Unexpected or buggy dtype conversions Timezones Timezone data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Difference between two Timestamp with different timezone object type fails
10 participants