Skip to content

Assert at least one tz arg is always UTC #18228

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Nov 12, 2017

Conversation

jbrockmendel
Copy link
Member

closes #17734

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@gfyoung gfyoung added Bug Timezones Timezone data dtype labels Nov 11, 2017
@@ -422,6 +422,9 @@ cpdef int64_t tz_convert_single(int64_t val, object tz1, object tz2):
pandas_datetimestruct dts
datetime dt

assert (is_utc(tz1) or tz1 == 'UTC') or (is_utc(tz2) or tz2 == 'UTC')
# See GH#17734
Copy link
Member

@gfyoung gfyoung Nov 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Let's the bug the comment above the line.
  • Perhaps a one sentence explanation for this assertion would be good too.
  • If a user were to stumble upon this assert, wouldn't we want some kind of error message?

@@ -487,6 +490,9 @@ def tz_convert(ndarray[int64_t] vals, object tz1, object tz2):
pandas_datetimestruct dts
datetime dt

assert (is_utc(tz1) or tz1 == 'UTC') or (is_utc(tz2) or tz2 == 'UTC')
# See GH#17734
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

@@ -445,7 +448,7 @@ cpdef int64_t tz_convert_single(int64_t val, object tz1, object tz2):
if get_timezone(tz2) == 'UTC':
return utc_date
if is_tzlocal(tz2):
dt64_to_dtstruct(val, &dts)
dt64_to_dtstruct(utc_date, &dts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh? this seems like it would break something, if it doesn't that's a problem

Copy link
Member Author

@jbrockmendel jbrockmendel Nov 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug here is an inconsistency between tz_convert and tz_convert_single. The edit here makes tz_convert_single behave the same as tz_convert.

The reason why this hasn't broken anything is because in all existing usages+tests of tz_convert_single, one of tz1 or tz2 is UTC. As long as this condition holds, the two versions of the code are equivalent. This PR adds an assertion that this condition holds.

Of course, assuming the tz_convert implementation is correct, once the implementation here is changed, the assertion is no longer necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one of my comments is gone. remove the first clause else (which just sets utc_date = val, then this is no longer necessary.

@codecov
Copy link

codecov bot commented Nov 12, 2017

Codecov Report

❗ No coverage uploaded for pull request base (master@40fd6b4). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master   #18228   +/-   ##
=========================================
  Coverage          ?   91.39%           
=========================================
  Files             ?      163           
  Lines             ?    50091           
  Branches          ?        0           
=========================================
  Hits              ?    45779           
  Misses            ?     4312           
  Partials          ?        0
Flag Coverage Δ
#multiple 89.2% <ø> (?)
#single 40.36% <ø> (?)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 40fd6b4...754dd7b. Read the comment docs.

@@ -445,7 +448,7 @@ cpdef int64_t tz_convert_single(int64_t val, object tz1, object tz2):
if get_timezone(tz2) == 'UTC':
return utc_date
if is_tzlocal(tz2):
dt64_to_dtstruct(val, &dts)
dt64_to_dtstruct(utc_date, &dts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one of my comments is gone. remove the first clause else (which just sets utc_date = val, then this is no longer necessary.

@@ -445,7 +448,7 @@ cpdef int64_t tz_convert_single(int64_t val, object tz1, object tz2):
if get_timezone(tz2) == 'UTC':
return utc_date
if is_tzlocal(tz2):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

elif

@@ -445,7 +448,7 @@ cpdef int64_t tz_convert_single(int64_t val, object tz1, object tz2):
if get_timezone(tz2) == 'UTC':
return utc_date
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return val (see my comment below)

@jbrockmendel
Copy link
Member Author

remove the first clause else (which just sets utc_date = val, then this is no longer necessary.

I think there may be some confusion about the underlying issue. In the case where utc_date = val, you're right that the change in line 448/451 is irrelevant. The problem is that in the case where utc_date != val, the change here may matter.

The new assertion here is a band-aid that ensures we never get to cases where the change here matters. Ideally we shouldn't need that assertion.

The status-quo use of val in line 448 is an outlier that does not fit with either a) the rest of this function or b) the implementation in tz_convert. The latter is the bug in question. Changing val --> utc_date in line 451 is very much a good thing.

@jreback jreback added this to the 0.22.0 milestone Nov 12, 2017
@jreback
Copy link
Contributor

jreback commented Nov 12, 2017

ok ping on green.

note you may wish to switch / use rebasing rather than merging as its much cleaner.

@jbrockmendel
Copy link
Member Author

ok ping on green.

The PR in its current state is a win because it avoids incorrect behavior, but if we can confirm that the usage in tz_convert (which is here the new usage in tz_convert_single) is correct, then we can get rid of the assertion.

On the flip side, if we cant confirm that the usage in tz_convert is correct, then we have an altogether different problem.

note you may wish to switch / use rebasing rather than merging as its much cleaner.

This distinction has come up over in statsmodels. I've just been running "git pull upstream master" whenever a rebase is needed. I'll try to figure out the grown-up way of doing things.

@jreback
Copy link
Contributor

jreback commented Nov 12, 2017

This distinction has come up over in statsmodels. I've just been running "git pull upstream master" whenever a rebase is needed. I'll try to figure out the grown-up way of doing things.

when we merge it doesn't matter as things get squashed to a single commit.

its cleaner for reviewing; every little commit doesn't matter. Logical commits matter.

@jreback jreback merged commit 5e553ce into pandas-dev:master Nov 12, 2017
@jreback
Copy link
Contributor

jreback commented Nov 12, 2017

thanks @jbrockmendel

@jreback jreback mentioned this pull request Nov 22, 2017
4 tasks
No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017
@jbrockmendel jbrockmendel deleted the tslibs-fix_tzconvert branch December 8, 2017 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Timezones Timezone data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inconsistency in tz_convert_single?
3 participants