BUG: DatetimeIndex constructed with Timestamps on DST border are converted to the same Timestamp #20854

Closed
mroeschke opened this issue Apr 28, 2018 · 2 comments · Fixed by #21407
Labels
Bug Timezones Timezone data dtype
Milestone

Comments

@mroeschke

mroeschke commented Apr 28, 2018

In [16]: pd.DatetimeIndex([pd.Timestamp('2016-10-30 03:00:00+0300', tz='Europe/Helsinki'), pd.Timestamp('2016-10-30 03:00:00+0200', tz='Europe/Helsinki')])
Out[16]: DatetimeIndex(['2016-10-30 03:00:00+02:00', '2016-10-30 03:00:00+02:00'], dtype='datetime64[ns, Europe/Helsinki]', freq=None)

For Helsinki, DST "falls back" on 2016-10-30, so 03:00:00 occurs twice with two different UTC offsets (+0300 and +0200). When constructing the DatetimeIndex above, the UTC offset of the first element is incorrectly converted to the offset in effect after the DST transition.

Expected:

Out[16]: DatetimeIndex(['2016-10-30 03:00:00+03:00', '2016-10-30 03:00:00+02:00'], dtype='datetime64[ns, Europe/Helsinki]', freq=None)
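
To make the ambiguity concrete, here is a minimal standard-library sketch that does not use pandas. The names `EEST` and `EET` are fixed-offset stand-ins introduced only for illustration; they are not real tz database zones and carry no DST rules.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for Helsinki summer time (+03:00) and
# winter time (+02:00). These are illustrative, not tz database zones.
EEST = timezone(timedelta(hours=3))
EET = timezone(timedelta(hours=2))

first = datetime(2016, 10, 30, 3, 0, tzinfo=EEST)   # 03:00 before the clocks go back
second = datetime(2016, 10, 30, 3, 0, tzinfo=EET)   # 03:00 after the clocks go back

# The wall-clock readings are identical...
same_wall_clock = first.replace(tzinfo=None) == second.replace(tzinfo=None)

# ...but the instants are one hour apart.
gap = second - first

first_utc = first.astimezone(timezone.utc)
second_utc = second.astimezone(timezone.utc)
print(same_wall_clock, gap)
print(first_utc.isoformat(), second_utc.isoformat())
```

The two elements of the index are therefore distinct points in time, and a correct repr has to show different offsets for them.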
@mroeschke mroeschke changed the title BUG: DatetimeIndex constructed with Timestamps on DST border is converted to the same Timestamp BUG: DatetimeIndex constructed with Timestamps on DST border are converted to the same Timestamp Apr 28, 2018
@mroeschke

So it appears that this is actually a repr issue, as the resulting DatetimeIndex has the same integer representation as the Timestamps.
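
One way to check this without relying on the printed output is to compare the stored instants directly. The following is a sketch, assuming a pandas installation with time zone data for Europe/Helsinki. It builds the two timestamps from unambiguous UTC instants, which is a choice made here to sidestep the constructor path discussed in this issue, and the variable names are illustrative.

```python
from datetime import timedelta

import pandas as pd

# Build the two ambiguous wall-clock times from unambiguous UTC instants.
before = pd.Timestamp("2016-10-30 00:00:00", tz="UTC").tz_convert("Europe/Helsinki")
after = pd.Timestamp("2016-10-30 01:00:00", tz="UTC").tz_convert("Europe/Helsinki")

idx = pd.DatetimeIndex([before, after])

# Compare stored instants rather than the strings shown in the repr.
same_instants = bool(idx[0] == before) and bool(idx[1] == after)
gap = idx[1] - idx[0]

# Each scalar pulled back out of the index reports its own UTC offset.
offsets = [ts.utcoffset() for ts in idx]
print(same_instants, gap, offsets)
```

If `same_instants` holds and `gap` is one hour, the index data is intact and any identical-looking entries come from how the values are rendered.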

When constructing the repr here:

if tz is not None or not isinstance(x, Timestamp):
    x = Timestamp(x, tz=tz)

Essentially this happens:

In [3]: pd.Timestamp(pd.Timestamp('2016-10-30 03:00:00+0300', tz='Europe/Helsinki'), tz='Europe/Helsinki')
Out[3]: Timestamp('2016-10-30 03:00:00+0200', tz='Europe/Helsinki')

This is a separate but related bug.

@mroeschke

I think this example stems from the same bug:

In [11]: d = pd.date_range(start='2010-11-7', periods=24, freq='H', tz='US/Pacific')

In [12]: d
Out[12]:
DatetimeIndex(['2010-11-07 00:00:00-07:00', '2010-11-07 01:00:00-08:00', <-- this should be -07:00
               '2010-11-07 01:00:00-08:00', '2010-11-07 02:00:00-08:00',
               '2010-11-07 03:00:00-08:00', '2010-11-07 04:00:00-08:00',
               '2010-11-07 05:00:00-08:00', '2010-11-07 06:00:00-08:00',
               '2010-11-07 07:00:00-08:00', '2010-11-07 08:00:00-08:00',
               '2010-11-07 09:00:00-08:00', '2010-11-07 10:00:00-08:00',
               '2010-11-07 11:00:00-08:00', '2010-11-07 12:00:00-08:00',
               '2010-11-07 13:00:00-08:00', '2010-11-07 14:00:00-08:00',
               '2010-11-07 15:00:00-08:00', '2010-11-07 16:00:00-08:00',
               '2010-11-07 17:00:00-08:00', '2010-11-07 18:00:00-08:00',
               '2010-11-07 19:00:00-08:00', '2010-11-07 20:00:00-08:00',
               '2010-11-07 21:00:00-08:00', '2010-11-07 22:00:00-08:00'],
              dtype='datetime64[ns, US/Pacific]', freq='H')

In [13]: d.time
Out[13]:
array([datetime.time(0, 0, tzinfo=<DstTzInfo 'US/Pacific' PDT-1 day, 17:00:00 DST>),
       datetime.time(1, 0, tzinfo=<DstTzInfo 'US/Pacific' PDT-1 day, 17:00:00 DST>),
       datetime.time(1, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(2, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(3, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(4, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(5, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(6, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(7, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(8, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(9, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(10, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(11, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(12, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(13, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(14, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(15, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(16, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(17, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(18, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(19, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(20, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(21, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>),
       datetime.time(22, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>)],
      dtype=object)
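
The `d.time` output above suggests the first 01:00 entry is still in PDT. A minimal sketch for confirming this per element, assuming pandas with time zone data for US/Pacific, is to ask each scalar for its own UTC offset instead of reading it from the index repr. The frequency is passed as a `Timedelta` here only to avoid depending on a particular spelling of the hourly alias.

```python
from datetime import timedelta

import pandas as pd

# Same range as above: 24 hourly steps starting at local midnight.
d = pd.date_range(
    start="2010-11-07",
    periods=24,
    freq=pd.Timedelta(hours=1),
    tz="US/Pacific",
)

# Query the offset of each element individually.
offsets = [ts.utcoffset() for ts in d]

# Elements 1 and 2 share the wall-clock hour 01:00 but sit on
# opposite sides of the DST transition.
print(d[1].hour, offsets[1])
print(d[2].hour, offsets[2])
```

The first 01:00 should report an offset of minus seven hours and the second minus eight, which is what the index repr ought to show as well.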
