BUG: Time zone information lost for some dateutil time zones #9663

Closed
miketkelly opened this issue Mar 16, 2015 · 3 comments
Labels
Compat pandas objects compatability with Numpy or Python functions Timezones Timezone data dtype
@miketkelly

The dateutil package allows you to create time zone (tzfile) objects two ways, either by using dateutil.tz.gettz to read time zone data on the file system (/usr/share/zoneinfo), or by using dateutil.zoneinfo.gettz to read time zone data from a tar file distributed in the dateutil package.

The tslib.maybe_get_tz function doesn't handle the dateutil.tz.gettz variant.

from datetime import datetime

import pandas as pd
import pandas.tslib as tslib

import dateutil.tz
import dateutil.zoneinfo

tz1 = dateutil.tz.gettz('America/New_York')
tz2 = dateutil.zoneinfo.gettz('America/New_York')

d1 = datetime(2015, 1, 1, tzinfo=tz1)
d2 = datetime(2015, 1, 1, tzinfo=tz2)

maybe_get_tz returns None for tz1, but works correctly for tz2:

>>> tslib.maybe_get_tz('dateutil/' + tz1._filename)
>>> tslib.maybe_get_tz('dateutil/' + tz2._filename)
tzfile('America/New_York')

And so DatetimeIndexes are missing time zone information for those cases.

>>> pd.to_datetime([d1])
<class 'pandas.tseries.index.DatetimeIndex'>
[2015-01-01 05:00:00]
Length: 1, Freq: None, Timezone: None
>>> pd.to_datetime([d2])
<class 'pandas.tseries.index.DatetimeIndex'>
[2015-01-01 00:00:00-05:00]
Length: 1, Freq: None, Timezone: tzfile('America/New_York')

I think if maybe_get_tz were to first try dateutil.zoneinfo.gettz, and then fall back on dateutil.tz.gettz, the problem would be solved.
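
The fallback order proposed above can be sketched roughly as follows. This is a minimal illustration, not the pandas implementation: `first_resolved`, `bundled_gettz` and `system_gettz` are hypothetical names, and the two lookups are plain-dict stand-ins so the example runs without dateutil installed.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def first_resolved(
    name: str, getters: Iterable[Callable[[str], Optional[T]]]
) -> Optional[T]:
    """Try each lookup in order and return the first non-None result."""
    for getter in getters:
        tz = getter(name)
        if tz is not None:
            return tz
    return None


# Hypothetical stand-ins for the two dateutil lookups (made-up data).
_bundled = {"Europe/London": "bundled:Europe/London"}
_system = {
    "Europe/London": "system:Europe/London",
    "America/New_York": "system:America/New_York",
}


def bundled_gettz(name):
    # Plays the role of dateutil.zoneinfo.gettz (bundled tar file).
    return _bundled.get(name)


def system_gettz(name):
    # Plays the role of dateutil.tz.gettz (files on the file system).
    return _system.get(name)


# The bundled source is tried first; the system source is the fallback.
print(first_resolved("America/New_York", [bundled_gettz, system_gettz]))
print(first_resolved("Europe/London", [bundled_gettz, system_gettz]))
```

With dateutil available, the list passed in would presumably be `[dateutil.zoneinfo.gettz, dateutil.tz.gettz]`; reversing the list gives the opposite preference.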

This was a regression between 0.14 and 0.15.

@jreback
Contributor

jreback commented Mar 17, 2015

see #9123 for an open pull-request to 'fix' this.

and these issues:

dateutil/dateutil#8
dateutil/dateutil#11
xref #9059
xref #8639

I don't actually think this is a regression in pandas itself. Dateutil supported various ways of accessing the tz's, which differed on different platforms. So I would love to have someone work on/resolve this. #9123 is almost there.

jreback added the Timezones and Compat labels Mar 17, 2015
jreback added this to the 0.16.1 milestone Mar 17, 2015
jreback modified the milestones: 0.17.0, 0.16.1 Apr 21, 2015
jlec added a commit to jlec/pandas that referenced this issue May 15, 2015
python-dateutil provides two implementations of gettz(): tz.gettz() and
zoneinfo.gettz(). The former first tries to use system-provided timezone data,
whereas the latter always uses a bundled tarball. Upstream's recommendation
for library consumers is to use only tz.gettz() (1 & 2). Furthermore, on
systems which do not install the zoneinfo tarball (e.g. Debian, Gentoo and
Fedora) but rely on the system zoneinfo files, the direct usage of
zoneinfo.gettz() creates problems which result in test failures (3 - 6).

For compatibility in pandas code

    pandas.tslib._dateutil_gettz()

should be used.

1 dateutil/dateutil#8
2 dateutil/dateutil#11
3 pandas-dev#9059
4 pandas-dev#8639
5 pandas-dev#10121
6 pandas-dev#9663

Signed-off-by: Justin Lecher <[email protected]>
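
The idea of routing every dateutil lookup through one entry point, as the commit message recommends, might look roughly like the sketch below. It is not the actual `pandas.tslib._dateutil_gettz`; `zone_from_key`, `lookup_dateutil_zone` and `fake_gettz` are hypothetical names, and the lookup function is passed in as a parameter so the example runs without dateutil installed. The only detail taken from this thread is the `'dateutil/'` prefix used in the `maybe_get_tz` calls above.

```python
from typing import Callable, Optional

DATEUTIL_PREFIX = "dateutil/"  # prefix seen in the issue's maybe_get_tz calls


def zone_from_key(key: str) -> Optional[str]:
    """Return the zone name from a 'dateutil/<zone>' key, or None without the prefix."""
    if key.startswith(DATEUTIL_PREFIX):
        return key[len(DATEUTIL_PREFIX):]
    return None


def lookup_dateutil_zone(key: str, gettz: Callable[[str], object]) -> object:
    """Resolve a prefixed key through a single, swappable gettz function."""
    zone = zone_from_key(key)
    if zone is None:
        return None
    return gettz(zone)


calls = []


def fake_gettz(name):
    # Hypothetical stand-in for the one lookup function the code base agrees on.
    calls.append(name)
    return "tz:" + name


result = lookup_dateutil_zone("dateutil/America/New_York", fake_gettz)
print(result)
print(lookup_dateutil_zone("US/Eastern", fake_gettz))  # no prefix, so None
```

Because callers only ever see `lookup_dateutil_zone`, switching the underlying source (for example to `dateutil.tz.gettz`) would be a one-line change in this sketch.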
@jreback
Contributor

jreback commented Jul 17, 2015

@miketkelly

we merged #9123

so is this relevant any longer?

@jreback
Contributor

jreback commented Aug 20, 2015

@miketkelly pls reopen if this is still an issue
