-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH: support zoneinfo tzinfos #46425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
5344180
7786aa9
de77647
b43b72c
63860a9
02101ea
8cdc27b
edc1170
60ca7c0
0061f2c
c0bc310
0c35501
5045e45
90c0df4
6f7b7b1
e99686a
c4bed63
0c85a2f
22c2063
f3ba2b4
8dc732a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,6 +2,7 @@ from datetime import ( | |
timedelta, | ||
timezone, | ||
) | ||
from zoneinfo import ZoneInfo | ||
|
||
from cpython.datetime cimport ( | ||
datetime, | ||
|
@@ -41,7 +42,7 @@ cdef int64_t NPY_NAT = get_nat() | |
cdef tzinfo utc_stdlib = timezone.utc | ||
cdef tzinfo utc_pytz = UTC | ||
cdef tzinfo utc_dateutil_str = dateutil_gettz("UTC") # NB: *not* the same as tzutc() | ||
|
||
cdef tzinfo utc_zoneinfo = ZoneInfo("UTC") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You probably don't want this to be called unconditionally. Practically speaking, this will always exist, but it's not guaranteed by the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK. Is the failure mode something like "UTC" not being present in /usr/share/zoneinfo/? The user-facing downside of not doing this here is things going through slightly slower code paths. Not the end of the world, but worth avoiding if it doesn't require too much gymnastics. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, or if the user is on Windows and doesn't have I imagine it's pretty easy to make the impact of this minimal. One way to do it: cdef tzinfo utc_zoneinfo = None
cdef bool is_utc_zoneinfo(tzinfo tz):
global utc_zoneinfo
if utc_zoneinfo is None:
try:
utc_zoneinfo = ZoneInfo("UTC")
except ZoneInfoNotFoundError:
return False
return tz is utc_zoneinfo Presumably this function will get inlined wherever you call it, and in the "common case" where If you are very concerned with performance I'd probably be trying to lazy-import |
||
|
||
# ---------------------------------------------------------------------- | ||
|
||
|
@@ -51,9 +52,15 @@ cpdef inline bint is_utc(tzinfo tz): | |
or tz is utc_stdlib | ||
or isinstance(tz, _dateutil_tzutc) | ||
or tz is utc_dateutil_str | ||
# NB: we are assuming the user does not clear zoneinfo cache | ||
or tz is utc_zoneinfo | ||
) | ||
|
||
|
||
cdef inline bint is_zoneinfo(tzinfo tz): | ||
return isinstance(tz, ZoneInfo) | ||
|
||
|
||
cdef inline bint is_tzlocal(tzinfo tz): | ||
return isinstance(tz, _dateutil_tzlocal) | ||
|
||
|
@@ -210,6 +217,8 @@ cdef inline bint is_fixed_offset(tzinfo tz): | |
return 1 | ||
else: | ||
return 0 | ||
elif is_zoneinfo(tz): | ||
return False | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why Also, this isn't quite right. You can tell if a This also works for This whole function can probably be simplified to: if treat_tz_as_dateutil(tz):
if len(tz._trans_idx) == 0 and len(tz._trans_list) == 0:
return 1
else:
return 0
return 0 if tz.utcoffset(None) is None else 1 Of course, even that will break when the next version of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
No good reason; I'll change it to match.
Thanks. I'll probably make a dedicated branch to both Do This Right and audit our usages of is_fixed_offset, which I don't think we're very consistent about. |
||
# This also implicitly accepts datetime.timezone objects which are | ||
# considered fixed | ||
return 1 | ||
|
@@ -264,6 +273,8 @@ cdef object get_dst_info(tzinfo tz): | |
# e.g. pytz.FixedOffset, matplotlib.dates._UTC, | ||
# psycopg2.tz.FixedOffsetTimezone | ||
num = int(get_utcoffset(tz, None).total_seconds()) * 1_000_000_000 | ||
# If we have e.g. ZoneInfo here, the get_utcoffset call will return None, | ||
# so the total_seconds() call will raise AttributeError. | ||
return (np.array([NPY_NAT + 1], dtype=np.int64), | ||
np.array([num], dtype=np.int64), | ||
"unknown") | ||
|
@@ -291,13 +302,13 @@ cdef object get_dst_info(tzinfo tz): | |
# deltas | ||
deltas = np.array([v.offset for v in ( | ||
tz._ttinfo_before,) + tz._trans_idx], dtype='i8') | ||
deltas *= 1000000000 | ||
deltas *= 1_000_000_000 | ||
typ = 'dateutil' | ||
|
||
elif is_fixed_offset(tz): | ||
trans = np.array([NPY_NAT + 1], dtype=np.int64) | ||
deltas = np.array([tz._ttinfo_std.offset], | ||
dtype='i8') * 1000000000 | ||
dtype='i8') * 1_000_000_000 | ||
typ = 'fixed' | ||
else: | ||
# 2018-07-12 this is not reached in the tests, and this case | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -43,6 +43,7 @@ from pandas._libs.tslibs.timezones cimport ( | |
is_fixed_offset, | ||
is_tzlocal, | ||
is_utc, | ||
is_zoneinfo, | ||
) | ||
|
||
|
||
|
@@ -60,8 +61,8 @@ cdef int64_t tz_localize_to_utc_single( | |
elif is_utc(tz) or tz is None: | ||
return val | ||
|
||
elif is_tzlocal(tz): | ||
return _tz_convert_tzlocal_utc(val, tz, to_utc=True) | ||
elif is_tzlocal(tz) or is_zoneinfo(tz): | ||
return _tz_localize_using_tzinfo_api(val, tz, to_utc=True) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this should be the default behavior, rather than something triggered only for zoneinfo objects. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
|
||
elif is_fixed_offset(tz): | ||
# TODO: in this case we should be able to use get_utcoffset, | ||
|
@@ -136,13 +137,13 @@ timedelta-like} | |
|
||
result = np.empty(n, dtype=np.int64) | ||
|
||
if is_tzlocal(tz): | ||
if is_tzlocal(tz) or is_zoneinfo(tz): | ||
for i in range(n): | ||
v = vals[i] | ||
if v == NPY_NAT: | ||
result[i] = NPY_NAT | ||
else: | ||
result[i] = _tz_convert_tzlocal_utc(v, tz, to_utc=True) | ||
result[i] = _tz_localize_using_tzinfo_api(v, tz, to_utc=True) | ||
return result | ||
|
||
# silence false-positive compiler warning | ||
|
@@ -402,7 +403,7 @@ cdef ndarray[int64_t] _get_dst_hours( | |
# ---------------------------------------------------------------------- | ||
# Timezone Conversion | ||
|
||
cdef int64_t tz_convert_utc_to_tzlocal( | ||
cdef int64_t tz_convert_utc_to_tz( | ||
int64_t utc_val, tzinfo tz, bint* fold=NULL | ||
) except? -1: | ||
""" | ||
|
@@ -418,7 +419,7 @@ cdef int64_t tz_convert_utc_to_tzlocal( | |
------- | ||
local_val : int64_t | ||
""" | ||
return _tz_convert_tzlocal_utc(utc_val, tz, to_utc=False, fold=fold) | ||
return _tz_localize_using_tzinfo_api(utc_val, tz, to_utc=False, fold=fold) | ||
|
||
|
||
cpdef int64_t tz_convert_from_utc_single(int64_t val, tzinfo tz): | ||
|
@@ -448,8 +449,8 @@ cpdef int64_t tz_convert_from_utc_single(int64_t val, tzinfo tz): | |
|
||
if is_utc(tz): | ||
return val | ||
elif is_tzlocal(tz): | ||
return _tz_convert_tzlocal_utc(val, tz, to_utc=False) | ||
elif is_tzlocal(tz) or is_zoneinfo(tz): | ||
return _tz_localize_using_tzinfo_api(val, tz, to_utc=False) | ||
elif is_fixed_offset(tz): | ||
_, deltas, _ = get_dst_info(tz) | ||
delta = deltas[0] | ||
|
@@ -515,7 +516,7 @@ cdef const int64_t[:] _tz_convert_from_utc(const int64_t[:] vals, tzinfo tz): | |
|
||
if is_utc(tz) or tz is None: | ||
use_utc = True | ||
elif is_tzlocal(tz): | ||
elif is_tzlocal(tz) or is_zoneinfo(tz): | ||
use_tzlocal = True | ||
else: | ||
trans, deltas, typ = get_dst_info(tz) | ||
|
@@ -539,7 +540,7 @@ cdef const int64_t[:] _tz_convert_from_utc(const int64_t[:] vals, tzinfo tz): | |
# The pattern used in vectorized.pyx checks for use_utc here, | ||
# but we handle that case above. | ||
if use_tzlocal: | ||
converted[i] = _tz_convert_tzlocal_utc(val, tz, to_utc=False) | ||
converted[i] = _tz_localize_using_tzinfo_api(val, tz, to_utc=False) | ||
elif use_fixed: | ||
converted[i] = val + delta | ||
else: | ||
|
@@ -551,11 +552,12 @@ cdef const int64_t[:] _tz_convert_from_utc(const int64_t[:] vals, tzinfo tz): | |
|
||
# OSError may be thrown by tzlocal on windows at or close to 1970-01-01 | ||
# see https://github.com/pandas-dev/pandas/pull/37591#issuecomment-720628241 | ||
cdef int64_t _tz_convert_tzlocal_utc(int64_t val, tzinfo tz, bint to_utc=True, | ||
bint* fold=NULL) except? -1: | ||
cdef int64_t _tz_localize_using_tzinfo_api( | ||
int64_t val, tzinfo tz, bint to_utc=True, bint* fold=NULL | ||
) except? -1: | ||
""" | ||
Convert the i8 representation of a datetime from a tzlocal timezone to | ||
UTC, or vice-versa. | ||
Convert the i8 representation of a datetime from a general-case timezone to | ||
UTC, or vice-versa using the datetime/tzinfo API. | ||
|
||
Private, not intended for use outside of tslibs.conversion | ||
|
||
|
@@ -564,10 +566,10 @@ cdef int64_t _tz_convert_tzlocal_utc(int64_t val, tzinfo tz, bint to_utc=True, | |
val : int64_t | ||
tz : tzinfo | ||
to_utc : bint | ||
True if converting tzlocal _to_ UTC, False if going the other direction | ||
True if converting _to_ UTC, False if going the other direction. | ||
fold : bint*, default NULL | ||
pointer to fold: whether datetime ends up in a fold or not | ||
after adjustment | ||
after adjustment. | ||
Only passed with to_utc=False. | ||
|
||
Returns | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an unconditional import — do you not require support for Python < 3.8?
If so, you should presumably have some more complicated logic here and in the
is_zoneinfo
logic.Something like this works well for "is this an instance of X" without forcing the user to import the module that contains X:
https://github.com/pganssle/pytz-deprecation-shim/blob/bf0174adfce698e280c623d17fe8e82961de0c48/src/pytz_deprecation_shim/helpers.py#L11-L30
https://github.com/pganssle/pytz-deprecation-shim/blob/bf0174adfce698e280c623d17fe8e82961de0c48/src/pytz_deprecation_shim/_common.py#L6-L13
Since obviously some object isn't going to be an instance of
zoneinfo.ZoneInfo
if the interpreter has never imported thezoneinfo
module.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think our minimum version is 3.8, but I'm on 3.9 locally and forgot when writing this branch. Will change to try/except.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no leave this
we support >= 3.8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zoneinfo isnt stdlib until 3.9, so importing unconditionally would require making backports.zoneinfo a hard dependency (and having that in the CI dep file is causing problems AFAICT)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with
backports.zoneinfo
it needs to be conditional since they're in different namespaces, plus I really like the approach I took in thepytz
deprecation shims for checking if a time zone is apytz
zone, where my checks never actually importpytz
unless it's already been imported (since obviously if it's never been imported then I know the object isn't apytz
type). That may be overkill, but it's also not that hard to implement, so ⚖️ 🤷There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll need to take a closer look at the pytz deprecation shim. We're getting rid of a lot of pytz usage in 2.0 (#34916) and these days I keep finding ways we could simplify the code even more if we dropped pytz support altogether.