-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG/TST: Add more timezone fixtures and use is_utc more consistently #23807
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @mroeschke! Thanks for submitting the PR.
|
@@ -1248,6 +1248,8 @@ Timezones | |||
- Bug when indexing a :class:`Series` with a DST transition (:issue:`21846`) | |||
- Bug in :meth:`DataFrame.resample` and :meth:`Series.resample` where an ``AmbiguousTimeError`` or ``NonExistentTimeError`` would raise if a timezone aware timeseries ended on a DST transition (:issue:`19375`, :issue:`10117`) | |||
- Bug in :meth:`DataFrame.drop` and :meth:`Series.drop` when specifying a tz-aware Timestamp key to drop from a :class:`DatetimeIndex` with a DST transition (:issue:`21761`) | |||
- Bug in :class:`DatetimeIndex` constructor where :class:`NaT` and ``dateutil.tz.tzlocal`` would raise an ``OutOfBoundsDatetime`` error (:issue:`23807`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: I'm guessing this happens for users for whom tzlocal is UTC-N but not those for whom tzlocal is UTC+N (@jorisvandenbossche can you check this when convenient?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was actually because NaT
wasn't handled in tz_localize_to_utc
so NaT
was converted to its i8 value which was less than Timestamp.min
pandas/_libs/tslibs/conversion.pyx
Outdated
@@ -664,7 +670,8 @@ cdef inline int64_t[:] _tz_convert_dst(int64_t[:] values, tzinfo tz, | |||
|
|||
|
|||
cdef inline int64_t _tz_convert_tzlocal_utc(int64_t val, tzinfo tz, | |||
bint to_utc=True): | |||
bint to_utc=True, | |||
bint from_utc=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't from_utc
just equal to not to_utc
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm it appears that way in general. I can try to remove the new keyword arg for not to_utc
@@ -354,7 +354,8 @@ def unique_nulls_fixture(request): | |||
|
|||
|
|||
TIMEZONES = [None, 'UTC', 'US/Eastern', 'Asia/Tokyo', 'dateutil/US/Pacific', | |||
'dateutil/Asia/Singapore'] | |||
'dateutil/Asia/Singapore', tzutc(), tzlocal(), FixedOffset(300), | |||
FixedOffset(0), FixedOffset(-300)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Brainstorm:
- non-round numbers for
FixedOffset
- we have a small number of tests with tzinfos from
psycopg2.tz
andmatplotlib
(not sure where in the matplotlib namespace) - random subset of
pytz.all_timezones
(592 of them) - equivalent from
dateutil.tz
(not sure off the top of my head how to access that, but if we figure it out I'll make a PR there)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how much additional time do the tests take with these added fixtures? how many more tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Sure we can add one non-round FixedOffset
- matplotlib UTC is just
datetime.timezones.utc
: https://github.com/matplotlib/matplotlib/blob/9e8be5b0dc0a03f569d7a248f67ef136520adbbb/lib/matplotlib/dates.py#L172 - IIRC we just convert
psycopg2.tz
timezones (from sql functions) directly to UTC, and we don't directly advertise accepting them in other functions. But I guess we do have code that supposedly handles this:pandas/pandas/_libs/tslibs/timezones.pyx
Line 218 in 3702de2
# psycopg2.tz.FixedOffsetTimezone - As for more pytz and dateutil timezones, I'd be most interested in testing non-conventional timezones, e.g. odd time shifting, but I feel its not necessary to test all of them (baring testing all of them doesn't take that much additional time)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add these in a follow up and further discuss.
@@ -8,6 +8,7 @@ from numpy cimport int64_t, int32_t, ndarray | |||
cnp.import_array() | |||
|
|||
import pytz | |||
from dateutil.tz import tzutc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you switching this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No switching, this is just used here for timezone conversion between tzlocal and tzutc (making sure they are both dateutil timezones so they play nice) https://github.com/pandas-dev/pandas/pull/23807/files/de3c83bb946a7791d29a18db5f004228b24c9b02#diff-66d289312fd0b02e7721bf45fb281797R705
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
@@ -354,7 +354,8 @@ def unique_nulls_fixture(request): | |||
|
|||
|
|||
TIMEZONES = [None, 'UTC', 'US/Eastern', 'Asia/Tokyo', 'dateutil/US/Pacific', | |||
'dateutil/Asia/Singapore'] | |||
'dateutil/Asia/Singapore', tzutc(), tzlocal(), FixedOffset(300), | |||
FixedOffset(0), FixedOffset(-300)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how much additional time do the tests take with these added fixtures? how many more tests
@@ -409,8 +409,8 @@ def test_to_datetime_tz_pytz(self, cache): | |||
hour=3, minute=0))], | |||
dtype=object) | |||
result = pd.to_datetime(arr, utc=True, cache=cache) | |||
expected = DatetimeIndex(['2000-01-01 08:00:00+00:00', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this surprising that you change this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, yeah this was the wrong fix. Addressed
Codecov Report
@@ Coverage Diff @@
## master #23807 +/- ##
==========================================
+ Coverage 92.28% 92.28% +<.01%
==========================================
Files 161 161
Lines 51500 51501 +1
==========================================
+ Hits 47528 47529 +1
Misses 3972 3972
Continue to review full report at Codecov.
|
looks good. does this add a lot more testing? e.g. now that we are having a bigger tz matrix (and use this fixture a fair amount) |
Looking at two, 3.6 travis build, we will go from 36028 tests to 37606 tests (783.49 seconds to 931.50 seconds). |
thanks! of course if this does close any existing issues, pls update them. |
git diff upstream/master -u -- "*.py" | flake8 --diff
Was discussed a while back with @jreback to add more timezones to the
tz_aware_fixture
, added some less tested ones like:dateutil.tz.tzlocal
pytz.FixedOffset
This uncovered some bugs with
tzlocal
especially near DST transitions.Additionally, using
is_utc
more consistently for UTC checking across the codebase and no longer passing the string'UTC'
around. Instead the UTC timezone object (i.e.pytz.UTC
ordateutil.tz.tzutc()
) is passed along. However, for single value conversion to/from UTC,pytz.UTC
is used by default.