-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Do not round DatetimeIndex nanosecond precision when iterating #19628
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #19628 +/- ##
=======================================
Coverage 91.58% 91.58%
=======================================
Files 150 150
Lines 48806 48806
=======================================
Hits 44701 44701
Misses 4105 4105
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we would need comprehensive support for datetime.timezone in order to accept this. IOW you need to add testing in LOTS of places where we test datetutil/pytz
pandas/_libs/tslibs/timezones.pyx
Outdated
@@ -30,6 +32,10 @@ cdef int64_t NPY_NAT = get_nat() | |||
# ---------------------------------------------------------------------- | |||
|
|||
cdef inline bint is_utc(object tz): | |||
if PY3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would be a perf issue, you can almost certainly access this in stead at the cython level
from pandas import (DatetimeIndex, Index, date_range, DataFrame, | ||
Timestamp, offsets) | ||
|
||
from pandas.util.testing import assert_almost_equal | ||
|
||
if PY3: | ||
from datetime import timezone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not idiomatic, see comments above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than put this here, make a fixture in pandas/conftest.py
called datetime_tz_utc to return this and skip if its not PY3
700f83c
to
1df622b
Compare
I updated the fix and found the real issue (edited in my description above). |
if box: | ||
dt = Timestamp(dt) | ||
result[i] = dt | ||
# Python datetime objects do not support nanosecond |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use tz_convert_tzlocal_to_utc here from conversion.pyx
from pandas import (DatetimeIndex, Index, date_range, DataFrame, | ||
Timestamp, offsets) | ||
|
||
from pandas.util.testing import assert_almost_equal | ||
|
||
if PY3: | ||
from datetime import timezone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than put this here, make a fixture in pandas/conftest.py
called datetime_tz_utc to return this and skip if its not PY3
@@ -258,6 +264,13 @@ def test_iteration_preserves_tz(self): | |||
assert result._repr_base == expected._repr_base | |||
assert result == expected | |||
|
|||
@pytest.mark.parametrize('tz', [None, 'UTC', "US/Central", dt_UTC, | |||
dateutil.tz.tzoffset(None, -28800)]) | |||
def test_iteration_preserves_nanoseconds(self, tz): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a similar test for pandas/tests/scalar/test_timezones.py
@@ -258,6 +264,13 @@ def test_iteration_preserves_tz(self): | |||
assert result._repr_base == expected._repr_base | |||
assert result == expected | |||
|
|||
@pytest.mark.parametrize('tz', [None, 'UTC', "US/Central", dt_UTC, | |||
dateutil.tz.tzoffset(None, -28800)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to test_timezones (in same subdir)
def test_iteration_preserves_nanoseconds(self, tz): | ||
# GH 19603 | ||
index = pd.DatetimeIndex(["2018-02-08 15:00:00.168456358"], tz=tz) | ||
assert list(index)[0] == index[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check both getitem and iteration
1df622b
to
61cc22b
Compare
Replaced fix with the use of |
Fix the issue within fix_offsets add addtional dateutil test Address_edits
61cc22b
to
14b8120
Compare
All green |
thanks @mroeschke FYI looking to move tz to be fixtures (e.g. the strings 'UTC', 'Asia/Tokyo') etc....across the test codebase, IOW to have a 'tz' fixture (prob need several slightly different ones) |
git diff upstream/master -u -- "*.py" | flake8 --diff
This issue was actually very specific to localizing aDatetimeIndex
withdatetime.timezone.utc
I realize our timezone support relies heavily on pytz and dateutil, but I am curious how much support we have fordatetime.timezone
objects.So the actual issue was that
datetime.timezone.utc
is (rightly) considered a fixed offset, and the code path was calculating the new value with Pythondatetimes
andtimedeltas
which doesn't yet support nanosecond resolution.