BUG: Correctly localize naive datetime strings with Series and datetimetztype (#17415) #17603
Changes from 5 commits
0f3be92
735f35b
18e914b
bb10828
e66397d
cee514f
@@ -20,7 +20,7 @@
     is_integer_dtype,
     is_datetime_or_timedelta_dtype,
     is_bool_dtype, is_scalar,
-    _string_dtypes,
+    is_string_dtype, _string_dtypes,
     pandas_dtype,
     _ensure_int8, _ensure_int16,
     _ensure_int32, _ensure_int64,

@@ -1003,12 +1003,18 @@ def maybe_cast_to_datetime(value, dtype, errors='raise'):
     if is_datetime64:
         value = to_datetime(value, errors=errors)._values
     elif is_datetime64tz:
-        # input has to be UTC at this point, so just
-        # localize
-        value = (to_datetime(value, errors=errors)
-                 .tz_localize('UTC')
-                 .tz_convert(dtype.tz)
-                 )
+        # This block can be simplified once PR #17413 is
Review comment: add to the comment what that PR is doing.
Review comment: add an xref (with this PR) to that issue as well.
Review comment: change to #13712 here (the issue, not the PR, which was closed).
+        # complete
+        is_dt_string = is_string_dtype(value)
+        value = to_datetime(value, errors=errors)
+        if is_dt_string:
Review comment: Hmm, adding this incremental logic is error prone. I think this should simply be converted higher up. Can you see if this can be simplified that way?
Review comment: This logic is pretty high up the stack already. The Series constructor calls
Review comment: Try passing utc=True to to_datetime; this logic is too specific here.
Review comment: Actually, it might be that to_datetime(...., tz=dtype.tz) is the correct idiom here (another PR implements this).
Review comment: Yeah, #17413 will make this easier. However, I think there will need to be logic to determine whether the incoming data is naive or already localized to UTC. The code comments that were already here note that (numeric) data is already UTC, while string data (what I am trying to fix here) may be naive.
Review comment: My point is that solving this here is wrong; this is a bug in to_datetime itself.
Review comment: Passing utc=True to to_datetime will make this work (then just converting to the dtype).
Review comment: Can you try passing
+            # Strings here are naive, so directly localize
+            value = value.tz_localize(dtype.tz)
+        else:
+            # Numeric values are UTC at this point,
+            # so localize and convert
+            value = (value.tz_localize('UTC')
+                     .tz_convert(dtype.tz))
     elif is_timedelta64:
         value = to_timedelta(value, errors=errors)._values
 except (AttributeError, ValueError, TypeError):

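To make the two branches in this hunk concrete, here is a minimal standalone sketch of the same decision. The helper name `localize_like_patch`, the sample values, and the `US/Eastern` zone are assumptions chosen for illustration; this is not the code path inside `maybe_cast_to_datetime`, only the same idea expressed with public pandas calls.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_string_dtype


def localize_like_patch(values, tz):
    """Hypothetical helper mirroring the patched branch; not a pandas API."""
    # Record whether the input was strings before parsing discards that fact.
    was_string = is_string_dtype(values)
    parsed = pd.to_datetime(values)
    if was_string:
        # Naive strings are read as wall-clock times in the target zone.
        return parsed.tz_localize(tz)
    # Integers are nanoseconds since the epoch, so they are already UTC.
    return parsed.tz_localize("UTC").tz_convert(tz)


tz = "US/Eastern"  # illustrative zone, UTC-05:00 on this date
strings = np.array(["2013-01-01 00:00:00"], dtype=object)
epoch_ns = np.array([pd.Timestamp("2013-01-01", tz="UTC").value])

from_strings = localize_like_patch(strings, tz)
from_ints = localize_like_patch(epoch_ns, tz)

print(from_strings[0])  # 2013-01-01 00:00:00-05:00
print(from_ints[0])     # 2012-12-31 19:00:00-05:00
```

The string input keeps its midnight wall-clock time, while the integer input is shifted from UTC into the target zone, which is the distinction the new code comments describe.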
datetime64[ns, tz]
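The review thread's suggestion of passing `utc=True` to `to_datetime` can be compared against direct localization with a short sketch. The sample string and the `US/Eastern` zone are assumptions for illustration; the snippet only shows how the two idioms treat a naive string and does not settle which one the PR should adopt.

```python
import pandas as pd

tz = "US/Eastern"  # illustrative zone, UTC-05:00 on this date
naive = ["2013-01-01 00:00:00"]

# utc=True treats the naive string as a UTC instant, then converts it.
via_utc = pd.to_datetime(naive, utc=True).tz_convert(tz)

# Direct localization treats the string as wall-clock time in the zone.
direct = pd.to_datetime(naive).tz_localize(tz)

print(via_utc[0])              # 2012-12-31 19:00:00-05:00
print(direct[0])               # 2013-01-01 00:00:00-05:00
print(direct[0] - via_utc[0])  # 0 days 05:00:00
```

For naive strings the two results differ by the zone's UTC offset, which is the concern raised in the thread about needing to know whether incoming data is naive or already UTC.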