Skip to content

BUG: to_datetime with infer_datetime_format dropped timezone names #33133

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 31, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.1.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -313,7 +313,7 @@ Timedelta
Timezones
^^^^^^^^^

-
- Bug in :func:`to_datetime` with ``infer_datetime_format=True`` where timezone names (e.g. ``UTC``) would not be parsed correctly (:issue:`33133`)
-


Expand Down
1 change: 1 addition & 0 deletions pandas/_libs/tslibs/parsing.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -805,6 +805,7 @@ def _guess_datetime_format(dt_str, dayfirst=False, dt_str_parse=du_parse,
(('second',), '%S', 2),
(('microsecond',), '%f', 6),
(('second', 'microsecond'), '%S.%f', 0),
(('tzinfo',), '%Z', 0),
]

if dayfirst:
Expand Down
6 changes: 3 additions & 3 deletions pandas/core/tools/datetimes.py
Original file line number Diff line number Diff line change
Expand Up @@ -606,9 +606,9 @@ def to_datetime(
would calculate the number of milliseconds to the unix epoch start.
infer_datetime_format : bool, default False
If True and no `format` is given, attempt to infer the format of the
datetime strings, and if it can be inferred, switch to a faster
method of parsing them. In some cases this can increase the parsing
speed by ~5-10x.
datetime strings based on the first non-NaN element,
and if it can be inferred, switch to a faster method of parsing them.
In some cases this can increase the parsing speed by ~5-10x.
origin : scalar, default 'unix'
Define the reference date. The numeric values would be parsed as number
of units (defined by `unit`) since this reference date.
Expand Down
12 changes: 12 additions & 0 deletions pandas/tests/tools/test_to_datetime.py
Original file line number Diff line number Diff line change
Expand Up @@ -1862,6 +1862,18 @@ def test_to_datetime_infer_datetime_format_series_start_with_nans(self, cache):
pd.to_datetime(s, infer_datetime_format=True, cache=cache),
)

@pytest.mark.parametrize(
"tz_name, offset", [("UTC", 0), ("UTC-3", 180), ("UTC+3", -180)]
)
def test_infer_datetime_format_tz_name(self, tz_name, offset):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GH ref?

# GH 33133
s = pd.Series([f"2019-02-02 08:07:13 {tz_name}"])
result = to_datetime(s, infer_datetime_format=True)
expected = pd.Series(
[pd.Timestamp("2019-02-02 08:07:13").tz_localize(pytz.FixedOffset(offset))]
)
tm.assert_series_equal(result, expected)

@pytest.mark.parametrize("cache", [True, False])
def test_to_datetime_iso8601_noleading_0s(self, cache):
# GH 11871
Expand Down