Commit 2b22544

Changed as per code review

Author: Sylvain MARIE
1 parent 5310779

1 file changed, 27 additions and 23 deletions
pandas/core/tools/datetimes.py

@@ -733,15 +733,14 @@ def to_datetime(
 
 - If ``True``, the function *always* returns a timezone-aware
 UTC-localized Timestamp, Series or DatetimeIndex. To do this,
-timezone-naive inputs are *localized* as UTC (e.g.
-``01-01-2020 01:00:00`` becomes ``01-01-2020 01:00:00Z``), while
-timezone-aware inputs are *converted* to UTC (e.g.
-``01-01-2020 01:00:00+0100`` becomes ``01-01-2020 00:00:00Z``).
+timezone-naive inputs are *localized* as UTC, while
+timezone-aware inputs are *converted* to UTC.
 
-- If ``False`` (default), the result is a "best effort automation",
-with some limitations - in particular for timezones with daylight
-savings. See :ref:`Examples <to_datetime_tz_examples>` section for
-details.
+- If ``False`` (default), inputs will not be coerced to UTC.
+Timezone-naive inputs will remain naive, while timezone-aware ones
+will keep their time offsets. Limitations exist for mixed
+offsets (typically, daylight savings), see :ref:`Examples
+<to_datetime_tz_examples>` section for details.
 
 See also: pandas general documentation about `timezone conversion and
 localization
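
The hunk above drops the inline examples that distinguished *localizing* from *converting*. As a minimal sketch of that distinction, the snippet below reproduces the two removed examples using only the standard library `datetime` module. It is an analogy for the documented behaviour, not how pandas implements `utc=True`.

```python
from datetime import datetime, timedelta, timezone

# Timezone-naive input: "localizing" attaches UTC and keeps the clock reading.
naive = datetime(2020, 1, 1, 1, 0)
localized = naive.replace(tzinfo=timezone.utc)

# Timezone-aware input (+01:00): "converting" keeps the instant in time
# and shifts the clock reading to the UTC offset.
aware = datetime(2020, 1, 1, 1, 0, tzinfo=timezone(timedelta(hours=1)))
converted = aware.astimezone(timezone.utc)

print(localized.isoformat())  # 2020-01-01T01:00:00+00:00
print(converted.isoformat())  # 2020-01-01T00:00:00+00:00
```

Localizing changes which instant the value denotes (a naive value has none until an offset is attached), whereas converting leaves the instant untouched, so `converted == aware` holds.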
@@ -793,7 +792,7 @@ def to_datetime(
 datetime
 If parsing succeeded.
 Return type depends on input (types in parenthesis correspond to
-timezone or out-of-range timestamp handling issues):
+fallback in case of timezone issues or out-of-range timestamps):
 
 - scalar: Timestamp (or datetime.datetime)
 - array-like: DatetimeIndex (or Series with object dtype containing
@@ -865,7 +864,7 @@ def to_datetime(
 Examples
 --------
 
-**a. Handling various input formats**
+**Handling various input formats**
 
 Assembling a datetime from multiple columns of a DataFrame. The keys can be
 common abbreviations like ['year', 'month', 'day', 'minute', 'second',
@@ -914,7 +913,7 @@ def to_datetime(
 DatetimeIndex(['1960-01-02', '1960-01-03', '1960-01-04'],
 dtype='datetime64[ns]', freq=None)
 
-**b. Non-convertible date/times**
+**Non-convertible date/times**
 
 If a date does not meet the `timestamp limitations
 <https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html
@@ -931,9 +930,9 @@ def to_datetime(
 
 .. _to_datetime_tz_examples:
 
-**c. Timezones and time offsets**
+**Timezones and time offsets**
 
-The default behaviour (``utc=False``) might be confusing concerning timezones:
+The default behaviour (``utc=False``) is as follows:
 
 - Timezone-naive inputs are converted to timezone-naive ``DatetimeIndex``:
 
@@ -958,19 +957,14 @@ def to_datetime(
 dtype='object')
 
 - A mix of timezone-aware and timezone-naive inputs is converted to
-a timezone-aware ``DatetimeIndex`` but only if the timezone-naive
-elements are ``datetime.datetime``...
+a timezone-aware ``DatetimeIndex`` if the offsets of the timezone-aware
+are constant:
 
 >>> from datetime import datetime
 >>> pd.to_datetime(["2020-01-01 01:00 -01:00", datetime(2020, 1, 1, 3, 0)])
 DatetimeIndex(['2020-01-01 01:00:00-01:00', '2020-01-01 02:00:00-01:00'],
 dtype='datetime64[ns, pytz.FixedOffset(-60)]', freq=None)
 
-- ...and not if the timezone-naive elements are strings
-
->>> pd.to_datetime(["2020-01-01 01:00 -01:00", "2020-01-01 03:00"])
-Index([2020-01-01 01:00:00-01:00, 2020-01-01 03:00:00], dtype='object')
-
 - Finally, mixing timezone-aware strings and ``datetime.datetime`` always
 raises an error, even if the elements all have the same time offset.
 
@@ -984,15 +978,25 @@ def to_datetime(
 
 |
 
-Setting ``utc=True`` solves most of the above issues, as timezone-naive
-elements will be localized to UTC, while timezone-aware ones will simply be
-converted to UTC (exact same datetime, but represented differently):
+Setting ``utc=True`` solves most of the above issues:
+
+- Timezone-naive inputs are *localized* as UTC
+
+>>> pd.to_datetime(['2018-10-26 12:00', '2018-10-26 13:00'], utc=True)
+DatetimeIndex(['2018-10-26 12:00:00+00:00', '2018-10-26 13:00:00+00:00'],
+dtype='datetime64[ns, UTC]', freq=None)
+
+- Timezone-aware inputs are *converted* to UTC (the output represents the
+exact same datetime, but viewed from the UTC time offset `+00:00`).
 
 >>> pd.to_datetime(['2018-10-26 12:00 -0530', '2018-10-26 12:00 -0500'],
 ... utc=True)
 DatetimeIndex(['2018-10-26 17:30:00+00:00', '2018-10-26 17:00:00+00:00'],
 dtype='datetime64[ns, UTC]', freq=None)
 
+- Inputs can contain both naive and aware, string or datetime, the above
+rules still apply
+
 >>> pd.to_datetime(['2018-10-26 12:00', '2018-10-26 12:00 -0530',
 ... datetime(2020, 1, 1, 18),
 ... datetime(2020, 1, 1, 18,
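
The last added bullet says the same two rules apply when naive and aware inputs, given as strings or `datetime` objects, are mixed. A rough standard-library sketch of that per-element rule follows. The helper `as_utc` is hypothetical and written only for illustration; it is not part of pandas and does not reflect its internals.

```python
from datetime import datetime, timezone


def as_utc(value):
    """Hypothetical helper: localize naive datetimes as UTC, convert aware ones."""
    if value.tzinfo is None or value.utcoffset() is None:
        # Naive: attach UTC, clock reading unchanged.
        return value.replace(tzinfo=timezone.utc)
    # Aware: same instant, expressed at offset +00:00.
    return value.astimezone(timezone.utc)


mixed = [
    datetime.fromisoformat("2018-10-26 12:00"),        # naive, from a string
    datetime.fromisoformat("2018-10-26 12:00-05:30"),  # aware, from a string
    datetime(2020, 1, 1, 18),                          # naive datetime object
]
result = [as_utc(value) for value in mixed]

for value in result:
    print(value.isoformat())
```

Every element ends up with the same `+00:00` offset, which is why a single timezone-aware container can hold the whole result.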
