@@ -733,15 +733,14 @@ def to_datetime(
 
         - If ``True``, the function *always* returns a timezone-aware
           UTC-localized Timestamp, Series or DatetimeIndex. To do this,
-          timezone-naive inputs are *localized* as UTC (e.g.
-          ``01-01-2020 01:00:00`` becomes ``01-01-2020 01:00:00Z``), while
-          timezone-aware inputs are *converted* to UTC (e.g.
-          ``01-01-2020 01:00:00+0100`` becomes ``01-01-2020 00:00:00Z``).
+          timezone-naive inputs are *localized* as UTC, while
+          timezone-aware inputs are *converted* to UTC.
 
-        - If ``False`` (default), the result is a "best effort automation",
-          with some limitations - in particular for timezones with daylight
-          savings. See :ref:`Examples <to_datetime_tz_examples>` section for
-          details.
+        - If ``False`` (default), inputs will not be coerced to UTC.
+          Timezone-naive inputs will remain naive, while timezone-aware ones
+          will keep their time offsets. Limitations exist for mixed
+          offsets (typically, daylight savings), see :ref:`Examples
+          <to_datetime_tz_examples>` section for details.
 
         See also: pandas general documentation about `timezone conversion and
         localization
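
Note on the hunk above: the distinction between *localizing* and *converting* can be illustrated with the standard library alone. The sketch below is only an analogy using `datetime`; it reuses the two sample values from the removed lines and does not claim to show how pandas implements `utc=True`.

```python
from datetime import datetime, timedelta, timezone

# Timezone-naive input: no offset is attached to the wall-clock time.
naive = datetime(2020, 1, 1, 1, 0, 0)

# Timezone-aware input: the same wall-clock time, but at UTC+01:00.
aware = datetime(2020, 1, 1, 1, 0, 0, tzinfo=timezone(timedelta(hours=1)))

# "Localize": attach UTC without shifting the clock reading.
localized = naive.replace(tzinfo=timezone.utc)

# "Convert": keep the same instant and re-express it at offset +00:00.
converted = aware.astimezone(timezone.utc)

print(localized.isoformat())  # 2020-01-01T01:00:00+00:00
print(converted.isoformat())  # 2020-01-01T00:00:00+00:00
```

Localizing leaves the clock reading unchanged and only adds the offset, whereas converting changes the clock reading but still denotes the same instant (`converted == aware` holds).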
@@ -793,7 +792,7 @@ def to_datetime(
     datetime
         If parsing succeeded.
         Return type depends on input (types in parenthesis correspond to
-        timezone or out-of-range timestamp handling issues ):
+        fallback in case of timezone issues or out-of-range timestamps ):
 
         - scalar: Timestamp (or datetime.datetime)
         - array-like: DatetimeIndex (or Series with object dtype containing
@@ -865,7 +864,7 @@ def to_datetime(
     Examples
     --------
 
-    **a. Handling various input formats**
+    **Handling various input formats**
 
     Assembling a datetime from multiple columns of a DataFrame. The keys can be
     common abbreviations like ['year', 'month', 'day', 'minute', 'second',
@@ -914,7 +913,7 @@ def to_datetime(
     DatetimeIndex(['1960-01-02', '1960-01-03', '1960-01-04'],
                   dtype='datetime64[ns]', freq=None)
 
-    **b. Non-convertible date/times**
+    **Non-convertible date/times**
 
     If a date does not meet the `timestamp limitations
     <https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html
@@ -931,9 +930,9 @@ def to_datetime(
 
     .. _to_datetime_tz_examples:
 
-    **c. Timezones and time offsets**
+    **Timezones and time offsets**
 
-    The default behaviour (``utc=False``) might be confusing concerning timezones :
+    The default behaviour (``utc=False``) is as follows :
 
     - Timezone-naive inputs are converted to timezone-naive ``DatetimeIndex``:
 
@@ -958,19 +957,14 @@ def to_datetime(
           dtype='object')
 
     - A mix of timezone-aware and timezone-naive inputs is converted to
-      a timezone-aware ``DatetimeIndex`` but only if the timezone-naive
-      elements are ``datetime.datetime``...
+      a timezone-aware ``DatetimeIndex`` if the offsets of the timezone-aware
+      are constant:
 
     >>> from datetime import datetime
     >>> pd.to_datetime(["2020-01-01 01:00 -01:00", datetime(2020, 1, 1, 3, 0)])
     DatetimeIndex(['2020-01-01 01:00:00-01:00', '2020-01-01 02:00:00-01:00'],
                   dtype='datetime64[ns, pytz.FixedOffset(-60)]', freq=None)
 
-    - ...and not if the timezone-naive elements are strings
-
-    >>> pd.to_datetime(["2020-01-01 01:00 -01:00", "2020-01-01 03:00"])
-    Index([2020-01-01 01:00:00-01:00, 2020-01-01 03:00:00], dtype='object')
-
     - Finally, mixing timezone-aware strings and ``datetime.datetime`` always
       raises an error, even if the elements all have the same time offset.
 
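
Note on the hunk above: in the mixed example, the naive `datetime(2020, 1, 1, 3, 0)` is displayed as `02:00:00-01:00`. That value is what one gets if the naive element is read as UTC and then expressed at the `-01:00` offset of the aware element. The standard-library sketch below reproduces only that arithmetic; treating the naive value as UTC is an assumption inferred from the displayed output, not a statement about pandas internals.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset of the timezone-aware element in the example (-01:00).
offset = timezone(timedelta(hours=-1))

# The timezone-aware element: 01:00 at UTC-01:00.
aware = datetime(2020, 1, 1, 1, 0, tzinfo=offset)

# The timezone-naive element from the example.
naive = datetime(2020, 1, 1, 3, 0)

# Assumption: read the naive value as UTC, then express it at -01:00.
as_shown = naive.replace(tzinfo=timezone.utc).astimezone(offset)

print(aware.isoformat())     # 2020-01-01T01:00:00-01:00
print(as_shown.isoformat())  # 2020-01-01T02:00:00-01:00
```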
@@ -984,15 +978,25 @@ def to_datetime(
 
     |
 
-    Setting ``utc=True`` solves most of the above issues, as timezone-naive
-    elements will be localized to UTC, while timezone-aware ones will simply be
-    converted to UTC (exact same datetime, but represented differently):
+    Setting ``utc=True`` solves most of the above issues:
+
+    - Timezone-naive inputs are *localized* as UTC
+
+    >>> pd.to_datetime(['2018-10-26 12:00', '2018-10-26 13:00'], utc=True)
+    DatetimeIndex(['2018-10-26 12:00:00+00:00', '2018-10-26 13:00:00+00:00'],
+                  dtype='datetime64[ns, UTC]', freq=None)
+
+    - Timezone-aware inputs are *converted* to UTC (the output represents the
+      exact same datetime, but viewed from the UTC time offset `+00:00`).
 
     >>> pd.to_datetime(['2018-10-26 12:00 -0530', '2018-10-26 12:00 -0500'],
     ...                utc=True)
     DatetimeIndex(['2018-10-26 17:30:00+00:00', '2018-10-26 17:00:00+00:00'],
                   dtype='datetime64[ns, UTC]', freq=None)
 
+    - Inputs can contain both naive and aware, string or datetime, the above
+      rules still apply
+
     >>> pd.to_datetime(['2018-10-26 12:00', '2018-10-26 12:00 -0530',
     ...                datetime(2020, 1, 1, 18),
     ...                datetime(2020, 1, 1, 18,