Commit 62f4409

bartaelterman authored and Pingviinituutti committed
DOC: Clarify documentation of 'ambiguous' parameter (pandas-dev#23408)
* Add documentation line with example for the ambiguous parameter of tz_localize
* Updating 'ambiguous'-param doc + update it on Timestamp, DatetimeIndex and NaT

This is following the discussion at pandas-dev#23408 (comment)
1 parent 2eca0dc commit 62f4409

6 files changed

+128
-1
lines changed

pandas/_libs/tslibs/conversion.pyx

+14 -1
@@ -844,7 +844,20 @@ def tz_localize_to_utc(ndarray[int64_t] vals, object tz, object ambiguous=None,
  vals : ndarray[int64_t]
  tz : tzinfo or None
  ambiguous : str, bool, or arraylike
- If arraylike, must have the same length as vals
+ When clocks moved backward due to DST, ambiguous times may arise.
+ For example in Central European Time (UTC+01), when going from 03:00
+ DST to 02:00 non-DST, 02:30:00 local time occurs both at 00:30:00 UTC
+ and at 01:30:00 UTC. In such a situation, the `ambiguous` parameter
+ dictates how ambiguous times should be handled.
+
+ - 'infer' will attempt to infer fall dst-transition hours based on
+ order
+ - bool-ndarray where True signifies a DST time, False signifies a
+ non-DST time (note that this flag is only applicable for ambiguous
+ times, but the array must have the same length as vals)
+ - bool if True, treat all vals as DST. If False, treat them as non-DST
+ - 'NaT' will return NaT where there are ambiguous times
+
  nonexistent : str
  If arraylike, must have the same length as vals
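
The ambiguity described in this docstring can be reproduced without pandas. The following is a minimal sketch, assuming Python 3.9+ with the standard-library `zoneinfo` module and a time zone database that provides "CET"; the variable names are illustrative and not part of the commit. The `fold` attribute of `datetime` selects which of the two occurrences of 02:30 is meant.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

cet = ZoneInfo("CET")

# 2018-10-28 02:30 happens twice in CET: once before the clocks go back
# (fold=0, still DST, UTC+02) and once after (fold=1, non-DST, UTC+01).
first = datetime(2018, 10, 28, 2, 30, tzinfo=cet, fold=0)
second = datetime(2018, 10, 28, 2, 30, tzinfo=cet, fold=1)

first_utc = first.astimezone(timezone.utc)
second_utc = second.astimezone(timezone.utc)

print(first_utc.isoformat())   # 2018-10-28T00:30:00+00:00
print(second_utc.isoformat())  # 2018-10-28T01:30:00+00:00
```

The two UTC instants are one hour apart, which is why a naive 02:30:00 cannot be localized without extra information such as the `ambiguous` argument.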

pandas/_libs/tslibs/nattype.pyx

+7
@@ -593,6 +593,13 @@ class NaTType(_NaT):
  None will remove timezone holding local time.

  ambiguous : bool, 'NaT', default 'raise'
+ When clocks moved backward due to DST, ambiguous times may arise.
+ For example in Central European Time (UTC+01), when going from
+ 03:00 DST to 02:00 non-DST, 02:30:00 local time occurs both at
+ 00:30:00 UTC and at 01:30:00 UTC. In such a situation, the
+ `ambiguous` parameter dictates how ambiguous times should be
+ handled.
+
  - bool contains flags to determine if time is dst or not (note
  that this flag is only applicable for ambiguous fall dst dates)
  - 'NaT' will return NaT for an ambiguous time

pandas/_libs/tslibs/timestamps.pyx

+7
@@ -1026,6 +1026,13 @@ class Timestamp(_Timestamp):
  None will remove timezone holding local time.

  ambiguous : bool, 'NaT', default 'raise'
+ When clocks moved backward due to DST, ambiguous times may arise.
+ For example in Central European Time (UTC+01), when going from
+ 03:00 DST to 02:00 non-DST, 02:30:00 local time occurs both at
+ 00:30:00 UTC and at 01:30:00 UTC. In such a situation, the
+ `ambiguous` parameter dictates how ambiguous times should be
+ handled.
+
  - bool contains flags to determine if time is dst or not (note
  that this flag is only applicable for ambiguous fall dst dates)
  - 'NaT' will return NaT for an ambiguous time
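
The scalar options listed in this docstring can be exercised on a single `Timestamp`. This is a minimal sketch, assuming pandas is installed and can resolve the "CET" zone; the variable names are illustrative. Because the exception type raised by the default `'raise'` behaviour may differ between pandas versions, the sketch catches `Exception` rather than a specific class.

```python
import pandas as pd

ts = pd.Timestamp("2018-10-28 02:30:00")  # naive, ambiguous in CET

as_dst = ts.tz_localize("CET", ambiguous=True)    # first occurrence, UTC+02
as_std = ts.tz_localize("CET", ambiguous=False)   # second occurrence, UTC+01
as_nat = ts.tz_localize("CET", ambiguous="NaT")   # give up and return NaT

# The default, ambiguous='raise', refuses to guess.
try:
    ts.tz_localize("CET")
    raised = False
except Exception:
    raised = True

print(as_dst, as_std, as_nat, raised)
```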

pandas/core/arrays/datetimes.py

+40
@@ -614,6 +614,12 @@ def tz_localize(self, tz, ambiguous='raise', nonexistent='raise',
  Time zone to convert timestamps to. Passing ``None`` will
  remove the time zone information preserving local time.
  ambiguous : 'infer', 'NaT', bool array, default 'raise'
+ When clocks moved backward due to DST, ambiguous times may arise.
+ For example in Central European Time (UTC+01), when going from
+ 03:00 DST to 02:00 non-DST, 02:30:00 local time occurs both at
+ 00:30:00 UTC and at 01:30:00 UTC. In such a situation, the
+ `ambiguous` parameter dictates how ambiguous times should be
+ handled.

  - 'infer' will attempt to infer fall dst-transition hours based on
  order
@@ -685,6 +691,40 @@ def tz_localize(self, tz, ambiguous='raise', nonexistent='raise',
  DatetimeIndex(['2018-03-01 09:00:00', '2018-03-02 09:00:00',
  '2018-03-03 09:00:00'],
  dtype='datetime64[ns]', freq='D')
+
+ Be careful with DST changes. When there is sequential data, pandas can
+ infer the DST time:
+ >>> s = pd.to_datetime(pd.Series([
+ ... '2018-10-28 01:30:00',
+ ... '2018-10-28 02:00:00',
+ ... '2018-10-28 02:30:00',
+ ... '2018-10-28 02:00:00',
+ ... '2018-10-28 02:30:00',
+ ... '2018-10-28 03:00:00',
+ ... '2018-10-28 03:30:00']))
+ >>> s.dt.tz_localize('CET', ambiguous='infer')
+ 0 2018-10-28 01:30:00+02:00
+ 1 2018-10-28 02:00:00+02:00
+ 2 2018-10-28 02:30:00+02:00
+ 3 2018-10-28 02:00:00+01:00
+ 4 2018-10-28 02:30:00+01:00
+ 5 2018-10-28 03:00:00+01:00
+ 6 2018-10-28 03:30:00+01:00
+ dtype: datetime64[ns, CET]
+
+ In some cases, inferring the DST is impossible. In such cases, you can
+ pass an ndarray to the ambiguous parameter to set the DST explicitly:
+
+ >>> s = pd.to_datetime(pd.Series([
+ ... '2018-10-28 01:20:00',
+ ... '2018-10-28 02:36:00',
+ ... '2018-10-28 03:46:00']))
+ >>> s.dt.tz_localize('CET', ambiguous=np.array([True, True, False]))
+ 0 2018-10-28 01:20:00+02:00
+ 1 2018-10-28 02:36:00+02:00
+ 2 2018-10-28 03:46:00+01:00
+ dtype: datetime64[ns, CET]
+
  """
  if errors is not None:
  warnings.warn("The errors argument is deprecated and will be "
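
The docstring notes that a boolean flag only matters for ambiguous times. A minimal sketch of that behaviour, assuming pandas and NumPy are installed; it reuses the three timestamps from the docstring example, and the variable names are illustrative. Only 02:36 falls in the repeated hour, so it is the only entry whose offset depends on the flag, and the only one replaced when `ambiguous='NaT'`.

```python
import numpy as np
import pandas as pd

idx = pd.DatetimeIndex([
    "2018-10-28 01:20:00",   # before the repeated hour, always UTC+02
    "2018-10-28 02:36:00",   # inside the repeated hour, ambiguous
    "2018-10-28 03:46:00",   # after the repeated hour, always UTC+01
])

all_dst = idx.tz_localize("CET", ambiguous=np.array([True, True, True]))
all_std = idx.tz_localize("CET", ambiguous=np.array([False, False, False]))
with_nat = idx.tz_localize("CET", ambiguous="NaT")

# Offsets in hours for each entry under both flag settings.
dst_offsets = [t.utcoffset().total_seconds() / 3600 for t in all_dst]
std_offsets = [t.utcoffset().total_seconds() / 3600 for t in all_std]
print(dst_offsets, std_offsets, with_nat.isna().tolist())
```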

pandas/core/generic.py

+53
@@ -8718,6 +8718,13 @@ def tz_localize(self, tz, axis=0, level=None, copy=True,
  copy : boolean, default True
  Also make a copy of the underlying data
  ambiguous : 'infer', bool-ndarray, 'NaT', default 'raise'
+ When clocks moved backward due to DST, ambiguous times may arise.
+ For example in Central European Time (UTC+01), when going from
+ 03:00 DST to 02:00 non-DST, 02:30:00 local time occurs both at
+ 00:30:00 UTC and at 01:30:00 UTC. In such a situation, the
+ `ambiguous` parameter dictates how ambiguous times should be
+ handled.
+
  - 'infer' will attempt to infer fall dst-transition hours based on
  order
  - bool-ndarray where True signifies a DST time, False designates
@@ -8745,6 +8752,52 @@ def tz_localize(self, tz, axis=0, level=None, copy=True,
  ------
  TypeError
  If the TimeSeries is tz-aware and tz is not None.
+
+ Examples
+ --------
+
+ Localize local times:
+
+ >>> s = pd.Series([1],
+ ... index=pd.DatetimeIndex(['2018-09-15 01:30:00']))
+ >>> s.tz_localize('CET')
+ 2018-09-15 01:30:00+02:00 1
+ dtype: int64
+
+ Be careful with DST changes. When there is sequential data, pandas
+ can infer the DST time:
+
+ >>> s = pd.Series(range(7), index=pd.DatetimeIndex([
+ ... '2018-10-28 01:30:00',
+ ... '2018-10-28 02:00:00',
+ ... '2018-10-28 02:30:00',
+ ... '2018-10-28 02:00:00',
+ ... '2018-10-28 02:30:00',
+ ... '2018-10-28 03:00:00',
+ ... '2018-10-28 03:30:00']))
+ >>> s.tz_localize('CET', ambiguous='infer')
+ 2018-10-28 01:30:00+02:00 0
+ 2018-10-28 02:00:00+02:00 1
+ 2018-10-28 02:30:00+02:00 2
+ 2018-10-28 02:00:00+01:00 3
+ 2018-10-28 02:30:00+01:00 4
+ 2018-10-28 03:00:00+01:00 5
+ 2018-10-28 03:30:00+01:00 6
+ dtype: int64
+
+ In some cases, inferring the DST is impossible. In such cases, you can
+ pass an ndarray to the ambiguous parameter to set the DST explicitly:
+
+ >>> s = pd.Series(range(3), index=pd.DatetimeIndex([
+ ... '2018-10-28 01:20:00',
+ ... '2018-10-28 02:36:00',
+ ... '2018-10-28 03:46:00']))
+ >>> s.tz_localize('CET', ambiguous=np.array([True, True, False]))
+ 2018-10-28 01:20:00+02:00 0
+ 2018-10-28 02:36:00+02:00 1
+ 2018-10-28 03:46:00+01:00 2
+ dtype: int64
+
  """
  if nonexistent not in ('raise', 'NaT', 'shift'):
  raise ValueError("The nonexistent argument must be one of 'raise',"
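
One way to check what `ambiguous='infer'` decided is to convert the localized index to UTC, where every instant is unique. A minimal sketch, assuming pandas is installed; the four-entry series is a shortened, hypothetical variant of the docstring example, and the variable names are illustrative.

```python
import pandas as pd

s = pd.Series(range(4), index=pd.DatetimeIndex([
    "2018-10-28 01:30:00",
    "2018-10-28 02:30:00",   # first 02:30, inferred as DST (UTC+02)
    "2018-10-28 02:30:00",   # repeated 02:30, inferred as non-DST (UTC+01)
    "2018-10-28 03:30:00",
]))

# Series.tz_localize localizes the index; the values are left unchanged.
localized = s.tz_localize("CET", ambiguous="infer")

# In UTC the repeated wall-clock time becomes two distinct, ordered instants.
utc_index = localized.index.tz_convert("UTC")
utc_hours = [t.hour for t in utc_index]
print(utc_index)
```

The naive index contains a duplicate label, while the UTC index is strictly increasing, which shows that the order of the rows was used to resolve the repeated hour.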

pandas/core/indexes/datetimes.py

+7
@@ -99,6 +99,12 @@ class DatetimeIndex(DatetimeArrayMixin, DatelikeOps, TimelikeOps,
  the 'left', 'right', or both sides (None)
  tz : pytz.timezone or dateutil.tz.tzfile
  ambiguous : 'infer', bool-ndarray, 'NaT', default 'raise'
+ When clocks moved backward due to DST, ambiguous times may arise.
+ For example in Central European Time (UTC+01), when going from 03:00
+ DST to 02:00 non-DST, 02:30:00 local time occurs both at 00:30:00 UTC
+ and at 01:30:00 UTC. In such a situation, the `ambiguous` parameter
+ dictates how ambiguous times should be handled.
+
  - 'infer' will attempt to infer fall dst-transition hours based on
  order
  - bool-ndarray where True signifies a DST time, False signifies a
@@ -173,6 +179,7 @@ class DatetimeIndex(DatetimeArrayMixin, DatelikeOps, TimelikeOps,
  TimedeltaIndex : Index of timedelta64 data
  PeriodIndex : Index of Period data
  pandas.to_datetime : Convert argument to datetime
+
  """
  _resolution = cache_readonly(DatetimeArrayMixin._resolution.fget)
  _shallow_copy = Index._shallow_copy
