Commit c96d990

mroeschke authored and Pingviinituutti committed
DOC/CLN: Timezone section in timeseries.rst (pandas-dev#24825)
* DOC: Improve timezone documentation in timeseries.rst
* edit some of the examples
* Address review
1 parent cb94616 commit c96d990

1 file changed

+83
-121
lines changed


doc/source/user_guide/timeseries.rst

@@ -2129,11 +2129,13 @@ These can easily be converted to a ``PeriodIndex``:
21292129
Time Zone Handling
21302130
------------------
21312131

2132-
Pandas provides rich support for working with timestamps in different time
2133-
zones using ``pytz`` and ``dateutil`` libraries. ``dateutil`` currently is only
2134-
supported for fixed offset and tzfile zones. The default library is ``pytz``.
2135-
Support for ``dateutil`` is provided for compatibility with other
2136-
applications e.g. if you use ``dateutil`` in other Python packages.
2132+
pandas provides rich support for working with timestamps in different time
2133+
zones using the ``pytz`` and ``dateutil`` libraries.
2134+
2135+
.. note::
2136+
2137+
pandas does not yet support ``datetime.timezone`` objects from the standard
2138+
library.
21372139

21382140
Working with Time Zones
21392141
~~~~~~~~~~~~~~~~~~~~~~~
@@ -2145,94 +2147,87 @@ By default, pandas objects are time zone unaware:
21452147
rng = pd.date_range('3/6/2012 00:00', periods=15, freq='D')
21462148
rng.tz is None
21472149
2148-
To supply the time zone, you can use the ``tz`` keyword to ``date_range`` and
2149-
other functions. Dateutil time zone strings are distinguished from ``pytz``
2150-
time zones by starting with ``dateutil/``.
2150+
To localize these dates to a time zone (assign a particular time zone to a naive date),
2151+
you can use the ``tz_localize`` method or the ``tz`` keyword argument in
2152+
:func:`date_range`, :class:`Timestamp`, or :class:`DatetimeIndex`.
2153+
You can pass either ``pytz`` or ``dateutil`` time zone objects or Olson time zone database strings.
2154+
Olson time zone strings will return ``pytz`` time zone objects by default.
2155+
To return ``dateutil`` time zone objects, prepend ``dateutil/`` to the string.
21512156

21522157
* In ``pytz`` you can find a list of common (and less common) time zones using
21532158
``from pytz import common_timezones, all_timezones``.
2154-
* ``dateutil`` uses the OS timezones so there isn't a fixed list available. For
2159+
* ``dateutil`` uses the OS time zones so there isn't a fixed list available. For
21552160
common zones, the names are the same as ``pytz``.
21562161

21572162
.. ipython:: python
21582163
21592164
import dateutil
21602165
21612166
# pytz
2162-
rng_pytz = pd.date_range('3/6/2012 00:00', periods=10, freq='D',
2167+
rng_pytz = pd.date_range('3/6/2012 00:00', periods=3, freq='D',
21632168
tz='Europe/London')
21642169
rng_pytz.tz
21652170
21662171
# dateutil
2167-
rng_dateutil = pd.date_range('3/6/2012 00:00', periods=10, freq='D',
2168-
tz='dateutil/Europe/London')
2172+
rng_dateutil = pd.date_range('3/6/2012 00:00', periods=3, freq='D')
2173+
rng_dateutil = rng_dateutil.tz_localize('dateutil/Europe/London')
21692174
rng_dateutil.tz
21702175
21712176
# dateutil - utc special case
2172-
rng_utc = pd.date_range('3/6/2012 00:00', periods=10, freq='D',
2177+
rng_utc = pd.date_range('3/6/2012 00:00', periods=3, freq='D',
21732178
tz=dateutil.tz.tzutc())
21742179
rng_utc.tz
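As a rough sketch (not part of the original examples), the two ways of attaching a time zone described above are expected to describe the same instants, and the ``dateutil/`` prefix only changes which library backs the time zone object. The variable names below are illustrative.

```python
import pandas as pd

naive = pd.DatetimeIndex(['2012-03-06', '2012-03-07', '2012-03-08'])

# Option 1: localize an existing naive index.
localized = naive.tz_localize('Europe/London')

# Option 2: pass ``tz`` while constructing the index.
constructed = pd.date_range('2012-03-06', periods=3, freq='D',
                            tz='Europe/London')

# Both approaches describe the same instants.
same_instants = bool((localized == constructed).all())

# The ``dateutil/`` prefix requests a dateutil-backed time zone object;
# for this zone and these dates the instants are still the same.
via_dateutil = naive.tz_localize('dateutil/Europe/London')
same_across_libraries = bool((via_dateutil == localized).all())
```

Comparing element-wise rather than comparing the ``tz`` attributes avoids depending on which library object pandas happens to return for a given string.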
21752180
2176-
Note that the ``UTC`` timezone is a special case in ``dateutil`` and should be constructed explicitly
2177-
as an instance of ``dateutil.tz.tzutc``. You can also construct other timezones explicitly first,
2178-
which gives you more control over which time zone is used:
2181+
Note that the ``UTC`` time zone is a special case in ``dateutil`` and should be constructed explicitly
2182+
as an instance of ``dateutil.tz.tzutc``. You can also construct other time
2183+
zone objects explicitly first.
21792184

21802185
.. ipython:: python
21812186
21822187
import pytz
21832188
21842189
# pytz
21852190
tz_pytz = pytz.timezone('Europe/London')
2186-
rng_pytz = pd.date_range('3/6/2012 00:00', periods=10, freq='D',
2187-
tz=tz_pytz)
2191+
rng_pytz = pd.date_range('3/6/2012 00:00', periods=3, freq='D')
2192+
rng_pytz = rng_pytz.tz_localize(tz_pytz)
21882193
rng_pytz.tz == tz_pytz
21892194
21902195
# dateutil
21912196
tz_dateutil = dateutil.tz.gettz('Europe/London')
2192-
rng_dateutil = pd.date_range('3/6/2012 00:00', periods=10, freq='D',
2197+
rng_dateutil = pd.date_range('3/6/2012 00:00', periods=3, freq='D',
21932198
tz=tz_dateutil)
21942199
rng_dateutil.tz == tz_dateutil
21952200
2196-
Timestamps, like Python's ``datetime.datetime`` object can be either time zone
2197-
naive or time zone aware. Naive time series and ``DatetimeIndex`` objects can be
2198-
*localized* using ``tz_localize``:
2199-
2200-
.. ipython:: python
2201-
2202-
ts = pd.Series(np.random.randn(len(rng)), rng)
2203-
2204-
ts_utc = ts.tz_localize('UTC')
2205-
ts_utc
2206-
2207-
Again, you can explicitly construct the timezone object first.
2208-
You can use the ``tz_convert`` method to convert pandas objects to convert
2209-
tz-aware data to another time zone:
2201+
To convert a time zone aware pandas object from one time zone to another,
2202+
you can use the ``tz_convert`` method.
22102203

22112204
.. ipython:: python
22122205
2213-
ts_utc.tz_convert('US/Eastern')
2206+
rng_pytz.tz_convert('US/Eastern')
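A minimal sketch of what ``tz_convert`` changes and what it leaves alone, assuming dates in early March 2012 when London is on UTC+0 and US/Eastern is on UTC-5. The variable names are illustrative.

```python
import pandas as pd

london = pd.DatetimeIndex(['2012-03-06 00:00',
                           '2012-03-07 00:00']).tz_localize('Europe/London')
eastern = london.tz_convert('US/Eastern')

# The wall-clock fields change: midnight in London is 19:00 the
# previous evening in US/Eastern on these dates.
eastern_hours = eastern.hour.tolist()

# The underlying instants do not change.
same_instants = bool((london == eastern).all())
```

``tz_convert`` only applies to time zone aware data; naive data has to be localized with ``tz_localize`` first.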
22142207
22152208
.. warning::
22162209

2217-
Be wary of conversions between libraries. For some zones ``pytz`` and ``dateutil`` have different
2218-
definitions of the zone. This is more of a problem for unusual timezones than for
2210+
Be wary of conversions between libraries. For some time zones, ``pytz`` and ``dateutil`` have different
2211+
definitions of the zone. This is more of a problem for unusual time zones than for
22192212
'standard' zones like ``US/Eastern``.
22202213

22212214
.. warning::
22222215

2223-
Be aware that a timezone definition across versions of timezone libraries may not
2224-
be considered equal. This may cause problems when working with stored data that
2225-
is localized using one version and operated on with a different version.
2226-
See :ref:`here<io.hdf5-notes>` for how to handle such a situation.
2216+
Be aware that a time zone definition across versions of time zone libraries may not
2217+
be considered equal. This may cause problems when working with stored data that
2218+
is localized using one version and operated on with a different version.
2219+
See :ref:`here<io.hdf5-notes>` for how to handle such a situation.
22272220

22282221
.. warning::
22292222

2230-
It is incorrect to pass a timezone directly into the ``datetime.datetime`` constructor (e.g.,
2231-
``datetime.datetime(2011, 1, 1, tz=timezone('US/Eastern'))``. Instead, the datetime
2232-
needs to be localized using the localize method on the timezone.
2223+
For ``pytz`` time zones, it is incorrect to pass a time zone object directly into
2224+
the ``datetime.datetime`` constructor
2225+
(e.g., ``datetime.datetime(2011, 1, 1, tzinfo=pytz.timezone('US/Eastern'))``).
2226+
Instead, the datetime needs to be localized using the ``localize`` method
2227+
on the ``pytz`` time zone object.
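The difference can be sketched as follows, assuming ``pytz`` is installed. Passing a ``pytz`` zone through ``tzinfo`` typically attaches the zone's earliest recorded offset (local mean time) rather than the offset in effect on that date, whereas ``localize`` picks the correct one.

```python
from datetime import datetime, timedelta

import pytz

eastern = pytz.timezone('US/Eastern')

# Not recommended: the pytz zone is attached without resolving which
# UTC offset applies on this date.
via_constructor = datetime(2011, 1, 1, tzinfo=eastern)

# Recommended for pytz: let the zone resolve the offset for the date.
via_localize = eastern.localize(datetime(2011, 1, 1))

# In January, US/Eastern is on standard time, five hours behind UTC.
expected_offset = timedelta(hours=-5)
offsets_differ = via_constructor.utcoffset() != via_localize.utcoffset()
```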
22332228

2234-
Under the hood, all timestamps are stored in UTC. Scalar values from a
2235-
``DatetimeIndex`` with a time zone will have their fields (day, hour, minute)
2229+
Under the hood, all timestamps are stored in UTC. Values from a time zone aware
2230+
:class:`DatetimeIndex` or :class:`Timestamp` will have their fields (day, hour, minute, etc.)
22362231
localized to the time zone. However, timestamps with the same UTC value are
22372232
still considered to be equal even if they are in different time zones:
22382233

@@ -2241,114 +2236,78 @@ still considered to be equal even if they are in different time zones:
22412236
rng_eastern = rng_utc.tz_convert('US/Eastern')
22422237
rng_berlin = rng_utc.tz_convert('Europe/Berlin')
22432238
2244-
rng_eastern[5]
2245-
rng_berlin[5]
2246-
rng_eastern[5] == rng_berlin[5]
2247-
2248-
Like ``Series``, ``DataFrame``, and ``DatetimeIndex``; ``Timestamp`` objects
2249-
can be converted to other time zones using ``tz_convert``:
2250-
2251-
.. ipython:: python
2252-
2253-
rng_eastern[5]
2254-
rng_berlin[5]
2255-
rng_eastern[5].tz_convert('Europe/Berlin')
2256-
2257-
Localization of ``Timestamp`` functions just like ``DatetimeIndex`` and ``Series``:
2258-
2259-
.. ipython:: python
2260-
2261-
rng[5]
2262-
rng[5].tz_localize('Asia/Shanghai')
2263-
2239+
rng_eastern[2]
2240+
rng_berlin[2]
2241+
rng_eastern[2] == rng_berlin[2]
22642242
2265-
Operations between ``Series`` in different time zones will yield UTC
2266-
``Series``, aligning the data on the UTC timestamps:
2243+
Operations between :class:`Series` in different time zones will yield UTC
2244+
:class:`Series`, aligning the data on the UTC timestamps:
22672245

22682246
.. ipython:: python
22692247
2248+
ts_utc = pd.Series(range(3), pd.date_range('20130101', periods=3, tz='UTC'))
22702249
eastern = ts_utc.tz_convert('US/Eastern')
22712250
berlin = ts_utc.tz_convert('Europe/Berlin')
22722251
result = eastern + berlin
22732252
result
22742253
result.index
22752254
2276-
To remove timezone from tz-aware ``DatetimeIndex``, use ``tz_localize(None)`` or ``tz_convert(None)``.
2277-
``tz_localize(None)`` will remove timezone holding local time representations.
2278-
``tz_convert(None)`` will remove timezone after converting to UTC time.
2255+
To remove time zone information, use ``tz_localize(None)`` or ``tz_convert(None)``.
2256+
``tz_localize(None)`` will remove the time zone, yielding the local time representation.
2257+
``tz_convert(None)`` will remove the time zone after converting to UTC time.
22792258

22802259
.. ipython:: python
22812260
22822261
didx = pd.date_range(start='2014-08-01 09:00', freq='H',
2283-
periods=10, tz='US/Eastern')
2262+
periods=3, tz='US/Eastern')
22842263
didx
22852264
didx.tz_localize(None)
22862265
didx.tz_convert(None)
22872266
2288-
# tz_convert(None) is identical with tz_convert('UTC').tz_localize(None)
2267+
# tz_convert(None) is identical to tz_convert('UTC').tz_localize(None)
22892268
didx.tz_convert('UTC').tz_localize(None)
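The two removal methods can be contrasted with a small sketch. It assumes August dates, when US/Eastern is four hours behind UTC, and uses an explicit list of timestamps instead of a frequency alias.

```python
import pandas as pd

aware = pd.DatetimeIndex(['2014-08-01 09:00',
                          '2014-08-01 10:00']).tz_localize('US/Eastern')

# Drop the time zone but keep the local wall-clock times.
local_naive = aware.tz_localize(None)

# Convert to UTC first, then drop the time zone.
utc_naive = aware.tz_convert(None)

local_hours = local_naive.hour.tolist()
utc_hours = utc_naive.hour.tolist()
```

Both results are time zone naive, so the choice between them determines which clock the remaining values should be read against.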
22902269
22912270
.. _timeseries.timezone_ambiguous:
22922271

22932272
Ambiguous Times when Localizing
22942273
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
22952274

2296-
In some cases, localize cannot determine the DST and non-DST hours when there are
2297-
duplicates. This often happens when reading files or database records that simply
2298-
duplicate the hours. Passing ``ambiguous='infer'`` into ``tz_localize`` will
2299-
attempt to determine the right offset. Below the top example will fail as it
2300-
contains ambiguous times and the bottom will infer the right offset.
2275+
``tz_localize`` may not be able to determine the UTC offset of a timestamp
2276+
because daylight saving time (DST) in a local time zone causes some times to occur
2277+
twice within one day ("clocks fall back"). The ``ambiguous`` argument accepts the following options:
2278+
2279+
* ``'raise'``: Raises a ``pytz.AmbiguousTimeError`` (the default behavior)
2280+
* ``'infer'``: Attempt to determine the correct offset based on the monotonicity of the timestamps
2281+
* ``'NaT'``: Replaces ambiguous times with ``NaT``
2282+
* ``bool``: ``True`` represents a DST time, ``False`` represents non-DST time. An array-like of ``bool`` values is supported for a sequence of times.
23012283

23022284
.. ipython:: python
23032285
23042286
rng_hourly = pd.DatetimeIndex(['11/06/2011 00:00', '11/06/2011 01:00',
2305-
'11/06/2011 01:00', '11/06/2011 02:00',
2306-
'11/06/2011 03:00'])
2287+
'11/06/2011 01:00', '11/06/2011 02:00'])
23072288
2308-
This will fail as there are ambiguous times
2289+
This will fail as there are ambiguous times (``'11/06/2011 01:00'``)
23092290

23102291
.. code-block:: ipython
23112292
23122293
In [2]: rng_hourly.tz_localize('US/Eastern')
23132294
AmbiguousTimeError: Cannot infer dst time from Timestamp('2011-11-06 01:00:00'), try using the 'ambiguous' argument
23142295
2315-
Infer the ambiguous times
2316-
2317-
.. ipython:: python
2318-
2319-
rng_hourly_eastern = rng_hourly.tz_localize('US/Eastern', ambiguous='infer')
2320-
rng_hourly_eastern.to_list()
2321-
2322-
In addition to 'infer', there are several other arguments supported. Passing
2323-
an array-like of bools or 0s/1s where True represents a DST hour and False a
2324-
non-DST hour, allows for distinguishing more than one DST
2325-
transition (e.g., if you have multiple records in a database each with their
2326-
own DST transition). Or passing 'NaT' will fill in transition times
2327-
with not-a-time values. These methods are available in the ``DatetimeIndex``
2328-
constructor as well as ``tz_localize``.
2296+
Handle these ambiguous times by passing one of the other options to the ``ambiguous`` argument:
23292297

23302298
.. ipython:: python
23312299
2332-
rng_hourly_dst = np.array([1, 1, 0, 0, 0])
2333-
rng_hourly.tz_localize('US/Eastern', ambiguous=rng_hourly_dst).to_list()
2334-
rng_hourly.tz_localize('US/Eastern', ambiguous='NaT').to_list()
2335-
2336-
didx = pd.date_range(start='2014-08-01 09:00', freq='H',
2337-
periods=10, tz='US/Eastern')
2338-
didx
2339-
didx.tz_localize(None)
2340-
didx.tz_convert(None)
2341-
2342-
# tz_convert(None) is identical with tz_convert('UTC').tz_localize(None)
2343-
didx.tz_convert('UCT').tz_localize(None)
2300+
rng_hourly.tz_localize('US/Eastern', ambiguous='infer')
2301+
rng_hourly.tz_localize('US/Eastern', ambiguous='NaT')
2302+
rng_hourly.tz_localize('US/Eastern', ambiguous=[True, True, False, False])
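To make the effect of each option easier to see, here is a sketch that inspects the results in UTC. It reuses the same four timestamps around the November 2011 transition, where 01:00 occurs once in DST (UTC-4) and once in standard time (UTC-5). The variable names are illustrative.

```python
import pandas as pd

rng_hourly = pd.DatetimeIndex(['2011-11-06 00:00', '2011-11-06 01:00',
                               '2011-11-06 01:00', '2011-11-06 02:00'])

# 'infer' uses the ordering: the first 01:00 is DST, the second is not.
inferred = rng_hourly.tz_localize('US/Eastern', ambiguous='infer')
inferred_utc_hours = inferred.tz_convert('UTC').hour.tolist()

# An explicit boolean per element (True means DST) gives the same result.
flagged = rng_hourly.tz_localize('US/Eastern',
                                 ambiguous=[True, True, False, False])
flags_match_inferred = bool((flagged == inferred).all())

# 'NaT' blanks out only the ambiguous entries.
with_nat = rng_hourly.tz_localize('US/Eastern', ambiguous='NaT')
nat_mask = with_nat.isna().tolist()
```

The boolean flags for the unambiguous entries (00:00 and 02:00) do not affect the result, since only one offset is possible for them.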
23442303
23452304
.. _timeseries.timezone_nonexistent:
23462305

23472306
Nonexistent Times when Localizing
23482307
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
23492308

23502309
A DST transition may also shift the local time ahead by 1 hour creating nonexistent
2351-
local times. The behavior of localizing a timeseries with nonexistent times
2310+
local times ("clocks spring forward"). The behavior of localizing a time series with nonexistent times
23522311
can be controlled by the ``nonexistent`` argument. The following options are available:
23532312

23542313
* ``'raise'``: Raises a ``pytz.NonExistentTimeError`` (the default behavior)
@@ -2382,58 +2341,61 @@ Transform nonexistent times to ``NaT`` or shift the times.
23822341
23832342
.. _timeseries.timezone_series:
23842343

2385-
TZ Aware Dtypes
2386-
~~~~~~~~~~~~~~~
2344+
Time Zone Series Operations
2345+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
23872346

2388-
``Series/DatetimeIndex`` with a timezone **naive** value are represented with a dtype of ``datetime64[ns]``.
2347+
A :class:`Series` with time zone **naive** values is
2348+
represented with a dtype of ``datetime64[ns]``.
23892349

23902350
.. ipython:: python
23912351
23922352
s_naive = pd.Series(pd.date_range('20130101', periods=3))
23932353
s_naive
23942354
2395-
``Series/DatetimeIndex`` with a timezone **aware** value are represented with a dtype of ``datetime64[ns, tz]``.
2355+
A :class:`Series` with time zone **aware** values is
2356+
represented with a dtype of ``datetime64[ns, tz]``, where ``tz`` is the time zone.
23962357

23972358
.. ipython:: python
23982359
23992360
s_aware = pd.Series(pd.date_range('20130101', periods=3, tz='US/Eastern'))
24002361
s_aware
24012362
2402-
Both of these ``Series`` can be manipulated via the ``.dt`` accessor, see :ref:`here <basics.dt_accessors>`.
2363+
The time zone information of both of these :class:`Series`
2364+
can be manipulated via the ``.dt`` accessor; see :ref:`the dt accessor section <basics.dt_accessors>`.
24032365

2404-
For example, to localize and convert a naive stamp to timezone aware.
2366+
For example, to localize naive timestamps and convert them to another time zone:
24052367

24062368
.. ipython:: python
24072369
24082370
s_naive.dt.tz_localize('UTC').dt.tz_convert('US/Eastern')
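A short sketch of what the chained ``.dt`` calls produce, assuming January dates when US/Eastern is five hours behind UTC. The Series is built from an explicit list here, and the checks on the result are illustrative.

```python
import pandas as pd

s_naive = pd.Series(pd.to_datetime(['2013-01-01', '2013-01-02',
                                    '2013-01-03']))

# Step 1: declare that the naive values are UTC wall-clock times.
s_utc = s_naive.dt.tz_localize('UTC')

# Step 2: express the same instants in US/Eastern.
s_eastern = s_utc.dt.tz_convert('US/Eastern')

# Midnight UTC is 19:00 on the previous day in US/Eastern.
eastern_hours = s_eastern.dt.hour.tolist()
eastern_days = s_eastern.dt.day.tolist()

# The result carries a time zone aware dtype.
is_tz_aware = isinstance(s_eastern.dtype, pd.DatetimeTZDtype)
```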
24092371
2410-
2411-
Further more you can ``.astype(...)`` timezone aware (and naive). This operation is effectively a localize AND convert on a naive stamp, and
2412-
a convert on an aware stamp.
2372+
Time zone information can also be manipulated using the ``astype`` method.
2373+
This method can localize and convert time zone naive timestamps or
2374+
convert time zone aware timestamps.
24132375

24142376
.. ipython:: python
24152377
2416-
# localize and convert a naive timezone
2378+
# localize and convert time zone naive data
24172379
s_naive.astype('datetime64[ns, US/Eastern]')
24182380
24192381
# make an aware tz naive
24202382
s_aware.astype('datetime64[ns]')
24212383
2422-
# convert to a new timezone
2384+
# convert to a new time zone
24232385
s_aware.astype('datetime64[ns, CET]')
24242386
24252387
.. note::
24262388

24272389
Using :meth:`Series.to_numpy` on a ``Series`` returns a NumPy array of the data.
2428-
NumPy does not currently support timezones (even though it is *printing* in the local timezone!),
2429-
therefore an object array of Timestamps is returned for timezone aware data:
2390+
NumPy does not currently support time zones (even though it is *printing* in the local time zone!),
2391+
therefore an object array of Timestamps is returned for time zone aware data:
24302392

24312393
.. ipython:: python
24322394
24332395
s_naive.to_numpy()
24342396
s_aware.to_numpy()
24352397
2436-
By converting to an object array of Timestamps, it preserves the timezone
2398+
By converting to an object array of Timestamps, it preserves the time zone
24372399
information. For example, when converting back to a Series:
24382400

24392401
.. ipython:: python
