|
21 | 21 | Time Series / Date functionality
|
22 | 22 | ********************************
|
23 | 23 |
|
24 |
| -pandas has proven very successful as a tool for working with time series data, |
25 |
| -especially in the financial data analysis space. Using the NumPy ``datetime64`` and ``timedelta64`` dtypes, |
26 |
| -we have consolidated a large number of features from other Python libraries like ``scikits.timeseries`` as well as created |
| 24 | +pandas contains extensive capabilities and features for working with time series data for all domains. |
| 25 | +Using the NumPy ``datetime64`` and ``timedelta64`` dtypes, pandas has consolidated a large number of |
| 26 | +features from other Python libraries like ``scikits.timeseries`` as well as created |
27 | 27 | a tremendous amount of new functionality for manipulating time series data.
|
28 | 28 |
|
29 |
| -In working with time series data, we will frequently seek to: |
| 29 | +For example, pandas supports: |
30 | 30 |
|
31 |
| -* generate sequences of fixed-frequency dates and time spans |
32 |
| -* conform or convert time series to a particular frequency |
33 |
| -* compute "relative" dates based on various non-standard time increments |
34 |
| - (e.g. 5 business days before the last business day of the year), or "roll" |
35 |
| - dates forward or backward |
| 31 | +Parsing time series information from various sources and formats |
36 | 32 |
|
37 |
| -pandas provides a relatively compact and self-contained set of tools for |
38 |
| -performing the above tasks. |
| 33 | +.. ipython:: python |
| 34 | +
|
| 35 | + dti = pd.to_datetime(['1/1/2018', np.datetime64('2018-01-01'), datetime(2018, 1, 1)]) |
| 36 | + dti |
39 | 37 |
|
40 |
| -Create a range of dates: |
| 38 | +Generating sequences of fixed-frequency dates and time spans |
41 | 39 |
|
42 | 40 | .. ipython:: python
|
43 | 41 |
|
44 |
| - # 72 hours starting with midnight Jan 1st, 2011 |
45 |
| - rng = pd.date_range('1/1/2011', periods=72, freq='H') |
46 |
| - rng[:5] |
| 42 | + dti = pd.date_range('2018-01-01', periods=3, freq='H') |
| 43 | + dti |
47 | 44 |
|
48 |
| -Index pandas objects with dates: |
| 45 | +Manipulating and converting date times with timezone information |
49 | 46 |
|
50 | 47 | .. ipython:: python
|
51 | 48 |
|
52 |
| - ts = pd.Series(np.random.randn(len(rng)), index=rng) |
53 |
| - ts.head() |
| 49 | + dti = dti.tz_localize('UTC') |
| 50 | + dti |
| 51 | + dti.tz_convert('US/Pacific') |
54 | 52 |
|
55 |
| -Change frequency and fill gaps: |
| 53 | +Resampling or converting a time series to a particular frequency |
56 | 54 |
|
57 | 55 | .. ipython:: python
|
58 | 56 |
|
59 |
| - # to 45 minute frequency and forward fill |
60 |
| - converted = ts.asfreq('45Min', method='pad') |
61 |
| - converted.head() |
| 57 | + idx = pd.date_range('2018-01-01', periods=5, freq='H') |
| 58 | + ts = pd.Series(range(len(idx)), index=idx) |
| 59 | + ts |
| 60 | + ts.resample('2H').mean() |
62 | 61 |
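As a further illustration of resampling, the following minimal sketch downsamples a small daily series into two-day bins. The series, its values, and the ``'2D'`` frequency are assumptions chosen for illustration and are not part of the change under review.

```python
import pandas as pd

# Hypothetical daily series with values 0..4 (illustrative data only)
idx = pd.date_range("2018-01-01", periods=5, freq="D")
ts = pd.Series(range(len(idx)), index=idx)

# Downsample into two-day bins and average the values in each bin;
# each bin is labeled by its left edge
two_day_mean = ts.resample("2D").mean()
print(two_day_mean)
```

The last bin holds a single observation, so its mean is that observation's value.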
|
63 |
| -Resample the series to a daily frequency: |
| 62 | +Performing date and time arithmetic with absolute or relative time increments |
64 | 63 |
|
65 | 64 | .. ipython:: python
|
66 | 65 |
|
67 |
| - # Daily means |
68 |
| - ts.resample('D').mean() |
| 66 | + friday = pd.Timestamp('2018-01-05') |
| 67 | + friday.day_name() |
| 68 | + # Add 1 day |
| 69 | + saturday = friday + pd.Timedelta('1 day') |
| 70 | + saturday.day_name() |
| 71 | + # Add 1 business day (Friday --> Monday) |
| 72 | + monday = friday + pd.tseries.offsets.BDay() |
| 73 | + monday.day_name() |
| 74 | +
|
| 75 | +pandas provides a relatively compact and self-contained set of tools for |
| 76 | +performing the above tasks and more. |
69 | 77 |
|
70 | 78 |
|
71 | 79 | .. _timeseries.overview:
|
72 | 80 |
|
73 | 81 | Overview
|
74 | 82 | --------
|
75 | 83 |
|
76 |
| -The following table shows the type of time-related classes pandas can handle and |
77 |
| -how to create them. |
| 84 | +pandas captures 4 general time-related concepts: |
| 85 | + |
| 86 | +#. Date times: A specific date and time with timezone support. Similar to ``datetime.datetime`` from the standard library. |
| 87 | +#. Time deltas: An absolute time duration. Similar to ``datetime.timedelta`` from the standard library. |
| 88 | +#. Time spans: A span of time defined by a point in time and its associated frequency. |
| 89 | +#. Date offsets: A relative time duration that respects calendar arithmetic. Similar to ``dateutil.relativedelta.relativedelta`` from the ``dateutil`` package. |
78 | 90 |
|
79 |
| -================= =============================== =================================================================== |
80 |
| -Class Remarks How to create |
81 |
| -================= =============================== =================================================================== |
82 |
| -``Timestamp`` Represents a single timestamp ``to_datetime``, ``Timestamp`` |
83 |
| -``DatetimeIndex`` Index of ``Timestamp`` ``to_datetime``, ``date_range``, ``bdate_range``, ``DatetimeIndex`` |
84 |
| -``Period`` Represents a single time span ``Period`` |
85 |
| -``PeriodIndex`` Index of ``Period`` ``period_range``, ``PeriodIndex`` |
86 |
| -================= =============================== =================================================================== |
| 91 | +===================== ================= =================== ============================================ ======================================== |
| 92 | +Concept Scalar Class Array Class pandas Data Type Primary Creation Method |
| 93 | +===================== ================= =================== ============================================ ======================================== |
| 94 | +Date times ``Timestamp`` ``DatetimeIndex`` ``datetime64[ns]`` or ``datetime64[ns, tz]`` ``to_datetime`` or ``date_range`` |
| 95 | +Time deltas ``Timedelta`` ``TimedeltaIndex`` ``timedelta64[ns]`` ``to_timedelta`` or ``timedelta_range`` |
| 96 | +Time spans ``Period`` ``PeriodIndex`` ``period[freq]`` ``Period`` or ``period_range`` |
| 97 | +Date offsets ``DateOffset`` ``None`` ``None`` ``DateOffset`` |
| 98 | +===================== ================= =================== ============================================ ======================================== |
| 99 | + |
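The four concepts in the table can be sketched as follows. This is a minimal illustration, assuming arbitrary example dates; the variable names are hypothetical. It also contrasts an absolute duration (``Timedelta``) with a calendar-aware one (``DateOffset``).

```python
import pandas as pd

# Scalar classes, one per concept
stamp = pd.Timestamp("2018-01-31")       # date time
delta = pd.Timedelta(days=1)             # time delta
span = pd.Period("2018-01", freq="M")    # time span
offset = pd.DateOffset(months=1)         # date offset

# Array (index) classes built with the primary creation methods
dti = pd.date_range("2018-01-01", periods=3, freq="D")
tdi = pd.timedelta_range("1 day", periods=3, freq="D")
pi = pd.period_range("2018-01", periods=3, freq="M")

# An absolute duration adds a fixed amount of time
next_day = stamp + delta

# A date offset follows calendar rules: one month after January 31
# is clipped to the last day of February
next_month = stamp + offset

print(next_day, next_month)
```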
| 100 | +For time series data, it's conventional to represent the time component in the index of a :class:`Series` or :class:`DataFrame` |
| 101 | +so manipulations can be performed with respect to the time element. |
| 102 | + |
| 103 | +.. ipython:: python |
| 104 | +
|
| 105 | + pd.Series(range(3), index=pd.date_range('2000', freq='D', periods=3)) |
| 106 | +
|
| 107 | +However, :class:`Series` and :class:`DataFrame` can also directly support the time component as data itself. |
| 108 | + |
| 109 | +.. ipython:: python |
| 110 | +
|
| 111 | + pd.Series(pd.date_range('2000', freq='D', periods=3)) |
| 112 | +
|
| 113 | +:class:`Series` and :class:`DataFrame` have extended data type support and functionality for ``datetime`` and ``timedelta`` |
| 114 | +data when the time data is used as data itself. The ``Period`` and ``DateOffset`` data will be stored as ``object`` data. |
| 115 | + |
| 116 | +.. ipython:: python |
| 117 | +
|
| 118 | + pd.Series(pd.period_range('1/1/2011', freq='M', periods=3)) |
| 119 | + pd.Series(pd.date_range('1/1/2011', freq='M', periods=3)) |
| 120 | +
|
| 121 | +Lastly, pandas represents null date times, time deltas, and time spans as ``NaT``, which |
| 122 | +is useful for representing missing or null date-like values and behaves similarly |
| 123 | +to how ``np.nan`` behaves for float data. |
| 124 | + |
| 125 | +.. ipython:: python |
| 126 | +
|
| 127 | + pd.Timestamp(pd.NaT) |
| 128 | + pd.Timedelta(pd.NaT) |
| 129 | + pd.Period(pd.NaT) |
| 130 | + # Equality acts as np.nan would |
| 131 | + pd.NaT == pd.NaT |
87 | 132 |
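Because ``NaT`` never compares equal to itself, missing date-like values are usually detected with ``isna`` rather than ``==``. A minimal sketch, assuming a small hypothetical series with one missing entry:

```python
import pandas as pd

# Hypothetical series with one missing timestamp (None becomes NaT)
s = pd.Series(pd.to_datetime(["2018-01-01", None, "2018-01-03"]))

# Equality with NaT is always False, so use isna() to find missing values
nat_equals_nat = pd.NaT == pd.NaT
mask = s.isna()

# Arithmetic propagates NaT, much as np.nan propagates in float arithmetic
shifted = s + pd.Timedelta(days=1)

print(nat_equals_nat)
print(mask)
print(shifted)
```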
|
88 | 133 | .. _timeseries.representation:
|
89 | 134 |
|
@@ -1443,7 +1488,7 @@ time. The method for this is :meth:`~Series.shift`, which is available on all of
|
1443 | 1488 | the pandas objects.
|
1444 | 1489 |
|
1445 | 1490 | .. ipython:: python
|
1446 |
| -
|
| 1491 | + ts = pd.Series(range(len(rng)), index=rng) |
1447 | 1492 | ts = ts[:5]
|
1448 | 1493 | ts.shift(1)
|
1449 | 1494 |
|
|