  .. ipython:: python
     :suppress:

+    from datetime import datetime
     import numpy as np
     np.random.seed(123456)
     from pandas import *
     randn = np.random.randn
     np.set_printoptions(precision=4, suppress=True)
     from dateutil import relativedelta
-    from pandas.core.datetools import *
+    from pandas.tseries.api import *

  ********************************
  Time Series / Date functionality
  ********************************

  pandas has proven very successful as a tool for working with time series data,
- especially in the financial data analysis space. Over the coming year we will
- be looking to consolidate the various Python libraries for time series data,
- e.g. ``scikits.timeseries``, using the new NumPy ``datetime64`` dtype, to
- create a very nice integrated solution. Everything in pandas at the moment is
- based on using Python ``datetime`` objects.
+ especially in the financial data analysis space. With the 0.8 release, we have
+ further improved the time series API in pandas by leaps and bounds. Using the
+ new NumPy ``datetime64`` dtype, we have consolidated a large number of features
+ from other Python libraries like ``scikits.timeseries`` as well as created
+ a tremendous amount of new functionality for manipulating time series data.

  In working with time series data, we will frequently seek to:

- - generate sequences of fixed-frequency dates
+ - generate sequences of fixed-frequency dates and time spans
  - conform or convert time series to a particular frequency
  - compute "relative" dates based on various non-standard time increments
    (e.g. 5 business days before the last business day of the year), or "roll"
@@ -34,18 +35,85 @@ In working with time series data, we will frequently seek to:
  pandas provides a relatively compact and self-contained set of tools for
  performing the above tasks.

- .. note::
+ .. _timeseries.representation:
+
+ Time Stamps vs. Time Spans
+ --------------------------
+
+ While most time series representations of data associate values with a time
+ stamp, in many cases it is more natural to associate the values with a given
+ time span. For example, it is easy to think of level variables at a
+ particular point in time, but much more intuitive to think of change variables
+ over spans of time. Starting with 0.8, pandas allows you to capture both
+ representations and convert between them. Under the hood, pandas represents
+ timestamps using instances of ``Timestamp`` and sequences of timestamps using
+ instances of ``DatetimeIndex``. For regular time spans, pandas uses ``Period``
+ objects for scalar values and ``PeriodIndex`` for sequences of spans.
+ Better support for irregular intervals with arbitrary start and end points is
+ forthcoming in future releases.
+
+ For example:
+
+ .. ipython:: python
+
+    # Time stamped data
+    dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
+    ts = Series(np.random.randn(3), dates)
+
+    type(ts.index)
+
+    ts
+
+    # Time span data
+    periods = PeriodIndex([Period('2012-01'), Period('2012-02'),
+                           Period('2012-03')])
+    ts = Series(np.random.randn(3), periods)
+
+    type(ts.index)
+
+    ts
+
+ .. _timeseries.timestamprange:
+
+ Generating Ranges of Timestamps
+ -------------------------------
+
+ To generate an index with time stamps, you can use either the DatetimeIndex or
+ Index constructor and pass in a list of datetime objects:

-    This area of pandas has gotten less development attention recently, though
-    this should change in the near future.
+ .. ipython:: python
+
+    dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
+    index = DatetimeIndex(dates)
+    index # Note the frequency information
+
+    index = Index(dates)
+    index # Automatically converted to DatetimeIndex
+
+ Practically, this becomes very cumbersome because we often need a very long
+ index with a large number of timestamps. If we need timestamps on a regular
+ frequency, we can use the pandas functions ``date_range`` and ``bdate_range``
+ to create timestamp indexes.
+
+ .. ipython:: python
+
+    index = date_range('2000-1-1', periods=1000, freq='M')
+    index
+
+    index = bdate_range('2012-1-1', periods=250)
+    index

  .. _timeseries.offsets:

  DateOffset objects
  ------------------

- A ``DateOffset`` instance represents a frequency increment. Different offset
- logic via subclasses:
+ In order to create the sequence of dates with a monthly frequency in the
+ previous example, we used the ``freq`` keyword and gave it 'M' as the input.
+ Under the hood, the string 'M' is being interpreted into an instance of pandas
+ ``DateOffset``. ``DateOffset`` represents a regular frequency increment.
+ Specific offset logic like "business day" or "one hour" is represented in its
+ various subclasses.

  .. csv-table::
     :header: "Class name", "Description"
@@ -54,16 +122,24 @@ logic via subclasses:
     DateOffset, "Generic offset class, defaults to 1 calendar day"
     BDay, "business day (weekday)"
     Week, "one week, optionally anchored on a day of the week"
+    WeekOfMonth, "the x-th day of the y-th week of each month"
     MonthEnd, "calendar month end"
+    MonthBegin, "calendar month begin"
     BMonthEnd, "business month end"
+    BMonthBegin, "business month begin"
     QuarterEnd, "calendar quarter end"
+    QuarterBegin, "calendar quarter begin"
     BQuarterEnd, "business quarter end"
+    BQuarterBegin, "business quarter begin"
     YearEnd, "calendar year end"
     YearBegin, "calendar year begin"
     BYearEnd, "business year end"
+    BYearBegin, "business year begin"
     Hour, "one hour"
     Minute, "one minute"
     Second, "one second"
+    Milli, "one millisecond"
+    Micro, "one microsecond"

  The basic ``DateOffset`` takes the same arguments as
  ``dateutil.relativedelta``, which works like:
@@ -113,7 +189,7 @@ The ``rollforward`` and ``rollback`` methods do exactly what you would expect:
     offset.rollforward(d)
     offset.rollback(d)

- It's definitely worth exploring the ``pandas.core.datetools`` module and the
+ It's definitely worth exploring the ``pandas.tseries.offsets`` module and the
  various docstrings for the classes.

  Parametric offsets
@@ -130,7 +206,14 @@ particular day of the week:
     d + Week(weekday=4)
     (d + Week(weekday=4)).weekday()

- .. _timeseries.freq:
+ Another example is parameterizing ``YearEnd`` with the specific ending month:
+
+ .. ipython:: python
+
+    d + YearEnd()
+    d + YearEnd(month=6)
+
+ .. _timeseries.alias:

  Offset Aliases
  ~~~~~~~~~~~~~~
@@ -202,9 +285,9 @@ For some frequencies you can specify an anchoring suffix:
     "(B)A(S)\-OCT", "annual frequency, anchored end of October"
     "(B)A(S)\-NOV", "annual frequency, anchored end of November"

- These can be used as arguments to ``date_range``, ``period_range``, constructors
- for ``PeriodIndex`` and ``DatetimeIndex``, as well as various other time
- series-related functions in pandas.
+ These can be used as arguments to ``date_range``, ``bdate_range``, constructors
+ for ``DatetimeIndex``, as well as various other timeseries-related functions
+ in pandas.

  Note that prior to v0.8.0, time rules had a slightly different look. Pandas
  will continue to support the legacy time rules for the time being but it is
@@ -242,56 +325,63 @@ strongly recommended that you switch to using the new offset aliases.
     "ms", "L"
     "us", "U"

- Note that the legacy quarterly and annual frequencies are business quarter and
- business year ends. Also note the legacy time rule for milliseconds ``ms``
- versus the new offset alias for month start ``MS``. This means that offset
- alias parsing is case sensitive.
+ As you can see, legacy quarterly and annual frequencies are business quarter
+ and business year ends. Please also note the legacy time rule for milliseconds
+ ``ms`` versus the new offset alias for month start ``MS``. This means that
+ offset alias parsing is case sensitive.

  .. _timeseries.daterange:

- Generating date ranges (date_range)
- -----------------------------------
+ More on date ranges
+ -------------------

- The ``date_range`` class utilizes these offsets (and any ones that we might add)
- to generate fixed-frequency date ranges:
+ Convenience functions like ``date_range`` and ``bdate_range`` utilize the
+ offsets described above to generate fixed-frequency date ranges. The default
+ frequency for ``date_range`` is a **calendar day** while the default for
+ ``bdate_range`` is a **business day**.

  .. ipython:: python

     start = datetime(2009, 1, 1)
     end = datetime(2010, 1, 1)

-    rng = date_range(start, end, freq=BDay())
+    rng = date_range(start, end)
+    rng
+
+    rng = bdate_range(start, end)
     rng
+
+ ``date_range`` and ``bdate_range`` make it easy to generate a range of dates
+ using various combinations of their parameters like ``start``, ``end``,
+ ``periods``, and ``freq``:
+
     date_range(start, end, freq=BMonthEnd())

- **Business day frequency** is the default for ``date_range``. You can also
- strictly generate a ``date_range`` of a certain length by providing either a
- start or end date and a ``periods`` argument:
+    date_range(start, end, freq=3 * Week())

- .. ipython:: python
+    bdate_range(end=end, periods=20)

-    date_range(start, periods=20)
-    date_range(end=end, periods=20)
+    bdate_range(start=start, periods=20)

  The start and end dates are strictly inclusive. So it will not generate any
  dates outside of those dates if specified.

- date_range is a valid Index
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~

- One of the main uses for ``date_range`` is as an index for pandas objects. When
- working with a lot of time series data, there are several reasons to use
- ``date_range`` objects when possible:
+ DatetimeIndex
+ ~~~~~~~~~~~~~
+
+ One of the main uses for ``DatetimeIndex`` is as an index for pandas objects.
+ The ``DatetimeIndex`` class contains many timeseries-related optimizations:

  - A large range of dates for various offsets are pre-computed and cached
    under the hood in order to make generating subsequent date ranges very fast
    (just have to grab a slice)
- - Fast shifting using the ``shift`` method on pandas objects
- - Unioning of overlapping date_range objects with the same frequency is very
-   fast (important for fast data alignment)
+ - Fast shifting using the ``shift`` and ``tshift`` methods on pandas objects
+ - Unioning of overlapping DatetimeIndex objects with the same frequency is
+   very fast (important for fast data alignment)

- The ``date_range`` is a valid index and can even be intelligent when doing
- slicing, etc.
+ ``DatetimeIndex`` can be used like a regular index and offers all of its
+ intelligent functionality like selection, slicing, etc.

  .. ipython:: python

@@ -301,8 +391,8 @@ slicing, etc.
     ts[:5].index
     ts[::2].index

- More complicated fancy indexing will result in an ``Index`` that is no longer a
- ``date_range``, however:
+ However, complicated fancy indexing that breaks the DatetimeIndex's frequency
+ regularity will result in an ``Index`` that is no longer a ``DatetimeIndex``:

  .. ipython:: python

@@ -335,7 +425,7 @@ and in Panel along the ``major_axis``.

  The shift method accepts an ``offset`` argument which can accept a
  ``DateOffset`` class or other ``timedelta``-like object or also a :ref:`time
- rule <timeseries.timerule>`:
+ rule <timeseries.alias>`:

  .. ipython:: python
