@@ -68,7 +68,7 @@ Resample:
.. ipython:: python

   # Daily means
-    ts.resample('D', how='mean')
+    ts.resample('D').mean()


.. _timeseries.overview:
@@ -1211,6 +1211,11 @@ Converting to Python datetimes
Resampling
----------

+ .. warning::
+
+    The interface to ``.resample`` has changed in 0.18.0 to be more groupby-like and hence more flexible.
+    See the :ref:`whatsnew docs <whatsnew_0180.breaking.resample>` for a comparison with prior versions.
+
Pandas has simple, powerful, and efficient functionality for
performing resampling operations during frequency conversion (e.g., converting
secondly data into 5-minutely data). This is extremely common in, but not
@@ -1226,7 +1231,7 @@ See some :ref:`cookbook examples <cookbook.resample>` for some advanced strategi

   ts = Series(randint(0, 500, len(rng)), index=rng)

-    ts.resample('5Min', how='sum')
+    ts.resample('5Min').sum()

The ``resample`` function is very flexible and allows you to specify many
different parameters to control the frequency conversion and resampling
@@ -1237,11 +1242,11 @@ an array and produces aggregated values:

.. ipython:: python

-    ts.resample('5Min') # default is mean
+    ts.resample('5Min').mean()

-    ts.resample('5Min', how='ohlc')
+    ts.resample('5Min').ohlc()

-    ts.resample('5Min', how=np.max)
+    ts.resample('5Min').max()
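
The method-style calls in this hunk can be tried on a small deterministic series. This is a minimal sketch, not part of the commit: the series ``ts_small`` and its values are illustrative assumptions, and the alias ``'5min'`` stands in for ``'5Min'``.

```python
import pandas as pd

# Illustrative data (an assumption): ten one-minute observations, 0..9.
rng = pd.date_range('2012-01-01', periods=10, freq='min')
ts_small = pd.Series(range(10), index=rng)

# resample() returns a Resampler; the aggregation runs when a method is called.
r = ts_small.resample('5min')

means = r.mean()   # one mean per 5-minute bin
highs = r.max()    # maximum of each bin
bars = r.ohlc()    # DataFrame with open/high/low/close columns

print(means)
print(bars)
```

Because the ``Resampler`` is created once and reused, several aggregations can be computed from the same binning without repeating the frequency argument.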

Any function available via :ref:`dispatching <groupby.dispatch>` can be given to
the ``how`` parameter by name, including ``sum``, ``mean``, ``std``, ``sem``,
@@ -1252,9 +1257,9 @@ end of the interval is closed:

.. ipython:: python

-    ts.resample('5Min', closed='right')
+    ts.resample('5Min', closed='right').mean()

-    ts.resample('5Min', closed='left')
+    ts.resample('5Min', closed='left').mean()

Parameters like ``label`` and ``loffset`` are used to manipulate the resulting
labels. ``label`` specifies whether the result is labeled with the beginning or
@@ -1263,11 +1268,11 @@ labels.

.. ipython:: python

-    ts.resample('5Min') # by default label='left'
+    ts.resample('5Min').mean() # by default label='left'

-    ts.resample('5Min', label='left')
+    ts.resample('5Min', label='left').mean()

-    ts.resample('5Min', label='left', loffset='1s')
+    ts.resample('5Min', label='left', loffset='1s').mean()
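
The effect of ``closed`` and ``label`` is easier to see with values that can be summed by hand. The sketch below is an illustration only; ``ts_small`` and its values are assumptions, and both parameters are passed explicitly so the result does not depend on any default.

```python
import pandas as pd

# Illustrative data (an assumption): ten one-minute observations, 0..9.
rng = pd.date_range('2012-01-01', periods=10, freq='min')
ts_small = pd.Series(range(10), index=rng)

# Bins [00:00, 00:05) and [00:05, 00:10), each labeled by its left edge.
left = ts_small.resample('5min', closed='left', label='left').sum()

# Bins (23:55, 00:00], (00:00, 00:05], (00:05, 00:10], each labeled by its right edge.
right = ts_small.resample('5min', closed='right', label='right').sum()

print(left)
print(right)
```

With the right edge closed, the observation at 00:00 falls into a bin of its own that ends at 00:00, which is why ``right`` has three rows while ``left`` has two.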

The ``axis`` parameter can be set to 0 or 1 and allows you to resample the
specified axis for a DataFrame.
@@ -1284,18 +1289,17 @@ frequency periods.
Up Sampling
~~~~~~~~~~~

- For upsampling, the ``fill_method`` and ``limit`` parameters can be specified
- to interpolate over the gaps that are created:
+ For upsampling, you can specify a way to upsample and the ``limit`` parameter to interpolate over the gaps that are created:

.. ipython:: python

   # from secondly to every 250 milliseconds

-    ts[:2].resample('250L')
+    ts[:2].resample('250L').asfreq()

-    ts[:2].resample('250L', fill_method='pad')
+    ts[:2].resample('250L').ffill()

-    ts[:2].resample('250L', fill_method='pad', limit=2)
+    ts[:2].resample('250L').ffill(limit=2)
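
A smaller case shows what each upsampling call produces. This is a sketch under stated assumptions: the two-point series ``ts_small`` is made up for illustration, it upsamples from 4-minute to 1-minute spacing rather than from seconds to 250 milliseconds, and ``asfreq()`` is used to expose the unfilled gaps.

```python
import pandas as pd

# Illustrative data (an assumption): two observations four minutes apart.
rng = pd.date_range('2012-01-01', periods=2, freq='4min')
ts_small = pd.Series([1.0, 2.0], index=rng)

gaps = ts_small.resample('min').asfreq()          # new slots are left as NaN
filled = ts_small.resample('min').ffill()         # carry the last value forward
capped = ts_small.resample('min').ffill(limit=2)  # fill at most 2 consecutive slots

print(gaps)
print(capped)
```

With ``limit=2`` the third new slot stays ``NaN``, because only two consecutive missing values are filled after each observation.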

Sparse Resampling
~~~~~~~~~~~~~~~~~
@@ -1317,7 +1321,7 @@ If we want to resample to the full range of the series

.. ipython:: python

-    ts.resample('3T', how='sum')
+    ts.resample('3T').sum()

We can instead only resample those groups where we have points as follows:

@@ -1333,6 +1337,74 @@ We can instead only resample those groups where we have points as follows:

   ts.groupby(partial(round, freq='3T')).sum()
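
The difference between the two approaches can be sketched on a tiny sparse series. Note the substitution: the ``round`` helper used with ``partial`` above is defined in lines this diff does not show, so this sketch groups on ``DatetimeIndex.floor`` instead. The series ``ts_sparse`` and its values are illustrative assumptions.

```python
import pandas as pd

# Illustrative sparse data (an assumption): two points on each of two days.
idx = pd.to_datetime(['2014-01-01 00:00', '2014-01-01 00:01',
                      '2014-01-02 00:00', '2014-01-02 00:01'])
ts_sparse = pd.Series([1, 2, 3, 4], index=idx)

# Resampling covers the full range, so it also emits every empty 3-minute bin.
dense = ts_sparse.resample('3min').sum()

# Grouping on the floored timestamps keeps only bins that contain points.
sparse = ts_sparse.groupby(ts_sparse.index.floor('3min')).sum()

print(len(dense), len(sparse))
print(sparse)
```

Both results carry the same total; the grouped version simply skips the bins that have no observations.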

+ Aggregation
+ ~~~~~~~~~~~
+
+ Similar to :ref:`groupby aggregates <groupby.aggregate>` and the :ref:`window functions <stats.aggregate>`, a ``Resampler`` can be
+ selectively aggregated.
+
+ When resampling a ``DataFrame``, the default is to act on all columns with the same function.
+
+ .. ipython:: python
+
+    df = pd.DataFrame(np.random.randn(1000, 3),
+                      index=pd.date_range('1/1/2012', freq='S', periods=1000),
+                      columns=['A', 'B', 'C'])
+    r = df.resample('3T')
+    r.mean()
+
+ We can select a specific column or columns using standard getitem.
+
+ .. ipython:: python
+
+    r['A'].mean()
+
+    r[['A','B']].mean()
+
+ You can pass a list or dict of functions to do aggregation with, outputting a DataFrame:
+
+ .. ipython:: python
+
+    r['A'].agg([np.sum, np.mean, np.std])
+
+ If a dict is passed, the keys will be used to name the columns. Otherwise the
+ function's name (stored in the function object) will be used.
+
+ .. ipython:: python
+
+    r['A'].agg({'result1' : np.sum,
+                'result2' : np.mean})
+
+ On a resampled DataFrame, you can pass a list of functions to apply to each
+ column, which produces an aggregated result with a hierarchical index:
+
+ .. ipython:: python
+
+    r.agg([np.sum, np.mean])
+
+ By passing a dict to ``aggregate`` you can apply a different aggregation to the
+ columns of a DataFrame:
+
+ .. ipython:: python
+    :okexcept:
+
+    r.agg({'A' : np.sum,
+           'B' : lambda x: np.std(x, ddof=1)})
+
+ The function names can also be strings. In order for a string to be valid it
+ must name a method implemented on the ``Resampler`` object.
+
+ .. ipython:: python
+
+    r.agg({'A' : 'sum', 'B' : 'std'})
+
+ Furthermore, you can pass a nested dict to indicate different aggregations on different columns.
+
+ .. ipython:: python
+
+    r.agg({'A' : ['sum','std'], 'B' : ['mean','std']})
+
+
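
The aggregation forms added in this hunk can be checked by hand on a small frame. This is a minimal sketch, not the commit's own example: ``df_small`` and its values are illustrative assumptions, and string function names are used throughout.

```python
import pandas as pd

# Illustrative data (an assumption): six one-minute rows, two columns.
df_small = pd.DataFrame(
    {'A': [1, 2, 3, 4, 5, 6], 'B': [10, 20, 30, 40, 50, 60]},
    index=pd.date_range('2012-01-01', periods=6, freq='min'),
)
r = df_small.resample('3min')

col_a = r['A'].mean()                          # select one column, then aggregate
per_column = r.agg({'A': 'sum', 'B': 'mean'})  # a different function per column
multi = r.agg(['sum', 'mean'])                 # every function on every column

print(per_column)
print(multi)
```

Passing a list yields hierarchical columns with the original column name on the outer level and the function name on the inner level, while the dict form keeps flat columns.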
.. _timeseries.periods:

Time Span Representation