Commit f0c7c41

jd authored and jreback committed
timeseries: add tip about using groupby() rather than resample
As discussed in #11217, there's another way of doing resampling that is not yet covered by `resample` itself. Let's document that.
1 parent 2da34cd commit f0c7c41

1 file changed, +23 -0 lines changed

doc/source/timeseries.rst

@@ -1246,6 +1246,29 @@ previous versions, resampling had to be done using a combination of
 function on the grouped object. This was not nearly as convenient or performant
 as the new pandas timeseries API.

+Sparse timeseries
+~~~~~~~~~~~~~~~~~
+
+If your timeseries are sparse, be aware that upsampling will generate a lot of
+intermediate points filled with whatever is passed as ``fill_method``. What
+``resample`` does is basically a group by and then applying an aggregation
+method on each of its groups, which can also be achieved with something like the
+following.
+
+.. ipython:: python
+
+   def round(t, freq):
+       # round a Timestamp to a specified freq
+       return Timestamp((t.value // freq.delta.value) * freq.delta.value)
+
+   from functools import partial
+
+   rng = date_range('1/1/2012', periods=100, freq='S')
+
+   ts = Series(randint(0, 500, len(rng)), index=rng)
+
+   ts.groupby(partial(round, freq=offsets.Minute(3))).sum()
+
 .. _timeseries.periods:

Time Span Representation
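
The point of the added "Sparse timeseries" section can be sketched without the hand-written rounding helper. What follows is not the code from the commit. It is a minimal illustration that assumes a pandas release where `DatetimeIndex.floor` is available, and the four-point series is made up for the example.

```python
import pandas as pd

# Hypothetical sparse series: four observations spread over almost a full day.
index = pd.to_datetime([
    "2012-01-01 00:00:10",
    "2012-01-01 00:01:30",
    "2012-01-01 12:00:00",
    "2012-01-01 23:59:59",
])
ts = pd.Series([1, 2, 3, 4], index=index)

# resample() emits every 3-minute bin between the first and last observation,
# including the empty ones (sum() reports 0 for an empty bin).
resampled = ts.resample("3min").sum()

# Grouping by the floored timestamps only creates bins that contain data.
grouped = ts.groupby(ts.index.floor("3min")).sum()

print(len(resampled), len(grouped))  # 480 3
```

With this input `resample` produces 480 bins, of which 477 are empty, while the group-by result holds only the 3 occupied bins. Flooring the index plays the same role as the `round` helper in the diff: it maps each timestamp to the start of its 3-minute bin before aggregating.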
