DOC: update pandas.core.groupby.DataFrameGroupBy.resample docstring. #20374

134 changes: 132 additions & 2 deletions pandas/core/groupby/groupby.py
@@ -1460,8 +1460,138 @@ def describe(self, **kwargs):
@Appender(_doc_template)
def resample(self, rule, *args, **kwargs):
"""
Provide resampling when using a TimeGrouper
Return a new grouper with our resampler appended
Provide resampling when using a TimeGrouper.

Given a grouper, the function resamples it according to a frequency
string ("string" -> "frequency").

See the :ref:`frequency aliases <timeseries.offset-aliases>`
documentation for more details.

Parameters
----------
rule : str or Offset
Review comment (Member):

Offset --> DateOffset

The offset string or object representing target grouper conversion.
*args, **kwargs
For compatibility with other groupby methods. See below for some
example parameters.
closed : {'right', 'left'}
Which side of bin interval is closed.
label : {'right', 'left'}
Which bin edge label to label bucket with.
loffset : timedelta
Adjust the resampled time labels.
Review comment (Member):

These parameters are not in the signature, are they the possible kwargs? If that's the case, we can add them as a list in the kwargs description.

Reply (Contributor Author):

OK, these are just some samples of the kwargs, as they were requested.
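
For reference, a minimal sketch of how `closed` and `label` reach the resampler as keyword arguments even though they are not named in the `resample(self, rule, *args, **kwargs)` signature. The toy frame and the variable names `default` and `right` are assumptions made for illustration and are not part of the PR; the `'min'` alias is used in place of `'T'` because newer pandas releases deprecate the latter.

```python
import pandas as pd

# Hypothetical toy data: four one-minute rows, all in a single group.
idx = pd.date_range("2000-01-01", periods=4, freq="min")
df = pd.DataFrame({"a": [0, 0, 0, 0], "b": [1, 1, 1, 1]}, index=idx)

# Default binning: two-minute bins, closed and labelled on the left edge.
default = df.groupby("a")["b"].resample("2min").sum()

# Same rule, with `closed` and `label` forwarded through **kwargs.
right = df.groupby("a")["b"].resample(
    "2min", closed="right", label="right"
).sum()

print(default)
print(right)
```

Closing the bins on the right moves the first timestamp into a bin of its own, so the second result has one more row than the first.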


Returns
-------
Grouper
Return a new grouper with our resampler appended.

Examples
--------
Start by creating a length-9 DataFrame with minute frequency.

>>> idx = pd.date_range('1/1/2000', periods=9, freq='T')
>>> df = pd.DataFrame(data=9 * [range(4)],
... index=idx,
... columns=['a', 'b', 'c', 'd'])
>>> df.iloc[6, 0] = 5
>>> df
a b c d
2000-01-01 00:00:00 0 1 2 3
2000-01-01 00:01:00 0 1 2 3
2000-01-01 00:02:00 0 1 2 3
2000-01-01 00:03:00 0 1 2 3
2000-01-01 00:04:00 0 1 2 3
2000-01-01 00:05:00 0 1 2 3
2000-01-01 00:06:00 5 1 2 3
2000-01-01 00:07:00 0 1 2 3
2000-01-01 00:08:00 0 1 2 3
Review comment (Member):

Do you think it'd be possible to use a more compact example? Maybe 4 rows of 1 minute intervals, that can be downsampled to 2 rows of 30 seconds? Also, I think 2 columns should be enough.
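
One possible shape for such a compact example, offered as a sketch and not as the version that belongs in the docstring. It assumes a 4-row, 2-column frame with one-minute spacing; the column values and the names `down` and `up` are made up for illustration. Going from one minute to 30 seconds is upsampling, so the sketch shows a 2-minute downsample alongside it. Column `b` is selected explicitly so the result does not depend on how a given pandas version treats the grouping column.

```python
import pandas as pd

# Hypothetical compact frame: 4 rows at one-minute spacing, 2 columns.
idx = pd.date_range("2000-01-01", periods=4, freq="min")
df = pd.DataFrame({"a": [0, 0, 0, 5], "b": [1, 1, 1, 1]}, index=idx)

# Downsample each group into 2-minute bins and sum column b.
down = df.groupby("a")["b"].resample("2min").sum()

# Upsample each group into 30-second bins; bins with no rows sum to 0.
up = df.groupby("a")["b"].resample("30s").sum()

print(down)
print(up)
```

Each group is resampled over its own time span only, which is why group `5`, with a single row, yields a single bin in both results.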


Downsample the DataFrame into 3 minute bins and sum the values of
the timestamps falling into a bin.

>>> df.groupby('a').resample('3T').sum()
a b c d
a
0 2000-01-01 00:00:00 0 3 6 9
2000-01-01 00:03:00 0 3 6 9
2000-01-01 00:06:00 0 2 4 6
5 2000-01-01 00:06:00 5 1 2 3

Upsample the series into 30 second bins.

>>> df.groupby('a').resample('30S').sum()
a b c d
a
0 2000-01-01 00:00:00 0 1 2 3
2000-01-01 00:00:30 0 0 0 0
2000-01-01 00:01:00 0 1 2 3
2000-01-01 00:01:30 0 0 0 0
2000-01-01 00:02:00 0 1 2 3
2000-01-01 00:02:30 0 0 0 0
2000-01-01 00:03:00 0 1 2 3
2000-01-01 00:03:30 0 0 0 0
2000-01-01 00:04:00 0 1 2 3
2000-01-01 00:04:30 0 0 0 0
2000-01-01 00:05:00 0 1 2 3
2000-01-01 00:05:30 0 0 0 0
2000-01-01 00:06:00 0 0 0 0
2000-01-01 00:06:30 0 0 0 0
2000-01-01 00:07:00 0 1 2 3
2000-01-01 00:07:30 0 0 0 0
2000-01-01 00:08:00 0 1 2 3
5 2000-01-01 00:06:00 5 1 2 3

Resample by month. Values are assigned to the month of the period.

>>> df.groupby('a').resample('M').sum()
a b c d
a
0 2000-01-31 0 8 16 24
5 2000-01-31 5 1 2 3

Downsample the series into 3 minute bins as above, but close the right
side of the bin interval.

>>> df.groupby('a').resample('3T', closed='right').sum()
a b c d
a
0 1999-12-31 23:57:00 0 1 2 3
2000-01-01 00:00:00 0 3 6 9
2000-01-01 00:03:00 0 2 4 6
2000-01-01 00:06:00 0 2 4 6
5 2000-01-01 00:03:00 5 1 2 3

Downsample the series into 3 minute bins and close the right side of
the bin interval, but label each bin using the right edge instead of
the left.

>>> df.groupby('a').resample('3T', closed='right', label='right').sum()
a b c d
a
0 2000-01-01 00:00:00 0 1 2 3
2000-01-01 00:03:00 0 3 6 9
2000-01-01 00:06:00 0 2 4 6
2000-01-01 00:09:00 0 2 4 6
5 2000-01-01 00:06:00 5 1 2 3

Add an offset of twenty seconds.

>>> df.groupby('a').resample('3T', loffset='20s').sum()
a b c d
a
0 2000-01-01 00:00:20 0 3 6 9
2000-01-01 00:03:20 0 3 6 9
2000-01-01 00:06:20 0 2 4 6
5 2000-01-01 00:06:20 5 1 2 3

See Also
--------
pandas.Grouper : specify a frequency to resample with when
grouping by a key.
DatetimeIndex.resample : Frequency conversion and resampling of
time series.
"""
from pandas.core.resample import get_resampler_for_grouping
return get_resampler_for_grouping(self, rule, *args, **kwargs)