Commit d9331ff

Add scaffold docstring for groupby.DataFrameGroupBy.resample.
1 parent 985ebd4 commit d9331ff

1 file changed

+63
-144
lines changed

pandas/core/groupby.py

Lines changed: 63 additions & 144 deletions
Original file line number | Diff line number | Diff line change
@@ -1482,11 +1482,13 @@ def resample(self, rule, *args, **kwargs):
14821482
Parameters
14831483
----------
14841484
rule : str
1485-
Medio.
1486-
args
1487-
Hola.
1488-
kwargs
1489-
Chau.
1485+
The offset string or object representing target conversion.
1486+
*args
1487+
These parameters will be passed to the get_resampler_for_grouping
1488+
function.
1489+
**kwargs
1490+
These parameters will be passed to the get_resampler_for_grouping
1491+
function.
14901492
14911493
Returns
14921494
-------
@@ -1496,148 +1498,65 @@ def resample(self, rule, *args, **kwargs):
14961498
Examples
14971499
--------
14981500
1499-
Start by creating a series with 9 one minute timestamps.
1501+
Start by creating a DataFrame with 9 one minute timestamps.
15001502
>>> index = pd.date_range('1/1/2000', periods=9, freq='T')
1501-
>>> series = pd.Series(range(9), index=index)
1502-
>>> series
1503-
2000-01-01 00:00:00 0
1504-
2000-01-01 00:01:00 1
1505-
2000-01-01 00:02:00 2
1506-
2000-01-01 00:03:00 3
1507-
2000-01-01 00:04:00 4
1508-
2000-01-01 00:05:00 5
1509-
2000-01-01 00:06:00 6
1510-
2000-01-01 00:07:00 7
1511-
2000-01-01 00:08:00 8
1512-
Freq: T, dtype: int64
1513-
1514-
Downsample the series into 3 minute bins and sum the values
1515-
of the timestamps falling into a bin.
1516-
>>> series.resample('3T').sum()
1517-
2000-01-01 00:00:00 3
1518-
2000-01-01 00:03:00 12
1519-
2000-01-01 00:06:00 21
1520-
Freq: 3T, dtype: int64
1521-
1522-
Downsample the series into 3 minute bins as above, but label each
1523-
bin using the right edge instead of the left. Please note that the
1524-
value in the bucket used as the label is not included in the bucket,
1525-
which it labels. For example, in the original series the
1526-
bucket ``2000-01-01 00:03:00`` contains the value 3, but the summed
1527-
value in the resampled bucket with the label ``2000-01-01 00:03:00``
1528-
does not include 3 (if it did, the summed value would be 6, not 3).
1529-
To include this value close the right side of the bin interval as
1530-
illustrated in the example below this one.
1531-
>>> series.resample('3T', label='right').sum()
1532-
2000-01-01 00:03:00 3
1533-
2000-01-01 00:06:00 12
1534-
2000-01-01 00:09:00 21
1535-
Freq: 3T, dtype: int64
1536-
1537-
Downsample the series into 3 minute bins as above, but close the right
1538-
side of the bin interval.
1539-
>>> series.resample('3T', label='right', closed='right').sum()
1540-
2000-01-01 00:00:00 0
1541-
2000-01-01 00:03:00 6
1542-
2000-01-01 00:06:00 15
1543-
2000-01-01 00:09:00 15
1544-
Freq: 3T, dtype: int64
1545-
1546-
Upsample the series into 30 second bins.
1547-
>>> series.resample('30S').asfreq()[0:5] #select first 5 rows
1548-
2000-01-01 00:00:00 0.0
1549-
2000-01-01 00:00:30 NaN
1550-
2000-01-01 00:01:00 1.0
1551-
2000-01-01 00:01:30 NaN
1552-
2000-01-01 00:02:00 2.0
1553-
Freq: 30S, dtype: float64
1554-
1555-
Upsample the series into 30 second bins and fill the ``NaN``
1556-
values using the ``pad`` method.
1557-
>>> series.resample('30S').pad()[0:5]
1558-
2000-01-01 00:00:00 0
1559-
2000-01-01 00:00:30 0
1560-
2000-01-01 00:01:00 1
1561-
2000-01-01 00:01:30 1
1562-
2000-01-01 00:02:00 2
1563-
Freq: 30S, dtype: int64
1564-
1565-
Upsample the series into 30 second bins and fill the
1566-
``NaN`` values using the ``bfill`` method.
1567-
>>> series.resample('30S').bfill()[0:5]
1568-
2000-01-01 00:00:00 0
1569-
2000-01-01 00:00:30 1
1570-
2000-01-01 00:01:00 1
1571-
2000-01-01 00:01:30 2
1572-
2000-01-01 00:02:00 2
1573-
Freq: 30S, dtype: int64
1574-
1575-
Pass a custom function via ``apply``
1576-
>>> def custom_resampler(array_like):
1577-
... return np.sum(array_like)+5
1578-
>>> series.resample('3T').apply(custom_resampler)
1579-
2000-01-01 00:00:00 8
1580-
2000-01-01 00:03:00 17
1581-
2000-01-01 00:06:00 26
1582-
Freq: 3T, dtype: int64
1583-
1584-
For a Series with a PeriodIndex, the keyword `convention` can be
1585-
used to control whether to use the start or end of `rule`.
1586-
>>> s = pd.Series([1, 2], index=pd.period_range('2012-01-01', freq='A', periods=2))
1587-
>>> s
1588-
2012 1
1589-
2013 2
1590-
Freq: A-DEC, dtype: int64
1591-
1592-
Resample by month using 'start' `convention`. Values are assigned to
1593-
the first month of the period.
1594-
>>> s.resample('M', convention='start').asfreq().head()
1595-
2012-01 1.0
1596-
2012-02 NaN
1597-
2012-03 NaN
1598-
2012-04 NaN
1599-
2012-05 NaN
1600-
Freq: M, dtype: float64
1601-
1602-
Resample by month using 'end' `convention`. Values are assigned to
1603-
the last month of the period.
1604-
>>> s.resample('M', convention='end').asfreq()
1605-
2012-12 1.0
1606-
2013-01 NaN
1607-
2013-02 NaN
1608-
2013-03 NaN
1609-
2013-04 NaN
1610-
2013-05 NaN
1611-
2013-06 NaN
1612-
2013-07 NaN
1613-
2013-08 NaN
1614-
2013-09 NaN
1615-
2013-10 NaN
1616-
2013-11 NaN
1617-
2013-12 2.0
1618-
Freq: M, dtype: float64
1619-
1620-
For DataFrame objects, the keyword ``on`` can be used to specify the
1621-
column instead of the index for resampling.
1622-
>>> df = pd.DataFrame(data=9*[range(4)], columns=['a', 'b', 'c', 'd'])
1623-
>>> df['time'] = pd.date_range('1/1/2000', periods=9, freq='T')
1624-
>>> df.resample('3T', on='time').sum()
1503+
>>> df = pd.DataFrame(data=9*[range(4)],
1504+
... index=index,
1505+
... columns=['a', 'b', 'c', 'd'])
1506+
>>> df.iloc[[6], [0]] = 5 # change a value for grouping
1507+
>>> df
16251508
a b c d
1626-
time
1627-
2000-01-01 00:00:00 0 3 6 9
1628-
2000-01-01 00:03:00 0 3 6 9
1629-
2000-01-01 00:06:00 0 3 6 9
1630-
1631-
For a DataFrame with MultiIndex, the keyword ``level`` can be used to
1632-
specify on level the resampling needs to take place.
1633-
>>> time = pd.date_range('1/1/2000', periods=5, freq='T')
1634-
1635-
>>> df2 = pd.DataFrame(data=10*[range(4)], columns=['a', 'b', 'c', 'd'], index=pd.MultiIndex.from_product([time, [1, 2]]) )
1636-
>>> df2.resample('3T', level=0).sum()
1637-
a b c d
1638-
2000-01-01 00:00:00 0 6 12 18
1639-
2000-01-01 00:03:00 0 4 8 12
1509+
2000-01-01 00:00:00 0 1 2 3
1510+
2000-01-01 00:01:00 0 1 2 3
1511+
2000-01-01 00:02:00 0 1 2 3
1512+
2000-01-01 00:03:00 0 1 2 3
1513+
2000-01-01 00:04:00 0 1 2 3
1514+
2000-01-01 00:05:00 0 1 2 3
1515+
2000-01-01 00:06:00 5 1 2 3
1516+
2000-01-01 00:07:00 0 1 2 3
1517+
2000-01-01 00:08:00 0 1 2 3
1518+
1519+
>>> series = pd.Series(range(9), index=index) # delete this
1520+
1521+
Downsample the DataFrame into 3 minute bins and sum the values of
1522+
the timestamps falling into a bin.
1523+
>>> df.groupby('a').resample('3T').sum()
1524+
a b c d
1525+
a
1526+
0 2000-01-01 00:00:00 0 3 6 9
1527+
2000-01-01 00:03:00 0 3 6 9
1528+
2000-01-01 00:06:00 0 2 4 6
1529+
5 2000-01-01 00:06:00 5 1 2 3
16401530
1531+
Upsample the DataFrame into 30 second bins.
1532+
>>> df.groupby('a').resample('30S').sum()
1533+
a b c d
1534+
a
1535+
0 2000-01-01 00:00:00 0 1 2 3
1536+
2000-01-01 00:00:30 0 0 0 0
1537+
2000-01-01 00:01:00 0 1 2 3
1538+
2000-01-01 00:01:30 0 0 0 0
1539+
2000-01-01 00:02:00 0 1 2 3
1540+
2000-01-01 00:02:30 0 0 0 0
1541+
2000-01-01 00:03:00 0 1 2 3
1542+
2000-01-01 00:03:30 0 0 0 0
1543+
2000-01-01 00:04:00 0 1 2 3
1544+
2000-01-01 00:04:30 0 0 0 0
1545+
2000-01-01 00:05:00 0 1 2 3
1546+
2000-01-01 00:05:30 0 0 0 0
1547+
2000-01-01 00:06:00 0 0 0 0
1548+
2000-01-01 00:06:30 0 0 0 0
1549+
2000-01-01 00:07:00 0 1 2 3
1550+
2000-01-01 00:07:30 0 0 0 0
1551+
2000-01-01 00:08:00 0 1 2 3
1552+
5 2000-01-01 00:06:00 5 1 2 3
1553+
1554+
Resample by month. Values are assigned to the month of the period.
1555+
>>> df.groupby('a').resample('M').sum()
1556+
a b c d
1557+
a
1558+
0 2000-01-31 0 8 16 24
1559+
5 2000-01-31 5 1 2 3
16411560
"""
16421561
from pandas.core.resample import get_resampler_for_grouping
16431562
return get_resampler_for_grouping(self, rule, *args, **kwargs)
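
The new docstring examples can be tried outside the diff with a short script. The following is a minimal sketch, not part of the commit. It assumes a reasonably recent pandas, spells the minute frequency as "min" rather than the 'T' used in the docstring, and selects the single column 'b' before resampling. The variable names `downsampled` and `right_closed` are illustrative only.

```python
import pandas as pd

# Nine one-minute timestamps, as in the docstring example.
# Assumption: "min" is used as the minute alias instead of 'T'.
index = pd.date_range("2000-01-01", periods=9, freq="min")
df = pd.DataFrame(
    [list(range(4)) for _ in range(9)],
    index=index,
    columns=["a", "b", "c", "d"],
)
df.iloc[6, 0] = 5  # change one value in 'a' so that there are two groups

# Group on 'a', then downsample column 'b' into 3-minute bins within each group.
downsampled = df.groupby("a")["b"].resample("3min").sum()
print(downsampled)

# Extra keyword arguments are forwarded to the resampler; here the bins are
# closed and labelled on the right edge instead of the left.
right_closed = (
    df.groupby("a")["b"].resample("3min", closed="right", label="right").sum()
)
print(right_closed)
```

Selecting one column first keeps the sketch independent of whether a given pandas version includes the grouping column 'a' in the resampled output.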
