API: change .resample to be a groupby-like API #11732

jreback · 2015-12-01T12:17:44Z

similar to #11603

this would transform:

s.resample('D',how='max')

to

s.resample('D').max()

This would be a breaking API change: the current default is how='mean', so s.resample('D') returns the mean of the resampled data. However, the break would at least be visible, rather than silently changing the behavior of working code.
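A minimal sketch of the proposed spelling, assuming a pandas version in which the groupby-like API is available. The sample data and variable names are made up for illustration. The reduction is named explicitly, so nothing relies on an implicit mean:

```python
import pandas as pd

# Illustrative data (an assumption): three observations spread over three days.
idx = pd.to_datetime(["2000-01-01 06:00", "2000-01-01 12:00", "2000-01-03 00:00"])
s = pd.Series([0, 1, 2], index=idx)

# Proposed style: resample() only defines the daily bins ...
r = s.resample("D")

# ... and the reduction is chosen by an explicit method call.
daily_max = r.max()
daily_mean = r.mean()  # what the old default how='mean' computed
```

Here `r` is a deferred object rather than a Series, which is why code that relied on the old default would fail visibly instead of quietly producing different numbers.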

This would bring .resample (which is just a groupby-type operation under the hood anyhow) in line with the API syntax of .groupby, .rolling, et al.
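To illustrate the "groupby under the hood" point, here is a small sketch comparing the two spellings. Note that `pd.Grouper(freq=...)` is not mentioned in this thread; it is used here as an assumption about the public way to group by time bins, and the data is invented:

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption), with a gap on 2000-01-02.
idx = pd.to_datetime(["2000-01-01 06:00", "2000-01-01 12:00", "2000-01-03 00:00"])
s = pd.Series([0.0, 1.0, 2.0], index=idx)

# Resample into daily bins and take the mean of each bin.
via_resample = s.resample("D").mean()

# The same daily bins expressed as a groupby on the index.
via_groupby = s.groupby(pd.Grouper(freq="D")).mean()

# Both results share the same bins and values (the empty bin is NaN in both).
same = via_resample.index.equals(via_groupby.index) and np.allclose(
    via_resample.to_numpy(), via_groupby.to_numpy(), equal_nan=True
)
```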

Furthermore, this would allow getitem / aggregate type operations with minimal effort, e.g.

s.resample('D').agg(['min','max'])
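A hedged sketch of what the getitem / aggregate operations could look like, assuming the resampler behaves like a groupby object. The frame, its column names `a` and `b`, and the 12-hour spacing are illustrative assumptions:

```python
import pandas as pd

# Illustrative frame (an assumption): six rows, 12 hours apart, so two rows per day.
idx = pd.date_range("2015-12-01", periods=6, freq=pd.Timedelta(hours=12))
df = pd.DataFrame({"a": [0, 1, 2, 3, 4, 5], "b": [10, 8, 6, 4, 2, 0]}, index=idx)

# Several aggregations at once on a Series: one output column per function.
a_stats = df["a"].resample("D").agg(["min", "max"])

# getitem on the resampler: pick a column first, then reduce it.
b_max = df.resample("D")["b"].max()
```

The first call yields a frame with `min` and `max` columns, one row per day; the second selects column `b` from the resampler before reducing, just as one would with .groupby.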


shoyer · 2015-12-01T19:48:21Z

This change would also eliminate the need for many of the current use cases of pd.TimeGrouper, which is a nice thing because that API is pretty well hidden right now.

This API will work well for downsampling (to a coarser time resolution), but it's not clear to me how it would work for upsampling or combined down/upsampling. For example, how would you upsample from daily to hourly data using forward filling with the new API? s.resample('H').mean(fill_method='pad')? Using a method like mean is a bit confusing in this context.

jreback · 2015-12-01T20:21:39Z

s.resample('H').pad()

jreback · 2015-12-01T20:39:48Z

I am not sure that combined up/downsampling is even possible now?

jreback · 2015-12-01T21:37:55Z

or maybe to be more in-line

s.resample('H').ffill()
s.resample('H').fillna(method='pad')

(or all the above)

I guess

s.upsample('H').ffill() is also possible :)
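As a sketch of the upsampling case, assuming the `.ffill()` spelling from the options above. The daily data is made up, and a `pd.Timedelta` is used for the rule only to sidestep frequency-alias spelling differences between pandas versions:

```python
import pandas as pd

# Illustrative daily data (an assumption).
idx = pd.date_range("2000-01-01", periods=3, freq="D")
s = pd.Series([0, 1, 2], index=idx)

half_day = pd.Timedelta(hours=12)  # target resolution: finer than the input

# Upsample without filling: the newly created timestamps hold NaN.
gaps = s.resample(half_day).asfreq()

# Upsample with forward fill: each new timestamp takes the last known value.
filled = s.resample(half_day).ffill()
```

No reduction such as mean appears here: when upsampling, every bin holds at most one observation, so the only question is how to fill the new slots.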

shoyer · 2015-12-01T22:39:09Z

Here's a simple example of combined up/downsampling:

In [25]: idx = pd.to_datetime(['2000-01-01T06', '2000-01-01T12', '2000-01-03T00'])

In [26]: s = pd.Series(range(3), idx)

In [27]: s
Out[27]:
2000-01-01 06:00:00    0
2000-01-01 12:00:00    1
2000-01-03 00:00:00    2
dtype: int64

In [28]: s.resample('1D')
Out[28]:
2000-01-01    0.5
2000-01-02    NaN
2000-01-03    2.0
Freq: D, dtype: float64

In [29]: s.resample('1D', fill_method='pad')
Out[29]:
2000-01-01    0.5
2000-01-02    0.5
2000-01-03    2.0
Freq: D, dtype: float64

jreback · 2015-12-01T23:54:01Z

I suppose we could have an optional fill_method kw in the Resample object, e.g. s.resample('D', fill_method='pad'), if necessary (similar to how .reindex has this, though normally you would do .reindex().ffill()).

e.g.

In [23]: s.resample('1D',how='mean').ffill()
Out[23]: 
2000-01-01    0.5
2000-01-02    0.5
2000-01-03    2.0
Freq: D, dtype: float64

which I would write as:
s.resample('1D').mean().ffill()

I guess fill_method would then apply while doing the intra-day mean (though I can't see a use case for this).
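Putting the combined case together as a sketch, using the same three-point series as the example above and assuming the chained spelling: the mean handles the downsampled day, and the forward fill then covers the day with no observations.

```python
import pandas as pd

# Same data as the combined up/downsampling example in this thread.
idx = pd.to_datetime(["2000-01-01 06:00", "2000-01-01 12:00", "2000-01-03 00:00"])
s = pd.Series(range(3), index=idx)

# Step 1: downsample to daily means; 2000-01-02 has no data and becomes NaN.
daily = s.resample("1D").mean()

# Step 2: fill the empty day from the previous day's mean.
filled = daily.ffill()
```

Because the fill runs on the already aggregated result, it is an ordinary Series operation and needs no special keyword on the resampler.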

jreback · 2015-12-02T03:30:32Z

POC

In [3]: s = Series(np.random.rand(1000), pd.date_range('20130101 09:00:00',freq='Min',periods=1000))

In [4]: r = s.resample2('H')

In [5]: r
Out[5]: DatetimeIndexResampler [freq-><Hour>,axis->0,closed->left,label->left,convention->start,base->0]

In [6]: r.
r.agg        r.aggregate  r.ax         r.mean       r.name       

In [6]: r.mean()
Out[6]: 
2013-01-01 09:00:00    0.463474
2013-01-01 10:00:00    0.496552
2013-01-01 11:00:00    0.467690
2013-01-01 12:00:00    0.542037
2013-01-01 13:00:00    0.500808
2013-01-01 14:00:00    0.541115
2013-01-01 15:00:00    0.549489
2013-01-01 16:00:00    0.567870
2013-01-01 17:00:00    0.466067
2013-01-01 18:00:00    0.468675
2013-01-01 19:00:00    0.520051
2013-01-01 20:00:00    0.495800
2013-01-01 21:00:00    0.496541
2013-01-01 22:00:00    0.437051
2013-01-01 23:00:00    0.514727
2013-01-02 00:00:00    0.517313
2013-01-02 01:00:00    0.501945
Freq: H, dtype: float64
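A sketch of the same session against the regular `.resample` method, assuming a pandas release where the groupby-like API is available (`resample2` above is the proof-of-concept name only). The seeded generator and the `pd.Timedelta` rule are choices made here for reproducibility and to avoid alias spelling differences between versions:

```python
import numpy as np
import pandas as pd

# 1000 one-minute observations starting at 09:00, as in the POC.
rng = np.random.default_rng(0)  # seeded for reproducibility (an assumption)
idx = pd.date_range("2013-01-01 09:00:00", freq="min", periods=1000)
s = pd.Series(rng.random(1000), index=idx)

# Creating the resampler computes nothing yet; it only records the hourly binning.
r = s.resample(pd.Timedelta(hours=1))

# Work happens when a reduction is requested.
hourly_mean = r.mean()
hourly_stats = r.agg(["mean", "std"])
```

The 1000 minutes run from 09:00 until 01:39 the next day, so the result has 17 hourly bins, matching the output above.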

- original API detection & warning
- support for isinstance / numeric ops
- support for comparison ops
- DOC: documentation updates w.r.t. aggregation

jreback added the API Design, Resample (resample method), and Difficulty Advanced labels Dec 1, 2015

jreback added this to the 0.18.0 milestone Dec 1, 2015

jreback changed the title ~~API: change .resample to be a groupby-like operation~~ API: change .resample to be a groupby-like API Dec 1, 2015

jreback mentioned this issue Dec 14, 2015

Refactored Resample API breaking change #11841

Closed

2 tasks

jreback added a commit to jreback/pandas that referenced this issue Dec 23, 2015

ENH: .resample API to groupby-like class, pandas-dev#11732

5b59fc0

jreback added a commit to jreback/pandas that referenced this issue Feb 2, 2016

ENH: .resample API to groupby-like class, pandas-dev#11732

e570570

- original API detection & warning
- support for isinstance / numeric ops
- support for comparison ops
- DOC: documentation updates w.r.t. aggregation

jreback closed this as completed in 1dc49f5 Feb 2, 2016
