Rolling.apply documentation not clear #25656

js3711 · 2019-03-11T13:07:05Z

The current documentation for rolling.apply states:

func : function
Must produce a single value from an ndarray input if raw=True or a Series if raw=False.

This makes it sound like a series consisting of multiple values can be returned by the apply func. This is not true. The following error results:

Pandas v 0.24.1

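import numpy as np
import pandas as pd

example = pd.Series(np.random.rand(10000,))
roll = example.rolling(100)  # assumed definition; `roll` was not defined in the original post
samples = roll.apply(lambda x: pd.Series([1]), raw=True)
example.rolling(100).apply(lambda x: pd.Series([1,2,3,4]), raw=False)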

~/anaconda3/envs/pandas/core/series.py in wrapper(self)
     91             return converter(self.iloc[0])
     92         raise TypeError("cannot convert the series to "
---> 93                         "{0}".format(str(converter)))
     94 
     95     wrapper.__name__ = "__{name}__".format(name=converter.__name__)
TypeError: cannot convert the series to <class 'float'>

WillAyd · 2019-03-11T17:34:35Z

The func argument is documented as follows:

Must produce a single value from an ndarray input if raw=True or a Series if raw=False.

So the problem is that your expression doesn't reduce to a scalar.

What distinction are you trying to make with Series here? This will also fail if you replace the Series with a NumPy array (though admittedly with a different error message).
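To illustrate the point about reducing to a scalar, here is a minimal sketch (the series `s` and both lambdas are made up for illustration, not taken from the issue). It shows that `raw` only changes the type of the *input* passed to the function; in both cases the function returns a single number per window.

```python
import numpy as np
import pandas as pd

# Hypothetical input: the values 0.0 .. 9.0
s = pd.Series(np.arange(10, dtype=float))

# raw=True: each window arrives as a NumPy ndarray.
# The function reduces it to one scalar (the range of the window).
as_array = s.rolling(3).apply(lambda x: x.max() - x.min(), raw=True)

# raw=False: each window arrives as a pandas Series.
# The function still has to reduce it to one scalar.
as_series = s.rolling(3).apply(lambda x: x.iloc[-1] - x.iloc[0], raw=False)
```

Because the input increases by 1 at each step, both calls give the same number for every full window, and the first two positions are NaN because the window is not yet full.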

js3711 · 2019-03-13T17:02:42Z

Yup, that makes it clearer. It initially read as though multiple values could be returned through the Series.

lycanthropes · 2019-03-16T02:31:37Z

> The func argument is documented as follows:
> Must produce a single value from an ndarray input if raw=True or a Series if raw=False.
> So the problem is that your expression doesn't reduce to a scalar.
>
> What distinction are you trying to make with Series here? This will fail if you replace with a NumPy array as well (though admittedly with a different error message)

I want to do something like this:
df.groupby('stock_code').rolling(window=12).apply(wavg,'column1',weight_variable)
That is, apply a given weight variable to a rolling column (here df.column1). But I cannot work out a usable wavg function. My wavg function is:
def wavg(group, avg_name):
    d = group[avg_name]
    w = pd.Series(data=[0,0,0.5,0,0,0.5,0,0,0.5,0,0,1])
    d = d.reset_index(drop=True)
    try:
        return (d * w).sum() / w.sum()
    except ZeroDivisionError:
        return np.nan
But when I run df.groupby('stock_code').rolling(window=12).apply(wavg,'column1',weight_variable), it always raises a TypeError: wavg() missing 1 required positional argument: 'avg_name'
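One possible way to approach this, offered as a sketch rather than a confirmed answer from the thread: select the column before calling `rolling`, so the function receives one window of that column at a time, and pass the weights through the `args` parameter of `apply`. The sample DataFrame, the `WEIGHTS` name, and the rewritten `wavg` signature below are assumptions made for illustration. The weights are the ones hard-coded in the function above.

```python
import numpy as np
import pandas as pd

# Weights copied from the wavg function in the question: one per row of a 12-row window.
WEIGHTS = np.array([0, 0, 0.5, 0, 0, 0.5, 0, 0, 0.5, 0, 0, 1])

def wavg(window, weights):
    """Reduce one rolling window (a 1-D ndarray, since raw=True) to a weighted mean."""
    total = weights.sum()
    if total == 0:
        return np.nan
    return float(np.dot(window, weights) / total)

# Hypothetical data: two stocks with 13 rows each.
df = pd.DataFrame({
    "stock_code": ["A"] * 13 + ["B"] * 13,
    "column1": np.arange(26, dtype=float),
})

result = (
    df.groupby("stock_code")["column1"]  # pick the column first
      .rolling(window=12)
      .apply(wavg, raw=True, args=(WEIGHTS,))  # extra arguments go through args
)
```

With `raw=True` the window is a plain array, so no `reset_index` is needed to line it up with the weights. The sum of the weights is checked explicitly because dividing NumPy floats by zero does not raise `ZeroDivisionError`.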

WillAyd added the "Usage Question" and "Error Reporting" (Incorrect or improved errors from pandas) labels on Mar 11, 2019

bharatr21 mentioned this issue on Mar 13, 2019 in "Make Rolling.apply documentation clearer" #25712 (merged)

WillAyd closed this as completed in #25712 Mar 14, 2019
