API: Table-wise rolling / expanding / EWM function application #15095
Comments
Can you put up a simple example with the various options exercised? (e.g. simulate the output) |
Updated with an example. I also changed the suggested API. Before, I had df.rolling(n, axis=None).apply(f), but really it should be df.rolling(n).apply(f, axis=None). The |
@TomAugspurger correct me if I am wrong, but what you really want is for
? but apply is pretty generic so we don't know what the user wants (but the original implementation was a single column) |
You're correct. This should make things clear:

In [9]: def f(x):
   ...:     print(x)
   ...:     return 0

In [8]: df = pd.DataFrame(np.arange(9).reshape(3, 3))

In [14]: df
Out[14]:
   0  1  2
0  0  1  2
1  3  4  5
2  6  7  8

Currently, and the default in the future, this prints out

In [10]: df.rolling(2).apply(f)
[ 0.  3.]
[ 3.  6.]
[ 1.  4.]
[ 4.  7.]
[ 2.  5.]
[ 5.  8.]

With the new implementation:

In [10]: df.rolling(2).apply(f, axis=None)
[[0, 1, 2],   # first window; 2x3 array
 [3, 4, 5]]
[[3, 4, 5],   # second window; 2x3 array
 [6, 7, 8]] |
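To make the difference concrete without relying on the proposed keyword (which is only a suggestion in this thread), here is a minimal sketch that builds both kinds of windows by hand for the same 3x3 frame. The names column_windows and table_windows are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3))
n = 2  # window length

# Column-wise, matching the order printed by df.rolling(n).apply(f) above:
# all windows of column 0, then column 1, then column 2 (1-D arrays).
column_windows = [
    df[col].to_numpy()[stop - n:stop]
    for col in df.columns
    for stop in range(n, len(df) + 1)
]

# Table-wise, the proposed behavior: one 2-D (n x ncols) array per window.
table_windows = [df.to_numpy()[stop - n:stop] for stop in range(n, len(df) + 1)]
```

The column-wise list holds six length-2 arrays, while the table-wise list holds two 2x3 arrays, which is what f would receive under the proposal.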
@TomAugspurger I know you used I think it's better to follow our current model, IOW receive a DataFrame is very natural. This would be an API change, though even now I think we pass a another possibility is to have |
I ran into a similar issue with a rolling function that uses OLS internally and needs to return more than one column (e.g. the confidence interval). Would the test cases cover also Regarding API, I think it should look like this:
|
This is just an idea. You are welcome to submit a patch for this. |
I definitely agree with this - it fits well with everything else. So is the idea here that because apply() currently works column-wise and not dataframe-wise on |
That's my opinion. We could maybe do this with a deprecation cycle with keywords. |
2 thoughts here:
|
A proposal for the implementation would be:
e.g.
|
In #11603 (comment) (the main PR implementing the deferred API for rolling / expanding / ewm), we discussed how to specify table-wise applys. Groupby.apply(f) feeds the entire group (all columns) to f. For backwards-compatibility, .rolling(n).apply(f) needed to be column-wise.
#11603 (comment) mentions a possible API like what I added for .style:
axis=0: apply to each column independently
axis=1: apply to each row independently
axis=None: apply the supplied function to the entire table
So it'd be df.rolling(n).apply(f, axis=None).
Do people like the axis=0 / 1 / None idiom? Is it obvious enough?
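A rough sketch of how the axis=0 and axis=None cases could behave, written as a standalone helper rather than as a change to pandas. The function rolling_apply is hypothetical; the axis keyword on .rolling(n).apply is only the proposal here, so the table-wise branch slices the windows by hand. axis=1 is left out.

```python
import numpy as np
import pandas as pd


def rolling_apply(df, n, f, axis=0):
    """Hypothetical helper mimicking the proposed axis=0 / axis=None semantics."""
    if axis == 0:
        # Existing behavior: f sees one column's window at a time (1-D array).
        return df.rolling(n).apply(f, raw=True)
    if axis is None:
        # Proposed behavior: f sees the whole n x ncols window (2-D array).
        out = pd.Series(np.nan, index=df.index)
        for stop in range(n, len(df) + 1):
            out.iloc[stop - 1] = f(df.iloc[stop - n:stop].to_numpy())
        return out
    raise NotImplementedError("axis=1 is left out of this sketch")


df = pd.DataFrame(np.arange(9).reshape(3, 3))
col_sums = rolling_apply(df, 2, np.sum, axis=0)       # DataFrame, one value per column
table_sums = rolling_apply(df, 2, np.sum, axis=None)  # Series, one value per window
```

With axis=0 the result keeps the shape of df; with axis=None it collapses to one value per window, and rows without a full window stay NaN.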
This is prompted by @josef-pkt's post on the mailing list about needing a rolling OLS.
An example:
For a concrete example, get the table-wise max (this is equivalent to df.rolling(4).max().max(1)).
A real example is something like a rolling OLS:
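Both the table-wise max and a rolling OLS can be sketched with plain pandas by slicing the windows manually. This is only an illustration under stated assumptions: tablewise_rolling and ols are hypothetical helpers, the data is synthetic (y is an exact linear function of x), and the fit uses numpy.linalg.lstsq rather than any particular statistics package.

```python
import numpy as np
import pandas as pd


def tablewise_rolling(df, n, f):
    """Hypothetical helper: call f on each full n-row window (a DataFrame)."""
    results = {}
    for stop in range(n, len(df) + 1):
        window = df.iloc[stop - n:stop]
        results[df.index[stop - 1]] = f(window)  # label by the window's last row
    return results


# Synthetic data (an assumption for this sketch): y = 2 * x + 1 exactly.
x = np.sin(np.arange(20.0))
df = pd.DataFrame({"x": x, "y": 2.0 * x + 1.0})

# Table-wise max: one scalar per window, taken over all columns at once.
table_max = pd.Series(tablewise_rolling(df, 4, lambda w: w.to_numpy().max()))
expected = df.rolling(4).max().max(axis=1).dropna()


def ols(window):
    """Least-squares fit of y ~ intercept + slope * x on one window."""
    X = np.column_stack([np.ones(len(window)), window["x"].to_numpy()])
    coef, *_ = np.linalg.lstsq(X, window["y"].to_numpy(), rcond=None)
    return pd.Series(coef, index=["intercept", "slope"])


# Rolling OLS: each window yields two coefficients, so the result has two columns.
betas = pd.DataFrame(tablewise_rolling(df, 4, ols)).T
```

The OLS case shows why a column-wise apply is not enough: the function needs x and y from the same window, and it returns more than one value per window.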