Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
ENH: Add numba engine for rolling apply #30151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Add numba engine for rolling apply #30151
Changes from 40 commits
3b9bff8
9a302bf
0e9a600
36a77ed
dbb2a9b
f0e9a4d
1250aee
4e7fd1a
cb976cf
45420bb
17851cf
20767ca
9619f8d
66fa69c
b8908ea
135f2ad
34a5687
6da8199
123f77e
54e74d1
04d3530
4bbf587
f849bc7
0c30e48
c4c952e
8645976
987c916
b775684
2e04e60
9b20ff5
0c14033
c7106dc
1640085
2846faf
5a645c0
6bac000
6f1c73f
a890337
0a9071c
9d8d40b
84c3491
a429206
5826ad9
cf7571b
4bc9787
18eed60
f715b55
6a765bf
af3fe50
eb7b5e1
f7dfcf4
a42a960
d019830
29d145f
248149c
a3da51e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you break this into 2 sentences, its hard to grok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is pre-JIT a term? I would say if its just a JITTED function its ok; can you add a numba refernce to what a JIT function is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
show the defaults for these kwargs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"the data set is larger" here is pretty vague. is the perf gain a function of the array size or more about the user-defined function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat both, but more obvious with the data size. I can make this more specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add an X for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok with this being non-private (e.g. generate_numba_apply_func), unless its only used in this module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does
Callable
need to return anndarray
? Adding sub-types toCallable
helps immensely so would be preferable to add if soThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Callable
here should take in an ndarray and potentially*args
and return a scalar. What would be the best way to type that?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm not sure about the *args piece but here are the docs for it:
https://docs.python.org/3/library/typing.html#typing.Callable
So I would think
Callable[[np.ndarray, ...], Scalar]
if it works otherwiseCallable[..., Scalar]
is the best we can do.Scalar can be imported from
pandas._typing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I'll give that a shot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to go with
Callable[..., Scalar]
as also specifyingnp.ndarray
was raising errors.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slightly more explicit would be preferable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add subtypes here? Is this
Dict[Callable, ???]
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just
as you set the defaults below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move make_rolling_apply to module scope (out of this function)?
also don't you need to actually assign to the cache? (on a miss)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cache assignment happens here https://github.com/pandas-dev/pandas/pull/30151/files#diff-0de5c5d9abfcdd141e83701eaaec4358R541 (the function needs to run on some data first)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then i wouldn’t even check the cache here ; that must be at the higher level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay sure thing. I'll move it up then.