REF: de-duplicate code for pad/backfill #40306
Conversation
looks good, +1 on this general idea to unify things.

`def _datetimelike_compat(func: F) -> F:`
can we keep datetimelike_compat and apply it to interpolate_2d (i learned in #40294 that ATM interpolate_2d doesn't need it because the wrapper is used in the functions that this PR removes)
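For context, a minimal sketch of what such a datetimelike-compat wrapper could do (the names and signature here are illustrative assumptions, not pandas' actual implementation): view datetime64/timedelta64 data as int64 before the fill kernel runs, then view the result back.

```python
import functools

import numpy as np


def _datetimelike_compat(func):
    # Hypothetical sketch: datetimelike arrays can't be filled directly,
    # so view them as i8, run the kernel, then view the result back.
    @functools.wraps(func)
    def wrapper(values, mask):
        if values.dtype.kind in "mM":  # timedelta64 / datetime64
            return func(values.view("i8"), mask).view(values.dtype)
        return func(values, mask)

    return wrapper


@_datetimelike_compat
def pad_1d(values, mask):
    # toy forward-fill used only to demonstrate the wrapper;
    # mask is True where values are missing
    out = values.copy()
    for i in range(1, len(out)):
        if mask[i] and not mask[i - 1]:
            out[i] = out[i - 1]
            mask[i] = False
    return out
```

With this, a `datetime64[ns]` array with a `NaT` gap round-trips through the i8 view and comes back with its original dtype.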
can do. although, as stated recently, from my experience experimenting with numba, these wrappers would probably be best placed in _libs adjacent to the cython code, to give a cleaner API and to move the conversions needed for the cython code out of the python code
i'm looking forward to seeing this; increasingly positive on the idea of doing more numba and less cython
slow down. we have not discussed this yet. My primary motivations are no-arch releases, fully typed libraries without stubs, and utilization of more cpu cores and maybe the gpu with standard pandas (AFAIK cudf/rapids plan to rely on WSL2 to support Windows rather than supporting Windows natively).
I've only just started, so I don't know whether this is a realistic goal or whether a more incremental approach of using numba for parallelization when it is installed is more feasible.
i reserve the right to be enthused about this idea
sure. i'm enthusiastic too at this early stage...
- Inline typing compatible with mypy
- Faster local development cycle
- Simpler development environment for new contributors
- No wheel building, maybe faster release cycles
- Maybe can add some basic gpu support in future
- Benefits of numba
- Less complex code
- No need for some prep functions, casting, asi8, ensure_int/float etc
- Style checks apply to library code.
- Let numba handle types and type conversions
- Let numba handle memory allocation and deallocation
for the basic functions i've looked at so far, the performance of numba seems comparable to cython when compared against a pure python implementation (but i've not yet touched anything where we have C classes)
oh, numba performance can often be much better than cython, and usually not worse. The issue we have is that we cannot move many of the core functions, as we do not have a strict dependency on numba at this point. we can certainly revisit things going forward.
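As a rough illustration of the kind of kernel being discussed (a sketch under the assumption of a simple in-place pad signature, not the actual pandas cython implementation; the fallback decorator only lets the example run when numba is not installed):

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # numba is an optional dependency; fall back to pure python
    def njit(func):
        return func


@njit
def pad_inplace(values, mask, limit=-1):
    # forward-fill masked slots in place; limit < 0 means "no limit";
    # the fill count resets at every valid (unmasked) observation
    fill_count = 0
    for i in range(1, len(values)):
        if mask[i]:
            if not mask[i - 1]:
                if limit < 0 or fill_count < limit:
                    values[i] = values[i - 1]
                    mask[i] = False
                    fill_count += 1
        else:
            fill_count = 0
```

The same source compiles under `@njit` or runs as plain python, which is what makes side-by-side numba/cython timing comparisons straightforward.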
@simonjayhawkins does the slowdown come from the cython changes (removing the 1D version) or from other changes?
not profiled yet. (the slowdown is negligible compared to the object conversion that was done up to a few days ago!) I will profile shortly. but if it is due to the cython code, we could consolidate around the 1d versions and use np.apply_along_axis for 2d instead. I consolidated around the 2d version here since I think it can be parallelized with numba, but that shouldn't be a consideration here. there was no performance impact up to the last commit, so I may split those changes into a separate PR to keep things manageable. It could be that some extra checking in interpolate_2d is responsible for the slowdown. I found that calling pd.isna on the mask after the transformations, rather than before, impacts performance.
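The 1d-consolidation alternative mentioned above can be sketched roughly as follows (illustrative only; the function names are made up for the example):

```python
import numpy as np


def pad_1d(values):
    # forward-fill NaNs in a 1-D float array
    out = values.copy()
    for i in range(1, len(out)):
        if np.isnan(out[i]):
            out[i] = out[i - 1]
    return out


def pad_2d(values, axis=1):
    # reuse the 1-D kernel along each row (axis=1) or column (axis=0)
    # instead of maintaining a separate dedicated 2-D kernel
    return np.apply_along_axis(pad_1d, axis, values)
```

The trade-off is that `np.apply_along_axis` incurs per-slice python call overhead, which is why a single 2d kernel can be attractive when it can be parallelized.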
Yes, but the baseline for comparison should probably not be the slow object-conversion way from before, but numpy numeric dtypes (since the goal is to replace e.g. float64 with Float64)
agreed. I don't expect this PR to be merged/acceptable with the performance degradation. when changing to not do the object conversion, we always update the mask in the 1d version, and this will also be needed in the 2d version if masked arrays are to be backed by 2d arrays, unless we make the mask update conditional or change that strategy.
closing for now to clear the queue. will revisit after #38992.
while experimenting with numba, #40530, i'm periodically running the benchmark suite. I don't know how best to do this, as some benchmarks show a significant degradation that is likely due to compilation, since testing the functions in a jupyter notebook tends to show performance similar to Cython. However, the benchmarks that show a performance improvement likely indicate areas where improvements to the cython code could be investigated. This PR is part of that process. Below are the benchmarks that are improved with #40530 (as of yesterday). As I make progress with #40530, this output will change and the priorities will undoubtedly change.
if the general idea of this PR is acceptable, I could try to improve these, but they were already significantly improved in #39953.
with the cython functions reduced from 4 to 1, it is feasible to add 1 dedicated function for the no-limit case (my numba branch is quicker than cython with that change) instead of 4 extra functions.
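The "dedicated no-limit function" idea might look something like the following dispatch (a sketch; the real cython signatures and kernels differ):

```python
import numpy as np


def _pad_nolimit(values, mask):
    # fast path: no fill-count bookkeeping at all
    for i in range(1, len(values)):
        if mask[i] and not mask[i - 1]:
            values[i] = values[i - 1]
            mask[i] = False


def _pad_limited(values, mask, limit):
    # slower path: track how many consecutive slots have been filled
    fill_count = 0
    for i in range(1, len(values)):
        if mask[i] and not mask[i - 1]:
            if fill_count < limit:
                values[i] = values[i - 1]
                mask[i] = False
                fill_count += 1
        elif not mask[i]:
            fill_count = 0


def pad_inplace(values, mask, limit=None):
    # single public entry point dispatching to one of two kernels,
    # so the common no-limit case pays no per-element limit check
    if limit is None:
        _pad_nolimit(values, mask)
    else:
        _pad_limited(values, mask, limit)
```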
this may also pave the way to adding limit_area and limit_direction support to EAs
and maybe also for 2D masked arrays? cc @jbrockmendel
also, with just 1 cython function, I think a cython version of the limit_area logic, removing the python set logic, could be investigated.
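One way the python set logic for limit_area could be replaced with index arithmetic that is straightforward to cythonize (an illustrative sketch, not pandas' current implementation): compute the first and last valid positions, then restrict which missing slots remain eligible for filling.

```python
import numpy as np


def apply_limit_area(mask, limit_area):
    # mask is True where values are missing; return a mask of positions
    # still eligible for filling, using index bounds instead of sets
    is_valid = ~mask
    if not is_valid.any():
        return mask
    first = is_valid.argmax()                       # first valid index
    last = len(mask) - 1 - is_valid[::-1].argmax()  # last valid index
    no_fill = np.zeros(len(mask), dtype=bool)
    if limit_area == "inside":
        # only fill gaps strictly between valid values
        no_fill[:first] = True
        no_fill[last + 1:] = True
    elif limit_area == "outside":
        # only fill leading/trailing gaps
        no_fill[first:last + 1] = True
    return mask & ~no_fill
```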