Re-evaluate the minimum number of elements to use numexpr for elementwise ops #40500
Comments
An example from the
Using the current MIN_ELEMENTS of 1e4:
Changing the MIN_ELEMENTS to 1e5 (which means that in this case, numpy will be used):
So here, the overhead is very clear. This is especially relevant for the ArrayManager, which performs the ops column by column and so pays the overhead again for each column. Using the BlockManager, the above benchmark doesn't change, because it still uses numexpr (the whole block is still above 1e5 elements).
If these results are representative, I'd question the value of using numexpr at all.
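The per-call overhead described above can be illustrated with a standalone timing sketch (hypothetical code, not the benchmark from this thread), comparing a plain numpy elementwise add against numexpr at sizes around the old 1e4 threshold:

```python
import timeit

import numpy as np

# Hypothetical timing sketch: numexpr is an optional dependency,
# so fall back to NaN when it is not installed.
try:
    import numexpr as ne
except ImportError:
    ne = None


def time_add(n: int, repeats: int = 20) -> tuple[float, float]:
    """Return (numpy seconds, numexpr seconds) per call for a + b."""
    a = np.random.rand(n)
    b = np.random.rand(n)
    t_np = timeit.timeit(lambda: a + b, number=repeats) / repeats
    if ne is not None:
        # Pass operands explicitly; numexpr's frame-locals lookup
        # does not see closure variables inside a lambda.
        t_ne = timeit.timeit(
            lambda: ne.evaluate("a + b", local_dict={"a": a, "b": b}),
            number=repeats,
        ) / repeats
    else:
        t_ne = float("nan")
    return t_np, t_ne


for n in (10_000, 100_000, 1_000_000):
    t_np, t_ne = time_add(n)
    print(f"n={n:>9,}: numpy {t_np * 1e6:9.1f} us   numexpr {t_ne * 1e6:9.1f} us")
```

With the ArrayManager, a per-column cost like this is paid once per column, which is why the overhead compounds there.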
Following the same code used to create the plot in the OP:
These make numexpr look better than it did in the OP, though that might just be the log scale.
Thanks for running it as well! Numbers from different environments are useful.
Yeah, indeed, the log scale was mainly to make the difference visible at the smaller sizes (without it, that wouldn't show up). There is indeed still an advantage for numexpr (although the differences I see locally are smaller). But to conclude, I think your numbers support the same conclusion: 1e4 as threshold is too small, and it should be at least 1e5 or even 1e6.
Results from my laptop (Core i7-10850H) are about the same
The original benchmarks for using numexpr were run a number of years ago; it's certainly possible that numpy has improved in the interim, so +1 on raising the min elements.
OK, based on the numbers above, 1e6 seems a safer minimum than 1e5. I updated the PR to reflect that: #40502
Is this good to close @jorisvandenbossche?
The issue in #40502 appears to be in
Yeah, I checked that at the time of doing the PR and thought that 30MB wouldn't be a big deal, but of course it's created and copied multiple times (and also already created during test discovery and kept alive during the full test run), so I underestimated the impact. Next attempt: #40609
Currently we have a MIN_ELEMENTS set at 10,000:
pandas/pandas/core/computation/expressions.py, lines 42 to 43 at 00a6224
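The referenced snippet isn't reproduced here; a paraphrased sketch of how a threshold like this can gate the numexpr path (assumed names, not the exact pandas source):

```python
import numpy as np

# Paraphrased sketch, not the actual pandas code: numexpr is only
# tried once the operands are large enough and have a dtype that
# numexpr can handle; otherwise plain numpy is used.
_MIN_ELEMENTS = 10_000  # the threshold this issue proposes raising

_ALLOWED_DTYPES = {"float64", "float32", "int64", "int32", "bool"}


def can_use_numexpr(a: np.ndarray) -> bool:
    """Rough stand-in for the size/dtype check: above the element
    threshold and a numexpr-compatible dtype."""
    return a.size > _MIN_ELEMENTS and str(a.dtype) in _ALLOWED_DTYPES


print(can_use_numexpr(np.ones(1_000)))    # False: below the threshold
print(can_use_numexpr(np.ones(100_000)))  # True
```

Raising `_MIN_ELEMENTS` simply moves the crossover point at which this check starts returning True.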
However, while running lots of performance comparisons recently, I have been noticing that numexpr still shows some overhead at that array size compared to numpy.
I did a few specific timings for a few ops comparing numpy and numexpr for a set of different array sizes:
Code used to create the plot
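The collapsed code is not included in this excerpt; a minimal hypothetical sketch of a similar comparison at the DataFrame level (not the author's exact script), toggling the real `compute.use_numexpr` option:

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical sketch: time a DataFrame addition with and without
# numexpr by toggling pandas' "compute.use_numexpr" option across
# a range of sizes.


def bench(n_rows: int, repeats: int = 10) -> dict:
    """Return {False: numpy seconds, True: numexpr-path seconds} per call."""
    df = pd.DataFrame(np.random.rand(n_rows, 4))
    timings = {}
    for use_ne in (False, True):
        pd.set_option("compute.use_numexpr", use_ne)
        timings[use_ne] = timeit.timeit(lambda: df + df, number=repeats) / repeats
    pd.set_option("compute.use_numexpr", True)  # restore the default
    return timings


for n in (1_000, 100_000, 1_000_000):
    t = bench(n)
    print(f"rows={n:>9,}: numpy {t[False] * 1e6:9.1f} us   "
          f"numexpr-path {t[True] * 1e6:9.1f} us")
```

Note that if numexpr is not installed, both paths fall back to numpy and the two timings should be roughly equal.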
So in general, numexpr is not that much faster for the large arrays, but it still has a significant overhead compared to numpy up to around 1e5-1e6 elements, while the current minimum number of elements is 1e4.
Further, this might depend on your specific hardware and library versions (this was run on my Linux laptop with 8 cores, using the latest versions of numpy and numexpr), so it is always hard to pick a default suitable for everyone.
But based on the analysis above, I would propose raising the minimum from 1e4 to 1e5 (or maybe even 1e6).