PERF: cythonize vectorized string routines #16542
Comments
You can actually get even better perf by using C functions and maybe even releasing the GIL (though this is a bit trickier code).
xref to #4694
Yeah, looks like the cythonization isn't really what's helping in my example, it's the avoidance of na checks.
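To make the point about NA checks concrete, here is a minimal pure-Python sketch. It is not the pandas implementation; `is_na`, `upper_with_na_checks`, and `upper_no_na_checks` are hypothetical names made up for illustration. The first routine tests every element for a missing value before calling the string method, and the second skips that test, which is only safe when the caller already knows the data has no missing values.

```python
import math


def is_na(value):
    """Hypothetical missing-value check: treats None and float NaN as NA."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def upper_with_na_checks(values):
    """Uppercase each string, testing every element for NA first."""
    return [value if is_na(value) else value.upper() for value in values]


def upper_no_na_checks(values):
    """Uppercase each string with no per-element NA test.

    Only valid when the input is known to contain no missing values.
    """
    return [value.upper() for value in values]


mixed = ["a", "bb", None, "ccc"]
clean = ["a", "bb", "ccc"]

checked = upper_with_na_checks(mixed)   # missing values pass through unchanged
unchecked = upper_no_na_checks(clean)   # same result for NA-free input, less work per element
```

The extra work in the first routine is one function call and up to two tests per element, which is the kind of overhead the comment above refers to.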
Now that users have the option to use the Arrow-backed string dtype if they want better performance, it might not be necessary to keep this issue open?
I agree with Joris, closing as "supported via pyarrow"
Not a night-and-day improvement since all we're doing is removing some Python overhead, but there does seem to be 2x+ performance to be picked up. Possibly could use some of the template machinery to make these easy to write.
I wouldn't consider this high priority given long term plans to replace the string dtype, but could be worth it.