Implement fast Cython Series iterator, for speeding up DataFrame.apply #309
Comments
Hey Wes - any way I can help on this? I just ran into this on my own, then came here and found your open issue. Some code that demonstrates the performance issue:
My use case is similar to the above example, so it'd be great to close the performance gap between |
Low-hanging fruit would be an option in apply that calls |
maybe like
|
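For readers on a later pandas: DataFrame.apply accepts raw=True, which hands each row or column to the function as a plain ndarray instead of wrapping it in a Series. Whether that is the option being proposed in this comment is an assumption. The frame and the row_range helper below are made up for illustration; this is only a rough sketch of the idea.

```python
import numpy as np
import pandas as pd

# Hypothetical data, not taken from the thread.
df = pd.DataFrame(np.arange(12, dtype=float).reshape(4, 3), columns=list("abc"))

def row_range(row):
    # Uses only methods shared by Series and ndarray, so it works in both modes.
    return row.max() - row.min()

# Default: every row is wrapped in a Series before row_range is called.
as_series = df.apply(row_range, axis=1)

# raw=True: every row is passed as a plain ndarray, skipping Series construction.
as_arrays = df.apply(row_range, axis=1, raw=True)

print(as_series.tolist())
print(as_arrays.tolist())
```

Both calls should give the same values; the raw path just avoids building one Series per row, at the cost of losing the index labels inside the function.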
What version of pandas are you using? I fixed a performance problem that was causing np.unique to be very slow.
I was on the latest version from PyPI. I just installed from GitHub source and it looks much better:
Thanks!
OK I made some further tweaks and things so
|
Thanks Wes!
Having tons of calls to Series.__new__ seriously degrades performance because most of the logic isn't necessary. Could play tricks in Cython with the data pointers to avoid this.
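A rough way to see the per-object overhead described here, assuming a current pandas where construction goes through the ordinary Series constructor rather than Series.__new__. The array shape and the two helper functions are made up for illustration, and this does not show the Cython data-pointer trick itself; it only compares building one Series per row against working on the raw ndarray rows.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 1,000 rows of 5 floats.
values = np.random.default_rng(0).standard_normal((1_000, 5))

def via_series(arr):
    # Build a fresh Series for every row, roughly what a row-wise apply has to do.
    return [pd.Series(row).sum() for row in arr]

def via_ndarray(arr):
    # Same reduction on the raw ndarray rows, with no Series construction.
    return [row.sum() for row in arr]

t_series = timeit.timeit(lambda: via_series(values), number=3)
t_ndarray = timeit.timeit(lambda: via_ndarray(values), number=3)

print(f"Series per row:  {t_series:.4f} s")
print(f"ndarray per row: {t_ndarray:.4f} s")
```

Exact timings depend on the machine and pandas version, so none are quoted here; the point is only that the two functions compute the same sums while one of them pays for a constructor call on every row.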