-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
PERF: releasing the GIL, #8882 #10199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
What was the deal with |
apparently doing something
actually assigns a python object (and does inference and such), so that you MUST have the gil,
|
You can reuse NPY_NAN cross-platform, e.g., https://github.com/scikit-image/scikit-image/blob/master/skimage/feature/_texture.pyx#L11 |
@ahojnnes thxs, I did change it. |
73baaa0
to
b69c759
Compare
I updated this. works for all groupbys now. So it gives a general 5-10% speedup on most operations (even single threaded), as there is slightly less generated code (in so we still need to hold the gil for the There IS a soln where I can not use numpy resize at ALL. But its quite a bit more c-code and IMHO not really worth it. |
Cool. I'll try to run a few different groupby operations with dask.dataframe sometime early next week and report back. I'm curious what were the operations that needed to be released? Was there a core set of functionality or was it a large volume of trivial work? |
TL;DR. So lots of 'boilerplate' changes in However, the big speeds occur once I fixed the In effect, you are using a I 'fixed' this issue by adding some additional c-structures to hold data (basically I changed the low-level implementation); these can be easily passed around; downside is that are a little 'messy' in the code department. note to @shoyer I didn't actually bypass the realloc of memory (e.g. |
b6eea4b
to
4e2acd1
Compare
d1a08d7
to
bb78fef
Compare
I think that this is awesome and am afraid of it going stale. Is there something blocking it from being merged into master? |
@mrocklin We were waiting to get the 0.16.2 release out first, which happened last Friday. At this point I don't think there are any blockers. |
nothing blocking. I am going to update and see if I can hit some of #10213. I have to do a bit of odd code-generation because for example you want to simultaneously support |
actually I do need to add a doc-note to the whatsnew (and maybe a small mention in the enhancingperf section as well). |
Hrm, yes, unfortunate that you can't manipulate the lock directly |
Thought I'd caress this PR with a gentle ping. ping |
is that euphemism for harassment? |
Kind and loving harassment, yes. |
any commentary @jorisvandenbossche @shoyer @cpcloud @TomAugspurger going to have a followup for more GIL stuff. Its actually tricky making sure its doing (and you are measure exactly the right stuff) |
|
||
We are releasing the global-interpreter-lock (GIL) on some cython operations. | ||
This will allow other threads to run simultaneously during computation, potentially allowing performance improvements | ||
from multi-threading. Notably ``groupby`` and some indexing operations are a benefit to this. (:issue:`8882`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd say "benefit from this"
# compile time specilization of the fused types | ||
# as the cross-product is generated, but we cannot assign float->int | ||
# the types that don't pass are pruned | ||
if (vector_data is Int64VectorData and sixty_four_bit_scalar is int64_t) or ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cpcloud I got this to work after puzzling on what cython does with multiple fused types. It creates the cross-product, but you generally need only certain specilizations; it does this beaut as a compile-time prune, so no perf hit.
PERF: releasing the GIL, #8882
Great job!! |
Woohoo! Pulls from master, checks benchmarks. |
@mrocklin Can we already look forward to a new awesome blogpost? :-) |
Best I have to offer at the moment are the slowly growing |
I should definitely check it out once I find some time! |
Wow! Great job! 👍🏻 |
- [x] tests added / passed - [x] passes ``git diff upstream/master | flake8 --diff`` Rebased version of #10229 which was [actually not](h ttps://github.com//pull/10229#issuecomment-131470116) fixed by #10199. Nothing particular relevant, just wanted to delete this branch locally and noticed it still applies: you'll judge what to do of it. Author: Pietro Battiston <[email protected]> Closes #13594 from toobaz/fix_checkunique and squashes the following commits: a63bd12 [Pietro Battiston] CLN: Initialization coincides with mapping, hence with uniqueness check
closes #8882
closes #10139
This is now implemented for all groupbys with the factorization part as well (which is actually a big part fo the time)
So seeing a nice curve as far as addtl cores are used (as compared to some of my previous posts).