-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
CLN: separate out groupby algos #15775
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
jreback
commented
Mar 21, 2017
- separate out groupby algorithms to separate lib
- release GIL on median
- release GIL on is_lexsorted / fix memory leak
- release GIL on nancorr
pandas/_libs/algos.pyx
Outdated
Py_ssize_t n, nlevels | ||
int64_t k, cur, pre | ||
ndarray arr | ||
bint result = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not consistent on this, but I think using True
/False
on bint is clearer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can't do that inside a nogil though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to work, although may not in all versions?
%%cython
def f():
cdef bint result = True
with nogil:
result = False
return result
f()
Out[32]: False
cython.__version__
Out[33]: '0.24.1'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes as long as you don't return
it directly. I made the change anyhow
pandas/_libs/algos.pyx
Outdated
n = len(list_of_arrays[0]) | ||
|
||
cdef int64_t **vecs = <int64_t**> malloc(nlevels * sizeof(int64_t*)) | ||
for i from 0 <= i < nlevels: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since moving all this could take the opportunity to update loops using this old syntax to range(...)
, though not a big deal
@@ -0,0 +1,291 @@ | |||
# cython: profile=False | |||
|
|||
from numpy cimport * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably don't need the import *
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure, that's a pretty standard import.
release GIL on is_lexsorted / fix memory leak release GIL on nancorr
Codecov Report
@@ Coverage Diff @@
## master #15775 +/- ##
==========================================
- Coverage 91.01% 91% -0.02%
==========================================
Files 143 143
Lines 49403 49403
==========================================
- Hits 44966 44957 -9
- Misses 4437 4446 +9
Continue to review full report at Codecov.
|
- separate out groupby algorithms to separate lib - release GIL on median - release GIL on is_lexsorted / fix memory leak - release GIL on nancorr Author: Jeff Reback <[email protected]> Closes pandas-dev#15775 from jreback/groupby and squashes the following commits: 4e2bfec [Jeff Reback] release GIL on median release GIL on is_lexsorted / fix memory leak release GIL on nancorr ce28bb5 [Jeff Reback] CLN: separate out groupby algorithms to separate lib