-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Implement numpy_helper functions directly in cython #18059
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cleanup unused bits of src/datetime.pxd
Codecov Report
@@ Coverage Diff @@
## master #18059 +/- ##
==========================================
- Coverage 91.25% 91.23% -0.02%
==========================================
Files 163 163
Lines 50115 50115
==========================================
- Hits 45730 45721 -9
- Misses 4385 4394 +9
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #18059 +/- ##
==========================================
+ Coverage 91.25% 91.26% +<.01%
==========================================
Files 163 163
Lines 50124 50124
==========================================
+ Hits 45742 45745 +3
+ Misses 4382 4379 -3
Continue to review full report at Codecov.
|
Seem reasonable, but let's back that claim up with |
instead of this, is there a reason you simply don't move datetime.pxd into np_datetime.pxd? (certainly can do some of the code cleanup, but after) |
Will do. Even with affinity my asv results probably merit outside confirmation. |
That may be worth doing, but I see it as orthogonal to this. Cutting numpy_helper out of the equation simplifies things quite a bit (among other things I'm thinking about setup.py where 'include' and 'pxd' specifications seem pretty fragile) |
With appropriate skepticism:
Digging out the C code produced by cython for the implementation of
Whereas the version being replaced is:
cython is really verbose, but I pinged their mailing list and was told that the verbosity should be stripped out by the compiler. Same story for |
@chris-b1 can you review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actual changes look fine to me
pandas/_libs/tslibs/np_datetime.pxd
Outdated
# int64_t year | ||
# int32_t month, day, hour, min, sec, us, ps, as | ||
|
||
ctypedef enum NPY_DATETIMEUNIT: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this enum needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's leftover from an earlier revision. Will remove.
# numpy object inspection | ||
|
||
cdef inline npy_datetime get_datetime64_value(object obj) nogil: | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make the docstrings more descriptive? Something like "returns int64 value underlying numpy datetime scalar Python object"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, will change.
rebase as always on re-push |
…libs-npy_dtime5 update docstring per reviewer suggestion
Thoughts on this? I'd like to follow-up by moving a few more pieces of src/datetime.pxd over to np_datetime.pxd |
ping @jreback |
|
||
from tslibs.np_datetime cimport get_timedelta64_value, get_datetime64_value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this now makes lib depend on np_datetime.pxd, which is ok as tseries_depends now lists this.
thanks! you will prob need to rebase your other PR's on this just to make sure. |
cleanup unused bits of src/datetime.pxd
TL;DR this PR implements
get_datetime64_value
,get_timedelta64_value
, andget_datetime64_unit
directly in cython instead of getting them via C. There is no performance penalty. Weaning off of the src directory will make the build less of a hassle.AFAICT the main reason why
get_datetime64_unit
,get_datetime64_value
, andget_timedelta64_value
(as well asis_float_object
, is_datetime64_object`...) are implemented in numpy_helper.h and np_datetime.c is because cython/numpy does not make it easy to cimport the relevant symbols: https://github.com/pandas-dev/pandas/blob/master/pandas/_libs/src/numpy.pxd#L481After some wrangling and stackoverflowing, I found a way to implement these directly in cython. Importantly, the C code generated is functionally identical to the existing versions, so there shouldn't be a performance hit.
A big part of the diff comes from moving some declarations from np_datetime's pyx file to the pxd file.