-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Replaced void pointers with Types in JSON Datetime Conversions #30283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
} | ||
|
||
static void *PandasDateTimeStructToJSON(npy_datetimestruct *dts, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The diff is a little difficult to read here, but this got rid of the ToJSON
functions and instead replaces them with explicit ToIso
or ToEpoch
functions with strong types
*len = (size_t)get_datetime_iso_8601_strlen(0, base); | ||
char *result = PyObject_Malloc(*len); | ||
|
||
if (result == NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how would we get here? if len == 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would happen if malloc fails
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So failing to address it would mean getting something like a segfault? the PyErr_NoMemory just below raises back in py-space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would segfault if you tried to access the memory if malloc failed to allocate it. PyErr_NoMemory
would just set the global error indicator so when the extension exits that should raise back in the Python space (assuming nothing else clears it out)
FWIW the error handling here is copy / paste from here:
PyErr_NoMemory(); |
There are similar checks elsewhere throughout the module (not always consistent) so maybe worth unifying in a follow up
size_t *_outLen) { | ||
return PyUnicode_AsUTF8AndSize(_obj, _outLen); | ||
static char *PyUnicodeToUTF8(JSOBJ _obj, JSONTypeContext *tc, size_t *_outLen) { | ||
return (char *)PyUnicode_AsUTF8AndSize(_obj, (Py_ssize_t *)_outLen); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyUnicode_AsUTF8AndSize came up in the thread about unicode surrogates. do we need to worry about those here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WillAyd i think this is the same function where passing a surrogate unicode character can cause a segfault. can that be ruled out here?
With usual caveat about my C-fu being weak, LGTM. cc @jreback |
thanks @WillAyd looks more clear |
This also incorporates the clean up from #30271
Benchmarks show the following benefit, though most likely just noise:
One follow up to this can drastically speed up datetime index handling - right now these are done in the Python space but can reuse these same functions