-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Reduce Loosely Typed Functions in objToJSON #29954
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
FWIW the reduction of anonymous functions here gives a little boost to performance before after ratio
[6ffbc4ef] [86a575f2]
<master> <consistent-json>
- 140±3ms 121±3ms 0.87 io.json.ToJSONLines.time_delta_int_tstamp_lines
- 90.3±2ms 73.1±3ms 0.81 io.json.ToJSON.time_to_json('split', 'df_td_int_ts')
- 80.9±2ms 63.6±2ms 0.79 io.json.ToJSON.time_to_json('values', 'df_td_int_ts')
- 105±2ms 82.0±2ms 0.78 io.json.ToJSON.time_to_json('records', 'df_td_int_ts')
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED. |
tentatively looks nice; will take a look in the AM |
@@ -571,7 +537,6 @@ static int NpyTypeToJSONType(PyObject *obj, JSONTypeContext *tc, int npyType, | |||
return JT_NULL; | |||
} | |||
GET_TC(tc)->doubleValue = (double)doubleVal; | |||
GET_TC(tc)->PyTypeToJSON = CDouble; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make sure i understatnd, Foo._PyTypeToJSON are the callbacks you're referring to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep that's right - anything getting assigned to PyTypeToJSON
(side note - not technically callbacks but void pointer functions; edited appropriately)
In the end we might be able to get rid of the custom PyEncoder
altogether and just use more built-in ujson functionality by assigning the native types and returning JT_<TYPE>
where appropriate
GET_TC(tc)->longValue = (JSINT64)longVal; | ||
GET_TC(tc)->PyTypeToJSON = NpyDatetime64ToJSON; | ||
return ((PyObjectEncoder *)tc->encoder)->datetimeIso ? JT_UTF8 | ||
: JT_LONG; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im not so familiar with this syntax. is the ?:; here equivalent to the if/else you've changed this to?
There is still GET_TC(tc)->PyTypeToJSON = NpyDatetime64ToJSON
on 559. You're not getting rid of this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is the ternary operator, so basically <EXPRESSION> ? <VALUE_IF_TRUE> : <VALUE_IF_FALSE>
This is used elsewhere in the module; just copying over
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think i get it, thanks
pending green (Azure fail looks unrelated) LGTM, but someone with stronger C-fu should take a look. @chris-b1 ? |
lgtm. will wait for @chris-b1 look. |
Merging to continue on this one. Happy to address any comments that come up later |
Part of my perpetual goal to simplify this. Right now we override the built-in capabilities of the JSON encoder and use multiple layers of loosely typed functions. I think this is pretty confusing, so I am trying to reduce the callbacks and set native values directly during iteration
This isn't a full solve - rather just numeric values. I plan on tackling string representations in a follow up; just stopping here because the code movement gets large fast.
In addition to less code, note that this also reduces the amount of compiler warnings from 68 down to 60 when compiling with
-Wextra
master:
This PR