-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
REGR: to_dict(orient={"records", "split", "dict"}) slowdown #42486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pls this is a very hot path and don't want to affect other things |
@@ -196,6 +193,8 @@ def maybe_box_native(value: Scalar) -> Scalar: | |||
value = int(value) # type: ignore[arg-type] | |||
elif is_bool(value): | |||
value = bool(value) | |||
elif isinstance(value, (np.datetime64, np.timedelta64)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT this condition should be equivalent to previous. There was this comment and followup https://github.com/pandas-dev/pandas/pull/37648/files#r576461677, but the tests seem to contradict by mapping datetime -> datetime (behavior which isn't changed here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont you need datetime
, timedelta
here as well (e.g. python scalar types)? or i suppose these are already boxed right so we don't need them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep exactly, these are already boxed correctly (and we look to have some test coverage for this, for example in pandas.tests.frame.methods.test_to_dict::TestDataFrameToDict::test_to_dict_tz
)
EDIT: The function modified is
|
no run all the asvs or at the very least casting and construction ones |
Sorry, please see comment above, just edited |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kk perf looks fine (i see we dont' use this routine except where you are testing)
@@ -196,6 +193,8 @@ def maybe_box_native(value: Scalar) -> Scalar: | |||
value = int(value) # type: ignore[arg-type] | |||
elif is_bool(value): | |||
value = bool(value) | |||
elif isinstance(value, (np.datetime64, np.timedelta64)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont you need datetime
, timedelta
here as well (e.g. python scalar types)? or i suppose these are already boxed right so we don't need them?
thanks @mzeitlin11 |
…", "dict"}) slowdown
@meeseeksdev backport 1.3.x |
Something went wrong ... Please have a look at my logs. |
…) slowdown (#42502) Co-authored-by: Matthew Zeitlin <[email protected]>
Regression was due to each scalar value hitting
is_datetime_or_timedelta_dtype
->_is_dtype_type
where it falls most of the way through.