-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
CLN: avoid values_from_object in reshape.merge #32537
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: avoid values_from_object in reshape.merge #32537
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check perf?
pandas/core/reshape/merge.py
Outdated
and is_extension_array_dtype(rk.dtype) | ||
and lk.dtype == rk.dtype | ||
): | ||
elif is_extension_array_dtype(lk.dtype) and lk.dtype == rk.dtype: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this guaranteed to be equivalent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure. How could you have this condition but not is_extension_array_dtype(rk.dtype)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should change to lk.is_dtype_equal(rk) (technically the lk.dtype == rk.dtype is unsafe and we should never use this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will update
lk = getattr(lk, "_values", lk)._data | ||
rk = getattr(rk, "_values", rk)._data | ||
# Extract the ndarray (UTC-localized) values | ||
# Note: we dont need the dtypes to match, as these can still be compared |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does happen somewhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, in the EA-dtype check below we require matching dtypes
…-values_from_object-merge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am thinking this actually enables some cases? though we do have pretty comprehensive testing here.
comment
pandas/core/reshape/merge.py
Outdated
and is_extension_array_dtype(rk.dtype) | ||
and lk.dtype == rk.dtype | ||
): | ||
elif is_extension_array_dtype(lk.dtype) and lk.dtype == rk.dtype: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should change to lk.is_dtype_equal(rk) (technically the lk.dtype == rk.dtype is unsafe and we should never use this)
not sure i follow. can you give an example? |
…-values_from_object-merge
are there any merge cases that work that didn't previously? |
there are some dtype combinations that arent tested that looked sketchy. there's a TODO comment about adding tests for them |
kk |
lgtm. merge o ngreen. |
hmm, something broke |
…-values_from_object-merge
updated+green |
Thanks @jbrockmendel |
No description provided.