REGR: Fix conversion of mixed dtype DataFrame to numpy str #35473
pandas/core/internals/managers.py
Outdated
@@ -847,7 +847,7 @@ def _interleave(self, dtype=None, na_value=lib.no_default) -> np.ndarray:
         # Give EAs some input on what happens here. Sparse needs this.
         if isinstance(dtype, SparseDtype):
             dtype = dtype.subtype
-        elif is_extension_array_dtype(dtype):
+        elif is_extension_array_dtype(dtype) or is_dtype_equal(dtype, str):
Are you sure this branch is actually hit?
Yes. This avoids initializing result with np.empty(dtype=str), which creates an array with dtype "<U1" and then breaks things.
Before the change that caused this regression, the dtype was always set to the dtype inferred by _interleaved_dtype (object in this case), so here I'm making sure that still happens.
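To illustrate the "<U1" problem outside pandas, here is a minimal NumPy-only sketch. The variable names and sample values are made up for illustration and this is not the code in managers.py; it only shows why allocating with dtype=str truncates values, while allocating as object and casting afterwards does not.

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration.
values = ["apple", "banana"]

# dtype=str carries no length, so np.empty allocates one-character strings.
truncated = np.empty(2, dtype=str)
truncated[:] = values  # silently cut down to the first character

# Allocating as object keeps every value whole; casting to str afterwards
# lets NumPy pick a string width that fits the longest element.
intact = np.empty(2, dtype=object)
intact[:] = values
as_str = intact.astype(str)

print(truncated.dtype, truncated.tolist())
print(as_str.dtype, as_str.tolist())
```

Falling back to object for the intermediate buffer is the behaviour described above as the pre-regression default.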
Make this a separate elif; the way it is written here is very confusing.
Looks OK. Please merge master and add a comment, then ping on green.
lgtm. ping on green.
@jreback Green, thanks for reviewing
thanks!
…aFrame to numpy str
…numpy str (#35617) Co-authored-by: Daniel Saxton <[email protected]>
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff