REGR: Fix conversion of mixed dtype DataFrame to numpy str #35473

dsaxton · 2020-07-30T01:20:16Z

closes REGR: DataFrame.to_numpy(dtype=str) raises RuntimeError in pandas 1.1.0 #35455
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

pandas/core/frame.py

jreback · 2020-07-30T01:38:24Z

pandas/core/internals/managers.py

@@ -847,7 +847,7 @@ def _interleave(self, dtype=None, na_value=lib.no_default) -> np.ndarray:
        # Give EAs some input on what happens here. Sparse needs this.
        if isinstance(dtype, SparseDtype):
            dtype = dtype.subtype
-        elif is_extension_array_dtype(dtype):
+        elif is_extension_array_dtype(dtype) or is_dtype_equal(dtype, str):


are you sure this is actually hit?

Yes. This avoids initializing result with np.empty(shape, dtype=str), which was creating an array with dtype "<U1" and leading to the RuntimeError reported in #35455.

Before the change that caused this regression, the dtype was always set to the inferred dtype from _interleaved_dtype (object in this case), so here I'm making sure that this still happens.
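For readers following along, here is a minimal NumPy-only sketch of the behavior described in this reply. It does not reproduce the pandas internals; the variable names and sample values are made up for illustration. It shows that a buffer allocated with dtype=str holds one character per element, while an object buffer keeps whole values that can be converted to str afterwards.

```python
import numpy as np

# Allocating with dtype=str gives a fixed-width unicode buffer of width 1.
str_buf = np.empty((1, 2), dtype=str)
str_buf[0, 0] = "2020-01-01 00:00:00"  # silently truncated to fit one character
truncated = str(str_buf[0, 0])

# Allocating as object keeps the full Python values (illustrative data only).
obj_buf = np.empty((1, 2), dtype=object)
obj_buf[0, 0] = "2020-01-01 00:00:00"
obj_buf[0, 1] = 100.0

# Converting the filled object buffer to str sizes the width from the contents.
as_str = obj_buf.astype(str)

print(str_buf.dtype, repr(truncated))
print(as_str.dtype, as_str.tolist())
```

The point of the sketch is only that the allocation dtype matters: filling an object buffer first and casting to str at the end avoids the fixed one-character width.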

make this a separate elif; the way it is written here is very confusing

jreback

looks ok, pls merge master and add a comment, ping on green.

jreback · 2020-08-06T23:21:40Z

pandas/core/internals/managers.py

make this a separate elif; the way it is written here is very confusing
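A rough sketch of what a separate branch for str could look like, written as a standalone function so it runs without pandas. The helper name resolve_buffer_dtype and its inferred parameter are hypothetical, it handles only NumPy dtypes, and it leaves out the SparseDtype and extension-array branches from the hunk under review, so it is not the code in managers.py.

```python
import numpy as np


def resolve_buffer_dtype(dtype, inferred=np.dtype(object)):
    """Hypothetical helper: pick the dtype used to allocate the result buffer.

    `inferred` stands in for whatever dtype would be inferred from the blocks.
    """
    if dtype is None:
        # Nothing requested: use the inferred dtype.
        return inferred
    elif np.dtype(dtype) == np.dtype(str):
        # Own branch for str: allocate as object so values are not squeezed
        # into a "<U1" buffer; the caller can cast to str afterwards.
        return np.dtype(object)
    # Any other NumPy dtype is used as requested.
    return np.dtype(dtype)


print(resolve_buffer_dtype(str))
print(resolve_buffer_dtype("float64"))
```

Keeping str in its own branch lets the comment sit next to the one case it explains, which is the readability concern raised in this review.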

jreback · 2020-08-07T22:18:03Z

lgtm. ping on green.

dsaxton · 2020-08-07T23:16:15Z

> lgtm. ping on green.

@jreback Green, thanks for reviewing

jreback · 2020-08-07T23:17:24Z

thanks!

dsaxton added 2 commits July 29, 2020 19:55

Handle str better

661c308

Doc and test

73db693

jreback requested changes Jul 30, 2020

simonjayhawkins added this to the 1.1.1 milestone Jul 30, 2020

simonjayhawkins added the Compat (pandas objects compatibility with NumPy or Python functions) and Dtype Conversions (unexpected or buggy dtype conversions) labels Jul 30, 2020

jreback requested changes Aug 6, 2020

dsaxton added 3 commits August 7, 2020 16:44

Make an elif

6f45bca

Merge remote-tracking branch 'upstream/master' into to_numpy-regr

726dc83

Add back import

61beac7

jreback approved these changes Aug 7, 2020

jreback merged commit 47c17cb into pandas-dev:master Aug 7, 2020

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Aug 7, 2020

Backport PR pandas-dev#35473: REGR: Fix conversion of mixed dtype DataFrame to numpy str

b58116a

meeseeksmachine mentioned this pull request Aug 7, 2020

Backport PR #35473 on branch 1.1.x (REGR: Fix conversion of mixed dtype DataFrame to numpy str) #35617

Merged

dsaxton deleted the to_numpy-regr branch August 8, 2020 01:14

simonjayhawkins mentioned this pull request Aug 8, 2020

REGR: DataFrame.to_numpy(dtype=str) raises RuntimeError in pandas 1.1.0 #35455

Closed

1 task

simonjayhawkins pushed a commit that referenced this pull request Aug 8, 2020

Backport PR #35473: REGR: Fix conversion of mixed dtype DataFrame to numpy str (#35617)

c5ded4d

Co-authored-by: Daniel Saxton <[email protected]>
