BUG: Fixed PandasArray.__setitem__ with str #28119

TomAugspurger · 2019-08-23T19:08:53Z

jbrockmendel · 2019-08-23T19:16:20Z

LGTM

That said, I'm just noticing that below the changed code PandasArray can change its dtype and pin a new underlying ndarray. Does that seem sketchy to anyone else?

TomAugspurger · 2019-08-23T19:45:30Z

> PandasArray can change its dtype and pin a new underlying ndarray.

I'm not too bothered by the new underlying ndarray part, since it's private. What part bothers you?

I do notice that the .astype call should probably pass copy=False.

jbrockmendel · 2019-08-23T20:23:58Z

> I'm not too bothered by the new underlying ndarray part, since it's private. What part bothers you?

It's liable to cause surprises with view-like semantics, e.g. I'd expect parr[:] or np.asarray(parr) to stay in sync with parr. (This discussion probably belongs in a separate issue.)
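To make the concern concrete, here is a minimal sketch using plain NumPy. `RebindingWrapper` is a hypothetical stand-in written for illustration, not pandas' actual PandasArray implementation; it only assumes the behavior described above (falling back to a new buffer when an in-place write fails).

```python
import numpy as np


class RebindingWrapper:
    """Hypothetical wrapper for illustration; not pandas' PandasArray."""

    def __init__(self, values):
        self._ndarray = values

    def __array__(self, dtype=None, copy=None):
        # Hand out the current underlying buffer (no copy).
        return self._ndarray

    def __setitem__(self, key, value):
        try:
            self._ndarray[key] = value
        except (ValueError, TypeError):
            # Assumed fallback: cast to object and pin a *new* buffer.
            new = self._ndarray.astype(object)
            new[key] = value
            self._ndarray = new


wrapper = RebindingWrapper(np.array([1, 2, 3]))
view = np.asarray(wrapper)  # shares memory with the wrapper's buffer

wrapper[0] = 10  # in-place write, so the view sees it
in_sync_after_int = bool(view[0] == 10)

wrapper[1] = "a"  # cannot be stored in an integer buffer, so a new one is pinned
in_sync_after_str = np.shares_memory(view, np.asarray(wrapper))
```

After the string assignment, `view` still points at the old integer buffer, so later writes through the wrapper are no longer visible through it.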

TomAugspurger · 2019-08-26T14:01:29Z

Agreed that this can go in a followup :)

WillAyd · 2019-08-27T22:02:29Z

pandas/core/arrays/numpy_.py

+            if is_object_dtype(self.dtype._dtype):
+                t = np.dtype(object)
+            else:
+                t = self.dtype._dtype


Do we have a test that hits this branch?

test_setitem_object_typecode[None] hits it (setting a string into an integer array).

Simpler to leave the original code and just convert np.str to np.object (which is what we do inside the blocks manager and other places); maybe have a function to do this rather than rewriting the logic all over the place.

I don't think that's appropriate for PandasArray. The idea is to take an arbitrary numpy array and box it in an extension array.

and that’s exactly what is done in ObjectBlock now

pls refactor rather than adding logic

I doubt it. I think I was mimicking the behavior of Series.__setitem__

In [4]: x = np.array([1, 2, 3])
In [5]: s = pd.Series(x)
In [6]: s.values is x
Out[6]: True
In [7]: s[0] = 'a'
In [8]: s.values is x
Out[8]: False

But I'm happy to be stricter here.

That said, we'll also inherit things like

In [11]: x = np.array([1, 2, 3])
In [12]: x[0] = 5.5
In [13]: x
Out[13]: array([5, 2, 3])

But maybe that's OK, if the intent is to be close to NumPy here.

To what extent can we punt on the float treatment for now? I think there's a case to be made that we should raise instead of casting there, but don't want to bog this down any more.

Hmm, I think our options are to always raise when the dtypes don't match, or to adopt NumPy's behavior. I don't think I have a preference.

The thought that pushes me towards raising is that if/when this is backing a Block, we want Block.setitem to try to set it on block.values and then fall back to casting.
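The two options in this thread can be sketched side by side with plain NumPy. Both helper functions below are hypothetical names introduced for illustration and are not part of pandas; the strict variant assumes "dtypes don't match" means "the value's dtype cannot be cast safely to the array's dtype", which is only one possible reading.

```python
import numpy as np


def setitem_numpy_style(arr, key, value):
    """Defer to NumPy: lossy casts such as float -> int are applied silently."""
    arr[key] = value


def setitem_strict(arr, key, value):
    """Hypothetical stricter variant: raise unless the value casts safely."""
    value_dtype = np.asarray(value).dtype
    if not np.can_cast(value_dtype, arr.dtype, casting="safe"):
        raise TypeError(
            f"cannot safely set {value_dtype} value into {arr.dtype} array"
        )
    arr[key] = value


x = np.array([1, 2, 3])
setitem_numpy_style(x, 0, 5.5)  # 5.5 is truncated to 5

y = np.array([1, 2, 3])
try:
    setitem_strict(y, 0, 5.5)
    raised = False
except TypeError:
    raised = True  # y is left unchanged
```

A dtype-level check like this is deliberately coarse: it would also reject a value such as 5.0, which happens to be representable as an integer. A caller such as Block.setitem could catch the error and fall back to casting, as suggested above.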

TomAugspurger · 2019-08-30T19:45:28Z

Merging in a few hours.

pandas/tests/arrays/test_numpy.py

doc/source/whatsnew/v1.0.0.rst

jreback

lgtm. minor whatsnew comments, merge on green.

TomAugspurger · 2019-09-17T20:21:18Z

Fixed the whatsnew. Merging.

* BUG: Fixed PandasArray.__setitem__ with str Closes pandas-dev#28118

TomAugspurger added this to the 1.0 milestone Aug 23, 2019

BUG: Fixed PandasArray.__setitem__ with str

f0718fa

Closes pandas-dev#28118

simonjayhawkins added the ExtensionArray and Bug labels Aug 23, 2019

jbrockmendel mentioned this pull request Aug 26, 2019

API/BUG: PandasArray __setitem__ can change underlying buffer #28150

Closed

WillAyd reviewed Aug 27, 2019

TomAugspurger added 2 commits September 9, 2019 15:32

no casting

75c2c58

Merge remote-tracking branch 'upstream/master' into PandasArray-setitem-object

f482d99

jreback reviewed Sep 10, 2019

pandas/tests/arrays/test_numpy.py

jreback reviewed Sep 10, 2019

doc/source/whatsnew/v1.0.0.rst (outdated)

jreback reviewed Sep 10, 2019

doc/source/whatsnew/v1.0.0.rst (outdated)

jreback approved these changes Sep 10, 2019

TomAugspurger added 3 commits September 10, 2019 15:17

Merge remote-tracking branch 'upstream/master' into PandasArray-setitem-object

6b8dfe2

whatsnews

1828890

Merge remote-tracking branch 'upstream/master' into PandasArray-setitem-object

9d77af5

TomAugspurger merged commit 5a227a4 into pandas-dev:master Sep 17, 2019

TomAugspurger deleted the PandasArray-setitem-object branch September 17, 2019 20:21

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019

BUG: Fixed PandasArray.__setitem__ with str (pandas-dev#28119)

caea50e

* BUG: Fixed PandasArray.__setitem__ with str Closes pandas-dev#28118

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019

BUG: Fixed PandasArray.__setitem__ with str (pandas-dev#28119)

4e94286

* BUG: Fixed PandasArray.__setitem__ with str Closes pandas-dev#28118

WillAyd Aug 27, 2019

TomAugspurger Aug 30, 2019

jreback Aug 30, 2019

TomAugspurger Aug 30, 2019

jreback Aug 30, 2019

TomAugspurger Sep 9, 2019

TomAugspurger Sep 9, 2019

jbrockmendel Sep 9, 2019

TomAugspurger Sep 10, 2019 (edited)

jbrockmendel Sep 10, 2019
