Skip to content

REGR: Fix inplace updates on column to set correct values #35936

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 31, 2020

Conversation

jorisvandenbossche
Copy link
Member

Closes #35731

@jorisvandenbossche jorisvandenbossche added the Regression Functionality that used to work in a prior pandas version label Aug 27, 2020
@jorisvandenbossche jorisvandenbossche added this to the 1.1.2 milestone Aug 27, 2020
@jbrockmendel
Copy link
Member

LGTM pending green

I'd also be OK with chasing the unwrapping up the call chain to unwrap earlier

@@ -1027,6 +1027,8 @@ def iset(self, loc: Union[int, slice, np.ndarray], value):
Set new item in-place. Does not consolidate. Adds new Block if not
contained in the current set of items
"""
if isinstance(value, ABCSeries):
value = value._values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extract_array to handle Index too?

@@ -1027,6 +1027,8 @@ def iset(self, loc: Union[int, slice, np.ndarray], value):
Set new item in-place. Does not consolidate. Adds new Block if not
contained in the current set of items
"""
if isinstance(value, ABCSeries):
value = value._values
Copy link
Member Author

@jorisvandenbossche jorisvandenbossche Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel I suppose ideally we ensure that iset always already gets the correct values, so never a Series. And so I could also unpack (or maybe use extract_array?) where iset is called in this case:

pandas/pandas/core/generic.py

Lines 3273 to 3278 in a1f6056

def _maybe_cache_changed(self, item, value) -> None:
"""
The object has called back to us saying maybe it has changed.
"""
loc = self._info_axis.get_loc(item)
self._mgr.iset(loc, value)

But: 1) I am not fully sure in which case this _maybe_cache_changed is used, and so value can also be something else as a Series (so DataFrame)? and 2) iset is being used in other places as well, where the same issue might occur

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re 2: I think iset is only called from 2 places in NDFrame. shouldnt be that hard to check unwrapping there, annotate iset

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, iset itself is indeed only additionally used in _iset_item, but that is then used in _set_item and in indexing code, ... (and checking in the indexing code what values can be passed is already a bit less trivial ;-))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I might do a follow-up to try to push the check earlier

def test_fillna_fill_other(self, data):
# inplace update doesn't work correctly with patched extension arrays
# extract_array returns PandasArray, while dtype is a numpy dtype
super().test_fillna_fill_other(data_missing)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit strange, but because I switched if isinstance(.., Series): value = value._value to extract_array, this test started to fail. That is because for test_numpy.py we patch the PandasArray to not be special cased, and then extract_array returns a PandasArray, even for a numpy dtype. And thus, in iset we set a PandasArray into a consolidated numpy block ..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yah im not a fan of this patching, OK with xfailing

@jreback jreback added Categorical Categorical Data Type Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Aug 31, 2020
@jreback jreback merged commit b45327f into pandas-dev:master Aug 31, 2020
@jreback
Copy link
Contributor

jreback commented Aug 31, 2020

thanks @jorisvandenbossche

@jorisvandenbossche jorisvandenbossche deleted the regr-inplace-update branch August 31, 2020 12:39
@simonjayhawkins
Copy link
Member

@meeseeksdev backport 1.1.x

@lumberbot-app
Copy link

lumberbot-app bot commented Aug 31, 2020

Owee, I'm MrMeeseeks, Look at me.

There seem to be a conflict, please backport manually. Here are approximate instructions:

  1. Checkout backport branch and update it.
$ git checkout 1.1.x
$ git pull
  1. Cherry pick the first parent branch of the this PR on top of the older branch:
$ git cherry-pick -m1 b45327f5fd262e835a1007587c2882c3d77b3158
  1. You will likely have some merge/cherry-pick conflict here, fix them and commit:
$ git commit -am 'Backport PR #35936: REGR: Fix inplace updates on column to set correct values'
  1. Push to a named branch :
git push YOURFORK 1.1.x:auto-backport-of-pr-35936-on-1.1.x
  1. Create a PR against branch 1.1.x, I would have named this PR:

"Backport PR #35936 on branch 1.1.x"

And apply the correct labels and milestones.

Congratulation you did some good work ! Hopefully your backport PR will be tested by the continuous integration and merged soon!

If these instruction are inaccurate, feel free to suggest an improvement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Categorical Categorical Data Type Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Regression Functionality that used to work in a prior pandas version
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Series has no attribute "reshape" after adding a new category in df
4 participants