
BUG: Augmented arithmetic assignments will modify original series starting version 1.1.4 #38519


Closed
2 of 3 tasks
nikita-ivanov opened this issue Dec 16, 2020 · 18 comments · Fixed by #50981

Comments

nikita-ivanov commented Dec 16, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

import pandas as pd

s = pd.Series([1, 2, 3])
s1 = s.iloc[1:]
s1 -= 4
s

# output
# 0    1
# 1   -2
# 2   -1
# dtype: int64

s = pd.Series([1, 2, 3])
s1 = s.iloc[1:]
s1 = s1 - 4
s

# output
# 0    1
# 1    2
# 2    3
# dtype: int64

Problem description

Starting with version 1.1.4 when applying either of +=, -=, *= or /= operators on a series s1 that was sliced from original series s, the change propagates to original series s. When using normal assignment operators s1 = s1 - 4 the problem is not present, which leads to inconsistent behavior.

Output of pd.show_versions()

INSTALLED VERSIONS

commit : 67a3d42
python : 3.7.9.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.18362
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 1.1.4
numpy : 1.17.2
pytz : 2020.4
dateutil : 2.8.0
pip : 20.3.3
setuptools : 51.0.0.post20201207
Cython : 0.28.5
pytest : 3.8.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.19.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.3
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.5.4
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

nikita-ivanov added the Bug and Needs Triage labels on Dec 16, 2020
nikita-ivanov changed the title from "BUG: Subtract AND and Add AND operators will modify original series starting version 1.1.4" to "BUG: Augmented arithmetic assignments will modify original series starting version 1.1.4" on Dec 16, 2020
simonjayhawkins added a commit to simonjayhawkins/pandas that referenced this issue Dec 16, 2020
@simonjayhawkins
Member

Thanks @nikita-ivanov for the report.

Starting with version 1.1.4 when applying either of +=, -=, *= or /= operators on a series s1 that was sliced from original series s, the change propagates to original series s.

This was changed in [d8c8cbb] REGR: inplace Series op not actually operating inplace (#37508) to fix a regression for DataFrame behaviour.

It looks like it also corrected a long-standing bug/inconsistency in the Series behaviour.

cc @jbrockmendel

simonjayhawkins added the Closing Candidate and Copy / view semantics labels and removed the Bug and Needs Triage labels on Dec 16, 2020
@nikita-ivanov
Author

@simonjayhawkins thanks for the update. Do I understand correctly that the future behavior of s1 = s1 - 4 would be the same as it is now for -= (and similar operations)? More precisely,

s = pd.Series([1, 2, 3])
s1 = s.iloc[1:]
s1 -= 4
s

# output
# 0    1
# 1   -2
# 2   -1
# dtype: int64

s = pd.Series([1, 2, 3])
s1 = s.iloc[1:]
s1 = s1 - 4
s

# output
# 0    1
# 1   -2
# 2   -1
# dtype: int64

@simonjayhawkins
Member

For the s1 = s1 - 4 case, the expected output of s remains

# 0    1
# 1    2
# 2    3
# dtype: int64

When using normal assignment operators s1 = s1 - 4 the problem is not present, which leads to inconsistent behavior.

In Python, a = a+1 is not always equivalent to a+=1.

Under the hood, Python calls 'special methods' for the various syntactical constructs, in this case __add__ and __iadd__.

If the object, in this case a, implements __iadd__, that method will be called. In the case of mutable sequences, a will change in place. However, when a does not implement __iadd__, the expression a+=1 has the same effect as a = a+1: the expression a+1 is evaluated first, producing a new object, which is then bound to a. In other words, the object bound to a may or may not change, depending on whether a implements __iadd__.

In general, for mutable sequences, it is common for __iadd__ to be implemented and that += happens inplace. For immutable sequences, clearly there is no way for that to happen.

A pandas Series is a mutable sequence that implements __iadd__
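
As a minimal illustration of the point above, here is a sketch that uses only built-in types (a list and a tuple, not pandas). The variable names are made up for the example; it only shows how += dispatches to __iadd__ when it exists and falls back to __add__ plus rebinding when it does not.

```python
# list implements __iadd__, so += mutates the existing object in place.
a = [1, 2]
a_alias = a            # second name bound to the same list
a += [3]
print(a_alias)         # [1, 2, 3]  (the alias sees the change)
print(a_alias is a)    # True

# tuple has no __iadd__, so += falls back to __add__ and rebinds the name.
t = (1, 2)
t_alias = t
t += (3,)
print(t_alias)         # (1, 2)     (the original object is untouched)
print(t_alias is t)    # False

# a = a + ... always builds a new object, even for a mutable list.
b = [1, 2]
b_alias = b
b = b + [3]
print(b_alias)         # [1, 2]

print(hasattr(list, "__iadd__"), hasattr(tuple, "__iadd__"))  # True False
```

The list case mirrors what is described for s1 -= 4 in this thread: a second reference to the same underlying data observes the in-place change.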

@nikita-ivanov
Author

@simonjayhawkins thanks for the clarifications. My concern is that prior to 1.1.4 the behavior was the same for both cases, and I did not find any mention of the change in the release notes. This change can cause bugs (as it did for me). I shall switch to s1 = s1 - 4, but it might be worthwhile to mention the new behavior in the release notes.
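
For readers who need the original Series left untouched, the workaround mentioned above can be sketched as follows. The second option, calling .copy() on the slice before using an in-place operator, is not proposed in this thread; it is shown here as an assumption about what a caller might prefer, and s2 is a name introduced only for the example.

```python
import pandas as pd

s = pd.Series([1, 2, 3])

# Option 1: rebind the name. `s1 - 4` builds a new Series, so `s` is untouched.
s1 = s.iloc[1:]
s1 = s1 - 4

# Option 2 (assumption, not from the thread): take an explicit copy first,
# after which the in-place operator only affects the copy.
s2 = s.iloc[1:].copy()
s2 -= 4

print(s.tolist())   # [1, 2, 3]
print(s1.tolist())  # [-2, -1]
print(s2.tolist())  # [-2, -1]
```

Both options avoid depending on whether the slice shares data with the parent, so they should behave the same before and after 1.1.4.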

@simonjayhawkins
Member

@nikita-ivanov I agree that a long standing behaviour has been changed without adequate notice.

We will await comment from @jbrockmendel for the way forward here.

I am reluctant to add a regression tag here as I think the new behaviour is more correct and we are not currently planning any more releases in the 1.1.x series.

@jbrockmendel
Member

I am reluctant to add a regression tag here as I think the new behaviour is more correct

Agreed.

@MarcoGorelli
Member

I am reluctant to add a regression tag here as I think the new behaviour is more correct

Agreed.

So, should the release notes be clarified, or can this be closed? @jbrockmendel

@jbrockmendel
Member

If someone wants to flesh out a release note, I won't object.

MarcoGorelli added the Docs and good first issue labels and removed the Closing Candidate label on Feb 14, 2021
@SophiaTangHart

I'd like to take this and clarify the release note. Just to let you know, this is my first time contributing.

@MarcoGorelli
Member

Sure, go ahead - here's the contributing guide https://pandas.pydata.org/docs/dev/development/contributing_docstring.html , please do ask (here or on Slack) if anything's not clear

@SophiaTangHart

@MarcoGorelli, Thank you. I'm reading the contributing guide.
I'm also new to Slack. How do I join Pandas Slack group? Do I refer to version 1.1.4 #38519 if I need clarification? Do I need to address certain people? Thanks!

@MarcoGorelli
Member

Hey - for Slack: https://pandas.pydata.org/docs/dev/development/community.html?highlight=slack#community-slack

I'd suggest just rewording

- Fixed regression in inplace arithmetic operation on a Series not updating the parent DataFrame (:issue:`36373`)

, e.g. to

Fixed regression in inplace arithmetic operation (e.g. `+=`) on a Series not updating the parent DataFrame/Series

@SophiaTangHart

@MarcoGorelli, thank you for directing me to join Slack. And thank you for your suggestion.

I'm not able to open

  • Fixed regression in inplace arithmetic operation on a Series not updating the parent DataFrame (:issue:36373)

to reword it. How can I access the docstring? Do I need to do it through my GitHub account? Thank you.

@MarcoGorelli
Member

Do I need to do it through my GitHub account?

Yeah, you'll need to fork the repo, clone it, check out a new branch, stage, commit, push, and open a pull request. If you read through https://pandas.pydata.org/docs/dev/development/contributing.html it should all be explained; otherwise we're available on Slack to help out.

@SophiaTangHart

take

@kathleenhang
Contributor

@SophiaTangHart Hi there, do you mind if I take the issue since it seems to be inactive for some time?

SophiaTangHart removed their assignment on Jan 25, 2023
@SophiaTangHart

Sure. Sorry for taking a while. I've unassigned myself. Thanks.

@kathleenhang
Contributor

take
