ENH: Allow inplace arithmetic operations #5104


Closed
jtratner opened this issue Oct 4, 2013 · 15 comments · Fixed by #8520
Labels: Internals (related to non-user accessible pandas implementation); Numeric Operations (arithmetic, comparison, and logical operations)

Comments

@jtratner
Contributor

jtratner commented Oct 4, 2013

In current master (and before the initial arithmetic refactor), __iadd__ doesn't do anything in place; it's just a synonym for __add__ (or not defined). (For reference, v0.12.0 had __iadd__ = __add__ on Series and no __iadd__ defined on DataFrame.) It would be great to support in-place ops for += and friends. Is this possible to do?

@jreback
Contributor

jreback commented Oct 4, 2013

with dtype conversion....seems ok?

In [19]: s = Series([1,2,3])

In [20]: s += 1.5

In [21]: s
Out[21]: 
0    2.5
1    3.5
2    4.5
dtype: float64

@jtratner
Contributor Author

jtratner commented Oct 4, 2013

It works, but it's not actually in place (with +=, Python falls back to __add__ if __iadd__ isn't defined):

In [1]: import pandas

In [2]: from pandas import *

In [3]: s = Series([1, 2, 3])

In [4]: s2 = s

In [5]: s += 1.5

In [6]: s
Out[6]:
0    2.5
1    3.5
2    4.5
dtype: float64

In [7]: s2
Out[7]:
0    1
1    2
2    3
dtype: int64

In [8]:
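
The fallback described above can be reproduced without pandas. The following is a minimal sketch using two hypothetical toy classes (`AddOnly` and `WithIAdd` are illustrative names, not pandas classes): one defines only `__add__`, the other also defines `__iadd__`.

```python
class AddOnly:
    """Hypothetical container that defines __add__ but not __iadd__."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        # Always builds and returns a brand-new object.
        return type(self)(v + other for v in self.values)


class WithIAdd(AddOnly):
    """Same container, plus a real in-place __iadd__."""

    def __iadd__(self, other):
        # Mutate the existing list and return self so the name is not rebound.
        self.values[:] = [v + other for v in self.values]
        return self


s = AddOnly([1, 2, 3])
s2 = s          # second name for the same object
s += 1.5        # falls back to s = s.__add__(1.5)

t = WithIAdd([1, 2, 3])
t2 = t
t += 1.5        # calls t.__iadd__(1.5)

print(s is s2, s2.values)  # False [1, 2, 3]
print(t is t2, t2.values)  # True [2.5, 3.5, 4.5]
```

With only `__add__`, the name `s` is rebound to a new object and `s2` keeps the old values, which matches the Series session above.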

@jreback
Contributor

jreback commented Oct 4, 2013

oh...i c....prob don't have tests for that I thought was ok (did it work before?)

@jtratner
Contributor Author

jtratner commented Oct 4, 2013

as I said, hasn't worked that way since at least v0.12, not sure if earlier.

@jtratner
Contributor Author

jtratner commented Oct 4, 2013

I'll take a look

@jtratner
Contributor Author

jtratner commented Oct 4, 2013

Looked back, hasn't (truly) supported inplace ops since at least 0.9 (at least not for DataFrame, and Series appears to have overridden inplace ops)

@jreback
Contributor

jreback commented Oct 4, 2013

ok....should be straightforward in any event

@BrenBarn

BrenBarn commented Nov 8, 2013

Would also be nice to have in-place versions of div, add, etc., with the axis argument allowing in-place operations on any axis.

@jtratner
Contributor Author

@BrenBarn even if you had inplace ops, pandas isn't actually going to do it inplace (i.e., it'll still allocate new memory for the final arrays). It would be relatively easy to add an inplace keyword argument, because we now have this _update_inplace method.

My problem is that it's a total lie to the end user: you think you're doing an operation in place and therefore saving memory, but all you'd actually get is a convenience function that just updates the wrapper.
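
To illustrate the concern, here is a rough sketch of an `inplace` keyword that only swaps the wrapper's reference. `Box` is a hypothetical stand-in, not pandas code, and it does not use the real `_update_inplace` method; it only assumes the same general idea of replacing the data a wrapper points to.

```python
import numpy as np


class Box:
    """Hypothetical wrapper around a NumPy array (not a pandas class)."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def add(self, other, inplace=False):
        result = self.data + other      # allocates a new array either way
        if inplace:
            self.data = result          # only the wrapper's reference changes
            return None
        return Box(result)


box = Box([1, 2, 3])
alias = box
old_buffer = box.data

box.add(1.5, inplace=True)

print(alias is box)             # True: same wrapper object
print(box.data is old_buffer)   # False: the buffer was replaced
print(old_buffer)               # [1 2 3], the original memory is untouched
```

Every alias of the wrapper sees the update, but no memory is saved, because the sum is still computed into a fresh array.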

@BrenBarn

I see. So you mean it's simply not possible to update Pandas structures in-place at all? That is unfortunate.

@jreback
Contributor

jreback commented Nov 11, 2013

@BrenBarn the structures CAN be updated in place in some cases (a lot depends on what kind of operation it is). For example, the two operands would have to be alignable without a copy, and the operation itself would then be done in place; arithmetic is straightforward. Any type of reshape or dtype change would invalidate this.

Would welcome some tests for this; that is the main issue. I don't think making it work is all that hard.

@BrenBarn

But this whole issue is about arithmetic operations, right? I'm not saying every possible manipulation should be available in-place, just basic ones that can already be done in-place on numpy arrays (e.g., add, div, etc.). Probably there could be a check to see that the data have the same indexes and same dtype and an error raised if the operation can't succeed.
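
A check of the kind suggested here might look roughly like the sketch below. `checked_iadd` is a hypothetical helper written against plain NumPy arrays, with simple label lists standing in for an index; it is not an existing pandas or NumPy function.

```python
import numpy as np


def checked_iadd(values, labels, other_values, other_labels):
    """Add other_values into values in place, or raise if that is not possible.

    Hypothetical helper: labels are plain sequences standing in for an index.
    """
    if list(labels) != list(other_labels):
        raise ValueError("labels differ; alignment would require a copy")
    if values.dtype != other_values.dtype:
        raise TypeError("dtypes differ; result would not fit the existing buffer")
    np.add(values, other_values, out=values)  # writes into the existing buffer
    return values


a = np.array([1, 2, 3])
b = np.array([10, 20, 30])
result = checked_iadd(a, ["x", "y", "z"], b, ["x", "y", "z"])
print(result is a, a)  # True [11 22 33]

try:
    checked_iadd(a, ["x", "y", "z"], np.array([0.5, 0.5, 0.5]), ["x", "y", "z"])
except TypeError as exc:
    print("refused:", exc)
```

Raising keeps the behaviour explicit: the operation either reuses the existing buffer or fails, with no silent copy.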

@jreback
Contributor

jreback commented Nov 11, 2013

@BrenBarn this could be done when the op doesn't change dtype. However, I think this might actually be more confusing, because there would be a subset of ops that are actually in-place, while most would not modify the underlying data but replace it (as happens now in all cases). This would be deterministic, but IMHO confusing to the user.

A simple case where it's not possible is a dtype change, e.g.

In [1]: s = Series([1,2,3])

In [2]: id(s.values)
Out[2]: 58659888

In [3]: s += 1.5

In [4]: id(s.values)
Out[4]: 58634416

NumPy keeps the same object, but does the wrong thing IMHO:

In [5]: x = np.array([1,2,3])

In [10]: id(x)
Out[10]: 61601360

In [11]: x += 1.5

In [12]: id(x)
Out[12]: 61601360

In [13]: x
Out[13]: array([2, 3, 4])
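
For readers on newer NumPy: the session above reflects 2013-era behaviour. Since NumPy 1.10, in-place operations default to "same_kind" casting, so the same statement raises instead of truncating. A small sketch, assuming a NumPy release of 1.10 or later:

```python
import numpy as np

x = np.array([1, 2, 3])

try:
    x += 1.5                     # float result cannot be cast to int "same_kind"
    raised = False
except TypeError:                # NumPy raises a TypeError subclass here
    raised = True

print(raised, x)                 # on NumPy >= 1.10: True [1 2 3]

# Opting in to the old truncating behaviour has to be explicit.
np.add(x, 1.5, out=x, casting="unsafe")
print(x)                         # [2 3 4]
```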

@jtratner
Contributor Author

just to echo @jreback - I did not mean it's impossible to update inplace,
it's just that generally because of pandas' flexibility a number of ops
cause copies.

@jreback jreback modified the milestones: 0.15.0, 0.14.0 Mar 28, 2014
@shoyer
Member

shoyer commented Aug 9, 2014

In-place operations have unfortunate complications when coupled with automatic index-based alignment, as I discovered when I tried to implement this for xray:
pydata/xarray#184

The problem is that you can end up with (int, int) operations giving you complete garbage if the second argument needs to be realigned and hence ends up with a bunch of NaN values.

NumPy does not catch in-place integer operations with non-scalar NaN values (even though it probably should raise)... so you can end up with complete garbage, not just bad rounding:

>>> x = np.array([0])
>>> x += np.array([np.nan])
>>> x
array([-9223372036854775808])
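
The alignment problem can be sketched without pandas or xray. In the example below, `align` is a hypothetical helper that mimics index-based realignment by filling missing labels with NaN; the guard uses `np.can_cast` to decide whether an in-place write would be safe. This is one possible check, not what either library actually does.

```python
import numpy as np


def align(values, labels, target_labels):
    """Hypothetical alignment: reorder values to target_labels, NaN where missing."""
    lookup = dict(zip(labels, values))
    return np.array([lookup.get(label, np.nan) for label in target_labels], dtype=float)


target = np.array([1, 2, 3])    # integer data labelled a, b, c
other = align(np.array([10, 20]), ["a", "b"], ["a", "b", "c"])

print(other)                    # [10. 20. nan]
print(np.isnan(other).any())    # True

# A guard that refuses the in-place path instead of casting NaN to int.
can_write_in_place = np.can_cast(other.dtype, target.dtype, casting="same_kind")
print(can_write_in_place)       # False

result = target + other         # out-of-place fallback, float64
print(result)                   # [11. 22. nan]
```

Both operands started as integers, but realignment introduces NaN and forces a float result, so the integer buffer cannot hold the answer.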
