New characteristic to cumsum #28127

Closed
josephnowak opened this issue Aug 24, 2019 · 6 comments
Labels
Enhancement Numeric Operations Arithmetic, Comparison, and Logical operations

Comments

@josephnowak

Hi, I would like to know whether it is possible to add an option to the cumsum method to apply an operation (multiplication by a constant, for example, or a lambda function) at every step of the sum, to get something like this: https://stackoverflow.com/questions/34182755/cumulative-addition-multiplication-in-numpy. I know this may look a little unnecessary, but this kind of cumulative sum appears in many economic index formulas, and I think there are other cases where it would be useful. The reason for this request is that when the data is very large, a plain Python loop is not a good option because of the time it takes, and even the solution in the link is not a good idea because of floating-point precision.
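For readers following along, here is a minimal sketch of the kind of cumulative sum being requested, written as the plain Python loop the issue wants to avoid. It assumes the recurrence `out[i] = out[i-1] * b + a[i]` with a constant factor `b` and a starting value of 0; the function name `cumsum_with_factor` is hypothetical and not part of pandas or NumPy.

```python
import numpy as np


def cumsum_with_factor(a, b):
    """Cumulative sum where the running total is multiplied by b before each addition.

    Assumed recurrence (not an existing pandas/NumPy API):
        out[i] = out[i-1] * b + a[i], with out[-1] taken as 0.
    """
    a = np.asarray(a, dtype=float)
    out = np.empty_like(a)
    acc = 0.0
    for i, x in enumerate(a):
        acc = acc * b + x  # scale the running total, then add the new element
        out[i] = acc
    return out


# With b = 1 this reduces to an ordinary cumulative sum.
print(cumsum_with_factor([1, 2, 3], 1.0))
print(cumsum_with_factor([1, 2, 3], 2.0))
```

The loop is numerically well behaved because it never forms large or tiny powers of `b` explicitly, but it runs at Python speed, which is the performance concern raised above.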

@jbrockmendel
Member

With a general-case lambda, I don't think we could implement anything faster than the loop described in the Stack Overflow link. For that particular use case, the answers given there look pretty well optimized.

Are there other cases you have in mind? e.g. some parametrized class of functions?

@josephnowak
Author

josephnowak commented Aug 25, 2019

The solution in the link is very fast. The problem arises when, for example, you are working with percentages, because that leads to the caveat quoted at the end of the linked answer: "Please be mindful that for extremely huge input array cases if the b elements are such small fractions, because of cummulative operations, the initial numbers of b_rev_cumprod might come out as zeros resulting in NaNs in those initial places." That is exactly what happens to me: I use a constant with a value of 0.03, and when I run this on a df with 5003 rows it has to calculate (0.03)^5003, which underflows to zero and produces NaN. I was wondering whether it is possible to implement at least the for loop in Cython to speed up the calculation, but if that is not possible there is no problem. Important: if a lambda is not an option, it would be useful to be able to pass another Series, or a numpy array, to compute something like ((0 + a[0]) * b[0]) + a[1].
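The failure mode described above can be reproduced with a short sketch. It is not the code from the linked answer; it assumes the recurrence `out[i] = out[i-1] * b[i] + a[i]` and, for constant `b`, the closed form `out[i] = b**i * cumsum(a / b**i)`, which is the same style of vectorization. The names `scan_loop`, `powers`, `vectorized`, and `looped` are illustrative only.

```python
import numpy as np


def scan_loop(a, b):
    """Plain loop: out[i] = out[i-1] * b[i] + a[i], starting from a running total of 0."""
    out = np.empty(len(a), dtype=float)
    acc = 0.0
    for i in range(len(a)):
        acc = acc * b[i] + a[i]
        out[i] = acc
    return out


n = 5003
a = np.ones(n)
b = np.full(n, 0.03)  # a second array of factors, constant here

# Vectorized closed form for constant b. The explicit powers of 0.03
# underflow to exactly 0.0 in float64, so the division yields inf and
# the final product yields inf or NaN.
with np.errstate(all="ignore"):
    powers = 0.03 ** np.arange(n)
    vectorized = powers * np.cumsum(a / powers)

looped = scan_loop(a, b)

print("smallest power:", powers[-1])
print("vectorized result all finite:", np.isfinite(vectorized).all())
print("loop result all finite:", np.isfinite(looped).all())
```

The loop stays finite because each step only multiplies the current total by one factor, whereas the closed form needs `b**i` for every `i` at once. Passing `b` as an array, as in `scan_loop`, also covers the "pass another Series" variant mentioned in the comment.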

@WillAyd
Member

WillAyd commented Aug 25, 2019

Is there a general purpose application like this that you may have come across in other languages? Might be a little too niche for pandas

@josephnowak
Author

Hi, I think this is the kind of function I'm looking for: https://stackoverflow.com/questions/35004945/reduce-function-for-series. reduce produces a single value, but it works on pairs; what I'm talking about is something that operates on (row1, row2), (row2, row3), ..., and so on, keeping every intermediate result. This function could be very useful for pandas and for my actual problem. I wrote this code to explain it better:

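from functools import reduce
import pandas as pd

# Each row is [a_i, b_i]; the first row, [0, 0], seeds the accumulator list.
s = pd.Series([[0, 0]] + [[i, 3] for i in range(1, 10)], index=range(10))
print(s.values)
# At every step, append a_i + previous_result * b_i to the running list.
print(reduce(lambda l1, row: l1 + [row[0] + l1[-1] * row[1]], s.values))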

The result is something like a cumulative sum, but with a multiplication applied at each step.

Of course, this is not very useful in practice because of the list concatenation inside the lambda function. I don't think this option currently exists in pandas or numpy; if it does, please tell me what it is called.
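Outside pandas and NumPy, the operation described here (a reduce that keeps every intermediate value) is commonly called a scan, and the Python standard library provides it as `itertools.accumulate`. The sketch below is one way to express the example above without list concatenation; it assumes Python 3.8+ for the `initial` argument, and the variable names are illustrative.

```python
from itertools import accumulate

import pandas as pd

a = pd.Series(range(10))   # values to add
b = pd.Series([3] * 10)    # factor applied to the running total at each step

# accumulate yields every intermediate result; initial=0 seeds the running
# total and is dropped from the output with [1:].
steps = accumulate(
    zip(a.tolist(), b.tolist()),
    lambda acc, row: row[0] + acc * row[1],
    initial=0,
)
result = pd.Series(list(steps)[1:], index=a.index)
print(result.tolist())
```

This still runs at Python speed, so it addresses the awkwardness of the `reduce` version rather than the performance concern.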

@jbrockmendel
Member

#4567 might be related

@mroeschke mroeschke added Enhancement Numeric Operations Arithmetic, Comparison, and Logical operations labels Nov 2, 2019
@mroeschke
Member

Agreed that this may be too niche for pandas; #4567 also discusses this being out of scope. Thanks for the suggestion, but closing since there's not enough vetted support from the core devs.
