New characteristic to cumsum #28127
Comments
With a general-case lambda, I don't think we could implement anything faster than the loop described in the Stack Overflow link. For that particular use case, the answers given there look pretty well optimized. Are there other cases you have in mind, e.g. some parametrized class of functions?
The solution in the link is very fast. The problem appears, for example, when you are working with percentages, because that case leads to this: "Please be mindful that for extremely huge input array cases if the b elements are such small fractions, because of cummulative operations, the initial numbers of b_rev_cumprod might come out as zeros resulting in NaNs in those initial places." That is the last part of the linked answer, and it is exactly what happened to me: I use a constant with a value of 0.03, and when I run this on a DataFrame with 5003 rows it has to calculate (0.03)^5003, which underflows to zero and ends up as NaN. I was wondering whether it would be possible to implement at least the for loop in Cython to speed up the calculation, but if that is not possible, no problem. Important: if a lambda is not an option, it would still be useful to be able to pass another Series or a NumPy array, to compute for example ((0 + a[0]) * b[0]) + a[1].
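To make the two points above concrete, here is a minimal sketch, assuming the recurrence is the one implied by `((0 + a[0]) * b[0]) + a[1]`, i.e. `out[0] = a[0]` and `out[i] = out[i-1] * b[i-1] + a[i]`. The helper name `cumsum_with_decay` and the sample data are hypothetical, not part of pandas or NumPy. It first reproduces the underflow in the reversed cumulative product from the quoted warning, then runs the plain loop, which never forms that product.

```python
import numpy as np


def cumsum_with_decay(a, b):
    """Hypothetical helper: out[0] = a[0], out[i] = out[i-1] * b[i-1] + a[i]."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    out = np.empty_like(a)
    acc = 0.0
    for i in range(len(a)):
        acc += a[i]   # add the current element to the running total
        out[i] = acc  # record the running total
        acc *= b[i]   # scale the total before the next addition
    return out


n = 5003                # row count mentioned in the comment
a = np.ones(n)          # assumed sample data
b = np.full(n, 0.03)    # constant multiplier mentioned in the comment

# The reversed cumulative product multiplies 5003 copies of 0.03 together,
# which underflows to zero in double precision.
b_rev_cumprod = b[::-1].cumprod()[::-1]
print(b_rev_cumprod[0])  # 0.0

# The explicit loop only ever scales the running total by one factor at a time.
out = cumsum_with_decay(a, b)
print(np.isfinite(out).all())  # True
```

The loop is slow in pure Python for large inputs, which is the motivation for asking about a compiled implementation; the sketch only illustrates the arithmetic.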
Is there a general-purpose application like this that you have come across in other languages? It might be a little too niche for pandas.
Hi, I think this is the kind of function I'm looking for: https://stackoverflow.com/questions/35004945/reduce-function-for-series. reduce produces a single value, but it works on pairs; what I'm describing operates on (row1, row2), (row2, row3), and so on. This function could be very useful to pandas and to my actual problem. I wrote this code to explain it better: s = pd.Series([[i, 3] for i in range(10)], index=range(10)) The result would be something like a cumulative sum, but with a multiplication applied. Of course this is not very useful in practice because of the concatenation done in the lambda function. I don't think this option currently exists in pandas or NumPy; if it does, please tell me what it is called.
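A reduction that keeps every intermediate result is commonly called a scan (or prefix scan), and Python's standard library provides one as `itertools.accumulate`. The following is a minimal sketch, assuming the intent is "cumulative sum, but multiply the running total by a constant before each addition"; the constant `k` and the sample values are hypothetical.

```python
from itertools import accumulate

values = [1.0, 2.0, 3.0, 4.0]  # assumed sample data
k = 0.5                        # hypothetical constant multiplier

# accumulate calls the function with (running_result, next_value) and
# yields every intermediate result, unlike functools.reduce, which
# returns only the final one.
result = list(accumulate(values, lambda acc, x: acc * k + x))
print(result)  # [1.0, 2.5, 4.25, 6.125]
```

For a Series `s`, the same call can be wrapped as `pd.Series(list(accumulate(s, func)), index=s.index)`. This still runs at Python-loop speed, so it addresses the naming question rather than the performance one.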
#4567 might be related.
Agreed that this may be too niche for pandas; #4567 also concludes that this is out of scope. Thanks for the suggestion, but closing since there's not enough vetted support from the core devs.
Hi, I would like to know whether it is possible to add to the cumsum method an option to apply an operation (multiplication by a constant, for example, or a lambda function) at each step of the sum, to get something like this: https://stackoverflow.com/questions/34182755/cumulative-addition-multiplication-in-numpy. I know this may look a little unnecessary, but this kind of cumulative sum appears in many formulas for economic indices, and I think there are other cases where it would be useful. The reason for this request is that when the data is very large, a plain Python loop is not a good option because of the time it takes, and even the solution in the link is not a good idea because of floating-point precision.