-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
API: Return sparse objects always for cumsum #14771
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
""" | ||
nv.validate_cumsum(args, kwargs) | ||
|
||
# TODO: gh-12855 - return a SparseArray here | ||
if notnull(self.fill_value): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can use self._null_fill_value
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
Returns | ||
------- | ||
cumsum : SparseSeries if `self` has a null `fill_value` and a | ||
generic Series otherwise | ||
cumsum : SparseSeries | ||
""" | ||
nv.validate_cumsum(args, kwargs) | ||
new_array = SparseArray.cumsum(self.values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because self.values
is SparseArray
, this wrapping looks unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Done.
22049b4
to
59fd828
Compare
59fd828
to
b2af5ee
Compare
b2af5ee
to
6ab170f
Compare
Current coverage is 85.26% (diff: 100%)@@ master #14771 diff @@
==========================================
Files 144 144
Lines 50981 50983 +2
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 43470 43472 +2
Misses 7511 7511
Partials 0 0
|
6ab170f
to
39eb909
Compare
|
||
Parameters | ||
---------- | ||
axis : int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this for compat? why are you writing it in the doc-string? (rather than raising on it if its not 0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback : The parameter was in the signature but undocumented. I believe it is for compat since SparseDataFrame
actually uses this argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback : Should we just do validation as part of this PR for both SparseArray
and SparseSeries
then?
|
||
Returns | ||
------- | ||
cumsum : SparseSeries if `self` has a null `fill_value` and a | ||
generic Series otherwise | ||
cumsum : SparseSeries | ||
""" | ||
nv.validate_cumsum(args, kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.g. I would do axis = self._get_axis_number(axis)
(which validates).
and something similar to SparseArray
.
We normally don't list these in the doc-string Parameters (even though they are in the signature)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, though it was misleading because I had initially assumed the axis
parameter did something, even though it doesn't do anything in this implementation.
if notnull(self.fill_value): | ||
return self.to_dense().cumsum() | ||
if not self._null_fill_value: | ||
return SparseArray(self.to_dense()).cumsum() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
side thing. we should use _constructor
(and define it for SparseArray
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can make a separate issue / do later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good as a follow-up.
---------- | ||
axis : int | ||
Axis over which to perform the cumulative summation. This | ||
parameter is ignored because `SparseSeries` is 1-D, but it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason you are not matching the axis parameter for Series?
Help on function cumsum in module pandas.core.series:
cumsum(self, axis=None, skipna=True, *args, **kwargs)
Return cumulative sum over requested axis.
Parameters
----------
axis : {index (0)}
skipna : boolean, default True
Exclude NA/null values. If an entire row/column is NA, the result
will be NA
Returns
-------
cumsum : scalar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason really. Done.
ac042da
to
a472432
Compare
@jreback : I addressed all of the comments, and everything is passing. Ready to merge if there are no other concerns. |
|
||
Parameters | ||
---------- | ||
axis : 0 or None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it cannot be None, must be 0 (as you set the default)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
# TODO: gh-12855 - return a SparseSeries here | ||
return Series(new_array, index=self.index).__finalize__(self) | ||
if axis is not None: | ||
self._get_axis_number(axis) # Unused but hould be valid! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we only do
axis=self._get_axis_number(axis)
and remove the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Always return SparseArray and SparseSeries for SparseArray.cumsum() and SparseSeries.cumsum() respectively, regardless of fill_value. Close pandas-devgh-12855.
a472432
to
83314fc
Compare
@jreback : Applied requested changes, and all is green. Ready to merge if there are no other concerns. |
lgtm. |
thanks! |
Always return `SparseArray` and `SparseSeries` for `SparseArray.cumsum()` and `SparseSeries.cumsum()` respectively, regardless of `fill_value`. Closes pandas-dev#12855. Author: gfyoung <[email protected]> Closes pandas-dev#14771 from gfyoung/sparse-return-type and squashes the following commits: 83314fc [gfyoung] API: Return sparse objects always for cumsum
Always return
SparseArray
andSparseSeries
forSparseArray.cumsum()
andSparseSeries.cumsum()
respectively, regardless offill_value
.Close #12855.