Skip to content

Sparse Ops Cleanup #19782

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 21, 2018
Merged

Sparse Ops Cleanup #19782

merged 2 commits into from
Feb 21, 2018

Conversation

jbrockmendel
Copy link
Member

This is in preparation for implementing the direct-pinning discussed here. The main roadblock to that is SparseSeries which calls ops.add_special_arithmetic_methods twice. This PR gets rid of the double-call, so the follow-up will be straight-forward.

A handful of cleanups are done here too:

  • unrelated docstring fix for TimedeltaIndex.replace
  • minor formatting improvements
  • remove an unused kwarg for DataFrame._compare_frame
  • remove a loop in add_flex_arithmetic_methods, replace with an assertion that it is unnecessary
  • fix names for some SparseArray method names, e.g. ATM SparseArray.__eq__.__name__ == 'eq'

def wrapper(self, other):
is_self_int_dtype = is_integer_dtype(self.dtype)

fill_int = lambda x: x.fillna(0)
fill_bool = lambda x: x.fillna(False).astype(bool)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving these outside of the function scope to clarify they don't need to be re-defined on each call. _bool_method_SERIES is the topic of a separate PR, but this is sufficiently small to include in assorted-cleanups.

@codecov
Copy link

codecov bot commented Feb 20, 2018

Codecov Report

Merging #19782 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #19782      +/-   ##
==========================================
+ Coverage   91.61%   91.61%   +<.01%     
==========================================
  Files         150      150              
  Lines       48887    48882       -5     
==========================================
- Hits        44786    44782       -4     
+ Misses       4101     4100       -1
Flag Coverage Δ
#multiple 89.98% <100%> (ø) ⬆️
#single 41.79% <28.57%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexes/timedeltas.py 90.56% <ø> (ø) ⬆️
pandas/core/sparse/series.py 95.25% <ø> (-0.02%) ⬇️
pandas/core/ops.py 96.86% <100%> (+0.12%) ⬆️
pandas/core/sparse/frame.py 94.81% <100%> (ø) ⬆️
pandas/core/frame.py 97.22% <100%> (ø) ⬆️
pandas/core/sparse/array.py 91.46% <100%> (+0.08%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update af5e8ec...2b81a5c. Read the comment docs.

@@ -1568,7 +1562,7 @@ def wrapper(self, other):
dtype = getattr(other, 'dtype', None)
other = SparseArray(other, fill_value=self.fill_value,
dtype=dtype)
return _sparse_array_op(self, other, op, name)
return _sparse_array_op(self, other, op, name, series=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this whole sparse interaction is pretty hacky (I know not addressing this now), but....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, there are a couple of improvements in store for the pass after next (which is just implementing the single-call ops.add_whatever_methods(cls))

@@ -1591,4 +1583,5 @@ def wrapper(self, other):

sparse_series_special_funcs = dict(arith_method=_arith_method_SPARSE_SERIES,
comp_method=_arith_method_SPARSE_SERIES,
bool_method=None)
bool_method=_bool_method_SERIES)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we hold off adding till then

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is necessary to get rid of the double-call in sparse.series.

@jreback jreback added Sparse Sparse Data Type Clean labels Feb 20, 2018
@jreback jreback added this to the 0.23.0 milestone Feb 21, 2018
@jreback jreback merged commit 8382067 into pandas-dev:master Feb 21, 2018
@jreback
Copy link
Contributor

jreback commented Feb 21, 2018

thanks

@jbrockmendel jbrockmendel deleted the spops3 branch February 21, 2018 00:18
harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Sparse Sparse Data Type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants