PERF: faster unstacking #15510
21958b7
to
1bfa04c
Compare
Codecov Report
@@ Coverage Diff @@
## master #15510 +/- ##
==========================================
- Coverage 91.04% 91.02% -0.03%
==========================================
Files 136 136
Lines 49088 49105 +17
==========================================
+ Hits 44694 44698 +4
- Misses 4394 4407 +13
cc @wesm if you have a chance.
Very nice. Don't love expansions of our sprawling Cython codebase, but this seems like a solid win as a pretty central data manipulation.
closes pandas-dev#15503

Author: Jeff Reback <[email protected]>

Closes pandas-dev#15510 from jreback/reshape3 and squashes the following commits:

ec29226 [Jeff Reback] PERF: faster unstacking
@@ -182,9 +185,21 @@ def get_new_values(self):
stride = values.shape[1]
result_width = width * stride
result_shape = (length, result_width)
mask = self.mask
mask_all = mask.all()
@jreback could use some help grokking how we get here. In Block._unstack, adding the assertion assert mask.all() doesn't break any tests. Is that something we can rely on? (If so, we can simplify the code a good bit.) If not, how can we construct a counter-example?
closes #15503
So on a non-masked unstack (IOW, a full product MultiIndex, for example), this is now just a simple reshape. On a masked unstack, it will now have a much lower constant factor, as it's in Cython and releases the GIL.
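To illustrate the non-masked case described above, here is a minimal sketch using only the public pandas API. It is not the code from this PR; the names `index`, `s`, `reshaped`, and `masked` are made up for the example. It checks that, for a sorted full product MultiIndex, the unstacked values match a plain NumPy reshape, and that removing one row leaves a hole that has to be filled.

```python
import numpy as np
import pandas as pd

# Full product MultiIndex: every (outer, inner) pair is present, in sorted order.
index = pd.MultiIndex.from_product(
    [["a", "b", "c"], [0, 1]], names=["outer", "inner"]
)
s = pd.Series(np.arange(6, dtype="float64"), index=index)

unstacked = s.unstack("inner")

# No pair is missing, so the unstacked values equal the original values
# reshaped to (number of outer labels, number of inner labels).
reshaped = s.to_numpy().reshape(3, 2)
same_as_reshape = np.array_equal(unstacked.to_numpy(), reshaped)

# Remove the row at position 3, i.e. ("b", 1). The index is no longer a
# full product, so the result has a missing cell that is filled with NaN.
masked = s.iloc[[0, 1, 2, 4, 5]].unstack("inner")
has_hole = bool(masked.isna().to_numpy().any())

print(same_as_reshape, has_hole)
```

The second half is the masked case: the result keeps the same shape, but one cell has no source value.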
0.19.2 / master
PR