-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
CLN/PERF: remove ndarray.take and platform int conversions #13924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
More broadly, is there ever a reason to coerce indexers back to a platform int? eg. here Even ignoring the conversion cost, it seems like In [1]: np.random.seed(123)
In [2]: a = np.random.randn(1000000)
In [3]: idx = np.random.randint(0, 1000000, size=1000000).astype('int64')
In [4]: idx_plat = idx.astype(np.int_)
In [5]: %timeit a.take(idx)
100 loops, best of 3: 13.9 ms per loop
In [6]: %timeit a.take(idx_plat)
100 loops, best of 3: 17.5 ms per loop |
Current coverage is 85.30% (diff: 100%)@@ master #13924 diff @@
==========================================
Files 139 139
Lines 50143 50138 -5
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
- Hits 42777 42768 -9
- Misses 7366 7370 +4
Partials 0 0
|
there was some discussion of this quite a while ago so you could cast as appropriate in only certain situations (vs all of the ensure stuff that we do) I think we have enough tests that you could try things and see what breaks |
pls rebase as just merged: #13925 |
Ah, so |
xref #3033 - platform int stuff is a can of worms |
right that was the issue, yes it is a can-o-worms |
I guess I'm re-purposing this to now be about platform ints, closing #3033. Assuming we actually want to do this, I'd still need to adjust a bunch of tests and root out any remaining corner cases, but here's essentially what I've done / think I'm proposing:
This generally should help performance on Windows 64, by avoiding unneeded casts and using a better indexer. I'm hoping neutral, to slighter better (fewer checks) elsewhere. |
sorter = values.argsort() | ||
ordered = values.take(sorter) | ||
sorter = _ensure_int64(values.argsort()) | ||
ordered = take_nd(values, sorter, allow_fill=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would do as many ensures inside take_nd that u can - so we can just call it with anything and it will work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree - take_nd
already does this check - so I didn't need this
I think I went down the wrong path here - there are places where we were implicitly relying on numpy's boundschecking, and I worry about introducing segfaults with the unchecked So instead, I think I may just redefine platform_int to |
Closed in favor of #13972 |
I was looking into #13745 (GIL on merge), and there's some low hanging fruit in the non-parallel case.
any
callsLinux 64 this doesn't really make a difference.