-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
PERF/COMPAT: define platform int to np.intp #13972
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Current coverage is 85.29% (diff: 100%)@@ master #13972 diff @@
==========================================
Files 139 139
Lines 50219 50245 +26
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 42819 42854 +35
+ Misses 7400 7391 -9
Partials 0 0
|
@@ -767,6 +767,40 @@ Note that the limitation is applied to ``fill_value`` which default is ``np.nan` | |||
- Bug in ``SparseSeries.abs`` incorrectly keeps negative ``fill_value`` (:issue:`13853`) | |||
- Bug in single row slicing on multi-type ``SparseDataFrame``s, types were previously forced to float (:issue:`13917`) | |||
|
|||
Indexer dtype Changes | |||
^^^^^^^^^^^^^^^^^^^^^ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add the issue ref here.
yes this looks reasonable. go ahead an make ready on windows. Also add an asv as above. |
pls update the docs as indicated. otherwise lgtm. |
What else were you looking for in the docs? I did add an issue ref in the first paragraph. |
ok this looks fine. can you run this on 32-bit linux as well and see that it comes up clean (IIRC you tests on windows so that should be good). |
The changes I just pushed were to get this passing on 32 bit windows. I don't have anything handy to test 32 bit Linux, I can probably set up a VM, although I think it should generally act the same 32 bit Windows ( |
ok I'll run on 32 bit tomorrow I just use a macosx / vm |
This currently fails on 32-bit linux (before your PR). I recall had to do something on windows to get this to pass. Something wrong somewhere. Your PR fixes this! yeah!
So as-is this passes on linux32 (and fixes the above). lmk when good to go. |
Cool - fixed the lint issue so this should be good to go. |
thanks! |
AFAIK this only affects 64 bit python on Windows.
numpy
wants annp.intp
(i8 on Windows) as a indexer fortake
, but pandas defines a "platform int" as anp.int_
(i4 on Windows). This hits performance twice, because we often start with i8, cast to i4, then numpy will cast back to i8 in itstake
.This is an alternative to #13924 - there I explored replacing
ndarray.take
with ourtake_nd
fully, but this approach solves the perf problem without much pain or risking new segfaults.I'd still need to adjust a bunch of tests here to pass on Windows. This is an API change for "advanced" methods like
get_indexer
, but I don't think anything is necessary beyond a doc note?ASV: