PERF: Faster comparisons of indexes when compared to self #37109
That works for int dtype, but what about dtypes that have NAs?
Hmm, that's a good point. I'll see if anything can be done about that.
you could return
Even then you'd need to watch out for pd.NA. I think just check for _can_hold_na. Also, for RangeIndex you could compare the range objects instead of (I think) casting to ndarray.
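The NA concern raised above can be reproduced with a small float index. This is only an illustrative sketch: the sample values and the variable names `naive` and `actual` are assumptions, not code from the PR. It shows that a short-circuit returning an all-True array for `idx == idx` disagrees with the elementwise result as soon as a NaN is present.

```python
import numpy as np
import pandas as pd

# A float index containing a missing value (sample data chosen for illustration).
idx = pd.Index([1.0, np.nan, 3.0])

# Naive short-circuit: assume every element equals itself.
naive = np.ones(len(idx), dtype=bool)

# Elementwise comparison: NaN != NaN, so the NaN position compares False.
actual = idx == idx

print(actual)
print(np.array_equal(naive, actual))
```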
c81337c to ef5ad65
ef5ad65 to 534afbc
I've made an update. I can't find the tests for index comparisons; they clearly need some tests for comparisons with indexes that contain NaNs. My first proposal obviously shouldn't have passed. Does anyone know where they're located?
tests.arithmetic
if self.is_(other) and not isinstance(self, MultiIndex):
    # short-circuit when other is same as self
    # if we're comparing equality, return an np.ones array, else an np.zeros arr
    bool_arr_type, na_bool = {
The pattern we use elsewhere is more like:
res = np.empty(self.shape, dtype=bool)
res[:] = op.__name__ in ["eq", "le", "ge"]
if self._can_hold_na:
    ...
Would this let us remove NumericIndex._cmp_method?
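A minimal sketch of how the suggested pattern might be completed, written against public API only. `compare_to_self` is a hypothetical helper, and `hasnans`/`isna()` stand in for the private `_can_hold_na` check; the elided branch in the review comment is not known, so the masking step here is an assumption. It also assumes a numeric index whose only missing value is NaN; object-dtype indexes holding `None` or `pd.NA` may not follow the same rule.

```python
import operator

import numpy as np
import pandas as pd


def compare_to_self(idx: pd.Index, op) -> np.ndarray:
    """Hypothetical helper: compute op(idx, idx) without comparing elements.

    Assumes a numeric index whose only missing value is NaN.
    """
    res = np.empty(idx.shape, dtype=bool)
    # x == x, x <= x and x >= x hold for every non-missing x; ne, lt, gt do not.
    res[:] = op.__name__ in ("eq", "le", "ge")
    if idx.hasnans:
        # NaN compares unequal to itself, so only `ne` is True at those positions.
        res[idx.isna()] = op.__name__ == "ne"
    return res


idx = pd.Index([1.0, np.nan, 3.0])
all_ops = (operator.eq, operator.ne, operator.lt, operator.le, operator.gt, operator.ge)
for op in all_ops:
    # The shortcut should agree with the regular elementwise comparison.
    assert np.array_equal(compare_to_self(idx, op), op(idx, idx))
```

The result array is filled once with the no-NA answer and then patched at the missing positions, so no elementwise comparison of values is needed.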
@topper-123 can you merge master
@topper-123 still worth it here? can you merge master
This has already been fixed, e.g. the last example:
>>> idx = rng.astype(object)
>>> %timeit idx == idx
5.97 ms ± 12.7 µs per loop  # when this PR was opened
53.9 µs ± 9.8 µs per loop  # master
So I'll just close this one.
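For readers without IPython, the measurement above can be approximated with the standard `timeit` module. This is a rough sketch: the thread does not show how `rng` was defined, so the `RangeIndex` of 100,000 elements is an assumption, and timings will vary with machine and pandas version.

```python
import timeit

import pandas as pd

# Assumption: `rng` is a RangeIndex; its size here is chosen arbitrarily.
rng = pd.RangeIndex(100_000)
idx = rng.astype(object)

# Best of 3 repeats, 10 calls each, reported as seconds per comparison.
runs = timeit.repeat(lambda: idx == idx, number=10, repeat=3)
seconds = min(runs) / 10
print(f"{seconds * 1e6:.1f} µs per loop")
```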
Examples: