Revert "REF: remove special casing from Index.equals (always dispatchto subclass) (#35330)" #38560

Merged: 1 commit into pandas-dev:master from revert-35330 on Dec 18, 2020

simonjayhawkins
Member

This reverts commit 0b90685.

       before           after         ratio
     [54682234]       [05c97adb]
     <master>         <revert-35330>
-         293±5ms       3.75±0.2μs     0.00  index_object.IndexEquals.time_non_object_equals_multiindex

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
       before           after         ratio
     [54682234]       [05c97adb]
     <master>         <revert-35330>
-      1.44±0.1ms         800±20μs     0.56  reindex.LevelAlign.time_reindex_level

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
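
The ratio column above reads as 0.00 for the `IndexEquals` row because the two timings are in different units. A minimal sketch, assuming the ratio is the "after" time divided by the "before" time (which matches the 0.56 row); the helper `to_seconds` is a hypothetical name introduced here and is not part of asv:

```python
# Unit multipliers, in seconds, for the time suffixes used in the tables.
# Both the Greek mu and the micro sign are accepted for microseconds.
UNITS = {"ns": 1e-9, "\u03bcs": 1e-6, "\u00b5s": 1e-6, "ms": 1e-3, "s": 1.0}


def to_seconds(text):
    """Convert a string such as '293±5ms' to seconds, ignoring the ± spread."""
    central, spread = text.split("±")
    # The unit is whatever remains after the digits of the spread.
    suffix = spread.lstrip("0123456789.")
    return float(central) * UNITS[suffix]


# (before, after) pairs copied from the two tables above.
rows = [
    ("293±5ms", "3.75±0.2μs"),
    ("1.44±0.1ms", "800±20μs"),
]

ratios = [to_seconds(after) / to_seconds(before) for before, after in rows]
print([f"{r:.2f}" for r in ratios])  # ['0.00', '0.56']
```

The first ratio is about 1.3e-5, which rounds to 0.00 at two decimal places.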

@simonjayhawkins simonjayhawkins added the Performance (memory or execution speed performance) and Regression (functionality that used to work in a prior pandas version) labels Dec 18, 2020
@simonjayhawkins simonjayhawkins added this to the 1.2 milestone Dec 18, 2020
@simonjayhawkins
Member Author

more benchmark results

       before           after         ratio
     [54682234]       [05c97adb]
     <master>         <revert-35330>
-     2.97±0.09μs      2.67±0.02μs     0.90  index_object.Indexing.time_slice_step('Float')
-        554±30ns         497±10ns     0.90  index_object.Range.time_get_loc_inc
-      8.56±0.1μs       7.65±0.2μs     0.89  index_object.Indexing.time_get_loc_sorted('Float')
-      2.94±0.2μs      2.63±0.02μs     0.89  index_object.Indexing.time_slice('Float')
-        330±10μs          294±3μs     0.89  index_object.IntervalIndexMethod.time_intersection(1000)
-     5.40±0.09μs      4.79±0.07μs     0.89  index_object.Indexing.time_get_loc_sorted('Int')
-     1.02±0.03ms          898±3μs     0.88  index_object.IntervalIndexMethod.time_intersection_both_duplicate(1000)
-      5.42±0.3ms      4.75±0.01ms     0.87  index_object.SetOperations.time_operation('int', 'symmetric_difference')
-      14.4±0.5ms       12.4±0.1ms     0.86  index_object.Range.time_iter_inc
-      10.1±0.8ms      8.71±0.06ms     0.86  index_object.IntervalIndexMethod.time_intersection(100000)
-      10.3±0.9μs       8.83±0.2μs     0.86  index_object.Indexing.time_get_loc_non_unique_sorted('Int')
-      3.24±0.2μs      2.77±0.02μs     0.85  index_object.Indexing.time_slice('Int')
-         108±6ms       90.6±0.5ms     0.84  index_object.IntervalIndexMethod.time_intersection_both_duplicate(100000)
-        329±10μs          274±3μs     0.83  index_object.IntervalIndexMethod.time_intersection_one_duplicate(1000)
-        25.9±2μs       21.2±0.2μs     0.82  index_object.Indexing.time_get_loc_non_unique_sorted('Float')
-      7.42±0.5ms      5.69±0.07ms     0.77  index_object.SetDisjoint.time_datetime_difference_disjoint
-        16.3±1ms      12.5±0.05ms     0.77  index_object.SetOperations.time_operation('date_string', 'union')
-        4.98±1ms       3.65±0.2ms     0.73  index_object.Indexing.time_get_loc_non_unique('Float')
-        303±20ms       3.86±0.9μs     0.00  index_object.IndexEquals.time_non_object_equals_multiindex

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

Contributor

@jreback jreback left a comment

lgtm @jbrockmendel if any comments

    and type(other) is not type(self)
    and other.equals is not self.equals
):
if is_object_dtype(self.dtype) and not is_object_dtype(other.dtype):
Contributor

@jbrockmendel

let's make this an elif & put a big note that these are perf-sensitive checks (follow-on)

    and type(other) is not type(self)
    and other.equals is not self.equals
):
if is_object_dtype(self.dtype) and not is_object_dtype(other.dtype):
Contributor

note that this also hits the MultiIndex path
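
The review comments above are about the order of checks in `Index.equals`. The following is a toy sketch in plain Python, not the pandas implementation: `ToyIndex`, `ToyMultiIndex`, the string dtypes, and the `materialized` counter are all hypothetical stand-ins. It illustrates one way a cheap dtype check placed ahead of an expensive comparison can avoid building a large array of tuples, which is the kind of difference the `time_non_object_equals_multiindex` benchmark suggests.

```python
import itertools


class ToyIndex:
    """Hypothetical stand-in for a flat index; not pandas.Index."""

    def __init__(self, values, dtype):
        self._values = list(values)
        self.dtype = dtype

    def equals(self, other):
        if self is other:
            return True
        if not isinstance(other, ToyIndex):
            return False
        # Perf-sensitive checks: keep them cheap and ordered.
        if self.dtype == "object" and other.dtype != "object":
            # Let the non-object index apply its own comparison rules.
            return other.equals(self)
        elif isinstance(other, ToyMultiIndex):
            # A multi-level index can equal a flat index of tuples.
            return other.equals(self)
        return self._values == other._values


class ToyMultiIndex(ToyIndex):
    """Hypothetical multi-level index whose flat values are costly to build."""

    def __init__(self, levels):
        self.levels = [list(level) for level in levels]
        self.dtype = "object"
        self.materialized = 0  # counts expensive tuple builds

    @property
    def _values(self):
        self.materialized += 1
        return list(itertools.product(*self.levels))

    def equals(self, other):
        if self is other:
            return True
        if not isinstance(other, ToyIndex):
            return False
        if not isinstance(other, ToyMultiIndex):
            # Cheap check: tuples can only match an object-dtype index,
            # so skip building the tuples for any other dtype.
            if other.dtype != "object":
                return False
            return self._values == other._values
        return self.levels == other.levels


flat = ToyIndex(range(1), "int64")
big = ToyMultiIndex([range(100_000), ["a"]])
print(flat.equals(big), big.materialized)  # False 0

tuples = ToyIndex([(0, "a"), (1, "a")], "object")
small = ToyMultiIndex([[0, 1], ["a"]])
print(tuples.equals(small), small.materialized)  # True 1
```

In this toy model the int64 comparison returns without ever materializing the 100,000 tuples, while the object-dtype comparison still pays that cost once.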

@jbrockmendel
Member

LGTM

@simonjayhawkins
Member Author

@jreback OK to merge this as a straight revert for backport and then follow up on master?

@jreback jreback merged commit 21b57fa into pandas-dev:master Dec 18, 2020
@simonjayhawkins simonjayhawkins deleted the revert-35330 branch December 18, 2020 19:05
@simonjayhawkins
Member Author

@meeseeksdev backport 1.2.x

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Dec 18, 2020
simonjayhawkins added a commit that referenced this pull request Dec 18, 2020
…als (always dispatchto subclass) (#35330)" (#38565)

Co-authored-by: Simon Hawkins <[email protected]>
luckyvs1 pushed a commit to luckyvs1/pandas that referenced this pull request Jan 20, 2021
Labels
Performance (memory or execution speed performance), Regression (functionality that used to work in a prior pandas version)
3 participants