Fixed reindexing arith with duplicates #35303
cc @jbrockmendel.
yep this is ok
Having trouble with ASV right now. Things seem OK in this small benchmark though:

In [15]: df1 = pd.DataFrame(np.random.randn(1000, 500))
In [16]: df2 = pd.DataFrame(np.random.randn(1000, 501))
In [17]: %timeit df1 + df2
# master
3.8 ms ± 115 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# PR
3.77 ms ± 104 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
# https://github.com/pandas-dev/pandas/issues/35194
indexer, _ = result.columns.get_indexer_non_unique(join_columns)
indexer = algorithms.unique1d(indexer)
result = result._reindex_with_indexers(
is there a public method that could be used here?
Not that I know of.
we do this type of thing in indexing.py, so should try to refactor later
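For readers following the thread, the two indexer lines in the snippet under review can be sketched with public pandas API only. This is an illustration, not the PR's code: the column labels are hypothetical, and `pd.unique` stands in for the internal `algorithms.unique1d` on the assumption that both keep the first occurrence of each value in order.

```python
import pandas as pd

# Hypothetical labels: the result of the op has a duplicated column "A",
# and the outer-joined target columns add a label "C" that is absent.
result_columns = pd.Index(["A", "A", "B"])
join_columns = pd.Index(["A", "A", "B", "C"])

# For every target label, get_indexer_non_unique returns *all* matching
# positions in result_columns (and -1 for a label that is not present).
# Because "A" is duplicated on both sides, its positions are repeated.
indexer, missing = result_columns.get_indexer_non_unique(join_columns)

# Dropping the repeats leaves each source position once, in order of
# first appearance (pd.unique is used here as a public stand-in).
deduped = pd.unique(indexer)

print(indexer)   # positions, with repeats for the duplicated label
print(missing)   # positions in join_columns that had no match
print(deduped)   # one entry per target column
```

After deduplication the indexer has one entry per label in `join_columns`, which appears to be what lets the reindex step run without tripping over the duplicate axis.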
LGTM
thanks @TomAugspurger
Closes #35194
Still need to run ASV on this. This was a regression from 0.25.x to 1.0. It doesn't have to go in 1.1, but probably better for 1.1.0.rc0 than a 1.1.1 release.