-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: CategoricalIndex allows reindexing with non-unique CategoricalIndex #23963
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @qwhelan! Thanks for updating the PR.
Comment last updated on November 28, 2018 at 22:26 Hours UTC |
Codecov Report
@@ Coverage Diff @@
## master #23963 +/- ##
=========================================
Coverage ? 92.3%
=========================================
Files ? 161
Lines ? 51558
Branches ? 0
=========================================
Hits ? 47592
Misses ? 3966
Partials ? 0
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you a whatsnew entry, and replicate the df.round() (that should work now) in the OP
54ab47e
to
8366590
Compare
@jreback I stumbled across this bug while writing some As we're now raising on non-unique This is already done in |
8366590
to
eec207a
Compare
Additionally, this is the fixed output from #21809:
|
6450827
to
0087ccf
Compare
pandas/tests/frame/test_analytics.py
Outdated
idx = pd.CategoricalIndex(['low'] * 3 + ['hi'] * 3) | ||
dfb = pd.DataFrame(np.random.rand(6, 3), columns=list('abc'), | ||
index=idx) | ||
assert dfb.shape == (6, 3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you construct the result and use tm.assert_frame_equal here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a tm.assert_frame_equal()
check but keeping the explicit dims check as I don't want to inadvertently rely on CategoricalIndex.reindex
.
0087ccf
to
95337f9
Compare
@jreback Changes implemented, the one failing test is a flaky hypothesis failure in an unrelated file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cosmestic change in tests. ping when green.
95337f9
to
9550880
Compare
@jreback Green other than the unrelated flaky test |
thanks @qwhelan |
CategoricalIndex.reindex()
throws aValueError
if the target index is non-unique and not also aCategoricalIndex
. This means a non-uniqueCategoricalIndex
can be used to reindex, resulting in duplicated data:This PR removes the extraneous type check and adds tests checking that an exception is thrown for non-unique target indexes, regardless of type.
git diff upstream/master -u -- "*.py" | flake8 --diff