-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
CLN: Comparison methods for MultiIndex should have consistent behaviour for all nlevels (GH21149) #21195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Updating to 0.23.0
Update 18 May
24 May fork update
GH21149-1
GH21149-1
Removed the following test, which was causing builds to fail (?). This was working when tested on my command line (Mac OS Terminal) # Greater-than test: non-MultiIndex Index object vs MultiIndex object with tm.assert_raises_regex(TypeError, 'not supported'): midx > idx
Hello @KalyanGokhale! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on June 14, 2018 at 10:23 Hours UTC |
Codecov Report
@@ Coverage Diff @@
## master #21195 +/- ##
=========================================
Coverage ? 91.9%
=========================================
Files ? 153
Lines ? 49607
Branches ? 0
=========================================
Hits ? 45590
Misses ? 4017
Partials ? 0
Continue to review full report at Codecov.
|
doc/source/whatsnew/v0.23.1.txt
Outdated
@@ -78,7 +78,8 @@ Indexing | |||
|
|||
- Bug in :meth:`Series.reset_index` where appropriate error was not raised with an invalid level name (:issue:`20925`) | |||
- Bug in :func:`interval_range` when ``start``/``periods`` or ``end``/``periods`` are specified with float ``start`` or ``end`` (:issue:`21161`) | |||
- | |||
- Bug in comparison operations for :class:`MultiIndex` where error was raised on equality / inequality comparison involving a MultiIndex with self.nlevels == 1 (:issue:`21149`) | |||
- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
double backticks on MultiIndex
pandas/tests/indexes/test_multi.py
Outdated
(pd.MultiIndex.from_product([[0, 1]]), pd.Series(range(2)), 2)]) | ||
def test_multiindex_compare(self, midx, idx, count): | ||
# GH 21149 | ||
'''Ensure comparison operations for MultiIndex with nlevels == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use triple-double quotes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pandas/tests/indexes/test_multi.py
Outdated
''' | ||
expected = pd.Series([True]).repeat(count) | ||
expected.reset_index(drop=True, inplace=True) | ||
# Equality self-test: MultiIndex object vs self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blank line between cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pandas/tests/indexes/test_multi.py
Outdated
behave consistently with those for MultiIndex with nlevels > 1 | ||
''' | ||
expected = pd.Series([True]).repeat(count) | ||
expected.reset_index(drop=True, inplace=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't use inplace in tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback Thanks - here inplace
is being used to only create the expected outcome for testing. inplace
is not being used in the test assert_series_equal
OK to keep then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done - I have simplified the tests.
now only kept the required cases - as such have removed the parametrisation and also use of inplace
in creating the expected results
pandas/tests/indexes/test_multi.py
Outdated
result = pd.Series(midx == midx) | ||
tm.assert_series_equal(result, expected) | ||
# Equality self-test: non-MultiIndex Index object vs self | ||
result = (idx == idx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where did the idea for these test come from? what exactly are you trying to test here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests are trying to ensure the behaviour on comparison between MultiIndex vs MultiIndex
, MultiIndex vs Index
is consistent, irrespective of nlevels
. Also, ensuring that the comparison behaviour for Index vs Index
has not changed due to this PR.
Currently (as of 0.23.0), comparing MultiIndex
of nlevels==1
with another of same length raises a ValueError
e.g.
[In] midx=pd.MultiIndex.from_product([[0, 1]])
[In] midx
[Out] MultiIndex(levels=[[0, 1]],
labels=[[0, 1]])
[In] midx == midx
[Out] ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
whereas the behaviour should be consistent with that for MultiIndex
with nlevels>1
as follows:
[In] midx == midx
[Out] array([ True, True])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this test, probably should be OK to only keep the midx == midx
and midx > midx
checks
Good catch! |
GH21149-1a
Rebased and updated whatsnew v0.23.2
Any other work needed on this PR? Thanks |
thanks, if anything residual is remaining from #21149 let's open a new issue |
…ur for all nlevels (GH21149) (pandas-dev#21195)
…ur for all nlevels (GH21149) (pandas-dev#21195)
git diff upstream/master -u -- "*.py" | flake8 --diff
Initial PR #21182 is closed - and now split in 2 different PRs
def cmp_method
raised ValueError for equality / inequality comparisons of MultiIndex with nlevels == 1, which was inconsistent with behaviour for MultiIndex with nlevels > 1 (details of issue below) - this has now been fixedCurrently (as of 0.23.0), comparing MultiIndex of nlevels==1 with another of same length raises a ValueError e.g.
whereas the behaviour should be consistent with that for MultiIndex with nlevels>1 as follows: