BUG: incorrect set_labels in MI #19058

Mofef · 2018-01-03T16:57:40Z

This fixes #19057

codecov · 2018-01-03T18:14:57Z

Codecov Report

Merging #19058 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #19058      +/-   ##
==========================================
+ Coverage   91.53%   91.53%   +<.01%     
==========================================
  Files         148      148              
  Lines       48688    48689       +1     
==========================================
+ Hits        44566    44567       +1     
  Misses       4122     4122

| Flag | Coverage Δ | |
|---|---|---|
| #multiple | `89.9% <100%> (ø)` | ⬆️ |
| #single | `41.62% <0%> (-0.01%)` | ⬇️ |

| Impacted Files | Coverage Δ | |
|---|---|---|
| pandas/core/indexes/multi.py | `96.22% <100%> (ø)` | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

jreback

this looks ok, but pls add your original test (both with and w/o inplace) as well as a note in the whatsnew

jreback · 2018-01-04T00:04:21Z

cc @toobaz

any thoughts here

toobaz · 2018-01-04T11:12:15Z

Looks good to me. As for tests, I guess it's just a matter of understanding what went wrong in this one and fixing it.

Mofef · 2018-01-04T11:26:51Z

you probably mean

pandas/pandas/tests/indexes/test_multi.py

Line 287 in 9303315

assert_matching(self.index.labels, labels)

set_levels works correctly iirc.
The issue with the test is that every level of self.index has the same magnitude of categories (i.e. the length of each level is smaller than int8_max), so the misalignment has no effect. There is also no misalignment if the labels are replaced in the first n levels.
I would like to take care of the tests later.
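
To make the misalignment described above concrete, here is a simplified model in plain Python. It is only a sketch under stated assumptions: `pick_width`, `level_sizes`, and `target_levels` are hypothetical names, and the width thresholds are an assumption about how label dtypes are sized, not a copy of the pandas source.

```python
# Hypothetical, simplified model of the misalignment; not the pandas source.

def pick_width(n_categories):
    """Return the smallest signed integer width (in bits) assumed to fit
    label values for a level with n_categories entries."""
    for bits in (8, 16, 32):
        if n_categories < 2 ** (bits - 1) - 1:
            return bits
    return 64

level_sizes = [1, 130]   # level 0 has 1 category, level 1 has 130
target_levels = [1]      # only the labels of level 1 are replaced

# Misaligned pairing: zip walks level_sizes from the start, so the new
# labels for level 1 are sized as if they belonged to level 0.
misaligned = {lvl: pick_width(size)
              for lvl, size in zip(target_levels, level_sizes)}

# Aligned pairing: look up the level size by its level number.
aligned = {lvl: pick_width(level_sizes[lvl]) for lvl in target_levels}

print(misaligned, aligned)
```

With equally sized levels, or when the replaced levels are the first n, both pairings pick the same width, which is consistent with the existing test not catching the problem.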

toobaz · 2018-01-04T12:39:12Z

> you probably mean

yep, sorry, actually this assertion:

pandas/pandas/tests/indexes/test_multi.py

Line 302 in 9303315

assert_matching(self.index.labels, labels)

But I agree with your explanation.

jreback · 2018-01-04T15:26:36Z

@Mofef tests need to be in the same PR as the fix.

Mofef · 2018-01-04T17:23:34Z

@jreback Sorry, I don't understand. Do you mean I should also add tests to this PR?

In the meantime I looked into test_multi.py. My initial idea was to just increase the size of the minor level with something like

    def setup_method(self, method):
        major_axis = Index(['foo', 'bar', 'baz', 'qux'])
        minor_axis = Index(range(2, 128))

        major_labels = np.random.choice(range(4), 128)
        minor_labels = np.array(range(128))
...

However, that doesn't seem to be feasible, since a lot of test methods rely on the current "reference index", so I get ~50 test failures that are presumably spurious.
Another option would be to create a bigger MultiIndex down in test_set_labels. On the other hand, it's probably a good idea to check each operation against such an index with levels of different dtypes.
What do you think?

jreback · 2018-01-04T17:27:55Z

@Mofef yes you need a test, you can use the one that you put in the issue. you need to assert that this works (eg. construct the expected MI).

Mofef · 2018-01-05T10:46:10Z

Done. I had to define another test method with its own assert_matching function, since inside test_set_labels the expected labels are cast to int8. IMHO this should happen outside of a function called "assert_matching".

I also stumbled upon #19092

Edit: I forgot about the whatsnew...

jreback · 2018-01-05T13:45:50Z

pandas/tests/indexes/test_multi.py

+
+        ind = pd.MultiIndex.from_tuples([(0, i) for i in range(130)])
+        new_labels = range(129, -1, -1)
+        ind_reference = pd.MultiIndex.from_tuples(


call this expected

jreback · 2018-01-05T13:47:15Z

pandas/tests/indexes/test_multi.py

+
+        # [w/o  mutation]
+        ind2 = ind.set_labels(labels=new_labels, level=1)
+        assert_equal(ind2, ind_reference)


call these result, then you can simply do

assert result.equals(expected)

for the inplace case you need to copy before
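
A minimal sketch of the suggested `result` / `expected` pattern, assuming pandas is installed. Two things here are assumptions rather than part of the PR: the way `expected` is built, and the `setter` lookup, which is only there because `set_labels` was renamed to `set_codes` in later pandas releases. Only the non-inplace case is shown, since recent pandas releases no longer accept `inplace` for this method, as far as I know.

```python
import pandas as pd

# Level 0 has a single category, level 1 has 130, so the two levels
# need label arrays of different integer widths.
ind = pd.MultiIndex.from_tuples([(0, i) for i in range(130)])
new_labels = list(range(129, -1, -1))

# Assumed construction of the expected index: build it directly from
# the tuples that the relabelled index should contain.
expected = pd.MultiIndex.from_tuples([(0, i) for i in new_labels])

# set_labels is called set_codes in later pandas; pick whichever exists.
setter = ind.set_codes if hasattr(ind, "set_codes") else ind.set_labels

result = setter(new_labels, level=1)
assert result.equals(expected)
```

For a pandas version that still supports `inplace=True`, the same check would be run on a copy of `ind` so the original index stays available for other assertions.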

jreback · 2018-01-05T13:47:29Z

pandas/tests/indexes/test_multi.py

@@ -327,6 +327,28 @@ def assert_matching(actual, expected):
        assert_matching(ind2.labels, new_labels)
        assert_matching(self.index.labels, labels)

+    def test_set_label_distinctly_sized_levels(self):
+        # label changing for levels of different magnitude of categories
+        def assert_equal(actual, expected):


see below this is not necessary

Mofef · 2018-01-05T14:16:12Z

Cool, thanks. PTAL

jreback · 2018-01-05T14:21:11Z

looks good. ping on green.

Mofef · 2018-01-05T16:08:45Z

@jreback https://travis-ci.org/pandas-dev/pandas/jobs/325457802#L448

648.46s$ git fetch origin +refs/pull/19058/merge:
fatal: unable to access 'https://github.com/pandas-dev/pandas.git/': Failed to connect to github.com port 443: Connection timed out

TomAugspurger · 2018-01-05T22:22:39Z

Sorry about the extra commits @Mofef. I fixed a merge conflict in the release notes and the CI services apparently had an issue with that. They're running again now.

Mofef · 2018-01-05T22:35:58Z

Oh ok no problem. thank you :)

jreback · 2018-01-06T17:25:28Z

thanks @Mofef

Fix pandas-dev#19057

ee3204c

jreback requested changes Jan 4, 2018

jreback changed the title ~~Fix #19057~~ BUG: incorrect set_labels in MI Jan 4, 2018

jreback added the Bug, Dtype Conversions (Unexpected or buggy dtype conversions), and MultiIndex labels Jan 4, 2018

Test set_labels for distinctly sized index levels

caf1a65

Add whatsnew entry

3c79f35

jreback requested changes Jan 5, 2018

Simplify added test code

2ad4fa9

jreback added this to the 0.23.0 milestone Jan 5, 2018

jreback approved these changes Jan 5, 2018

TomAugspurger and others added 2 commits January 5, 2018 16:20

Merge branch 'master' into patch-1

14fd575

CI: Trigger CI with empty commit

48fb843

jreback merged commit e6ea00c into pandas-dev:master Jan 6, 2018

jreback mentioned this pull request Jan 6, 2018

CLN: remove need for assert_matching in pandas/tests/indexes/test_multi.py #19105

Open
