BUG: Exception raised for a duplicate MultiIndex level name #9399

Closed
seth-p opened this issue Feb 3, 2015 · 4 comments
Labels
Enhancement Error Reporting Incorrect or improved errors from pandas MultiIndex

Comments

@seth-p
Contributor

seth-p commented Feb 3, 2015

The following two tests seem inconsistent. Why does the first test for a KeyError while the second tests for a ValueError? Also, the first error message, Level foo not found, doesn't seem correct.

test_index.py:

    def test_duplicate_names(self):
        self.index.names = ['foo', 'foo']
        assertRaisesRegexp(KeyError, 'Level foo not found',
                           self.index._get_level_number, 'foo')

test_frame.py:

    def test_unstack_non_unique_index_names(self):
        idx = MultiIndex.from_tuples([('a', 'b'), ('c', 'd')],
                                     names=['c1', 'c1'])
        df = DataFrame([1, 2], index=idx)
        with tm.assertRaises(ValueError):
            df.unstack('c1')

        with tm.assertRaises(ValueError):
            df.T.stack('c1')
@shoyer
Member

shoyer commented Feb 3, 2015

I agree -- that first error looks very strange. It's particularly strange that the test is for a private method. We probably should be checking for duplicate names when index.names is set, and should be raising ValueError.

@jreback
Contributor

jreback commented Feb 4, 2015

Private methods should be explicitly tested, just like public API methods.

These tests cover two different things. In the first, the level name is invalid, so a KeyError makes sense. The second is for an invalid reshape operation (see below).

The issue here is this: a MultiIndex technically can have duplicate names, but in practice they are pretty much useless and tend to break or degrade other operations (e.g. stack/unstack), and they give you duplicate column names when the index is reset. So I think we should actually raise on construction to solve this entire problem.

So names can be:

  • None (in which case they default to the level numbers)
  • unique integers / names
  • a mix of None and integers/names (these usually arise from non-named reduction ops).
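
A minimal sketch of the kind of check proposed here, assuming that None may repeat (unnamed levels) while every other name must be unique. The helper `validate_level_names` is hypothetical and written for illustration only; it is not pandas code and does not reflect how any pandas version actually behaves.

```python
from collections import Counter


def validate_level_names(names):
    """Raise ValueError if any non-None level name is repeated.

    Hypothetical helper for illustration; not part of pandas.
    """
    # None marks an unnamed level, so it is allowed to appear more than once.
    counts = Counter(name for name in names if name is not None)
    duplicates = [name for name, n in counts.items() if n > 1]
    if duplicates:
        raise ValueError(f"Duplicate level names: {duplicates}")
    return list(names)


validate_level_names([None, None])     # accepted: all levels unnamed
validate_level_names(["c1", 0, None])  # accepted: mixed, non-None names unique

try:
    validate_level_names(["c1", "c1"])  # rejected: 'c1' is used twice
except ValueError as err:
    print(err)
```

Running a check like this when names are assigned or when the index is constructed would surface the problem as a ValueError up front, rather than as a KeyError from a later level lookup.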

@jreback jreback added API Design MultiIndex Error Reporting Incorrect or improved errors from pandas labels Feb 4, 2015
@seth-p
Contributor Author

seth-p commented Feb 4, 2015

Sounds good to me!

@jreback jreback added this to the Next Major Release milestone Aug 11, 2016
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@mroeschke
Member

Seems like there hasn't been much interest in this feature over the years, so closing.
