BUG: (GH4145/4146) Fixed bugs in multi-index selection with column multi index duplicates #4148

Merged
merged 1 commit into from
Jul 6, 2013

Conversation

jreback
Contributor

@jreback jreback commented Jul 6, 2013

closes #4145, #4146

In [1]: data = """h1 main  h3 sub  h5
   ...: 0  a    A   1  A1   1
   ...: 1  b    B   2  B1   2
   ...: 2  c    B   3  A1   3
   ...: 3  d    A   4  B2   4
   ...: 4  e    A   5  B2   5
   ...: 5  f    B   6  A2   6"""

In [2]: df = pd.read_csv(StringIO(data),sep=r'\s+',index_col=0)

In [3]: df2 = df.set_index(['main', 'sub']).T.sort_index(axis=1)

In [4]: df2.loc[:,('A','A1')]
Out[4]: 
main  A
sub  A1
h1    a
h3    1
h5    1

In [5]: df2[('A','A1')]
Out[5]: 
main  A
sub  A1
h1    a
h3    1
h5    1

In [6]: df2['A']['A1']
Out[6]: 
   A1
h1  a
h3  1
h5  1

This should NOT work, as it is trying to select 2 different columns ('A' is ok, but 'A1' fails, which is correct)

In [7]: df2[['A','A1']]
KeyError: "['A1'] not in index"

This is ok though, since the list holds a full ('A','A1') tuple

In [8]: df2[[('A','A1')]]
Out[8]: 
main  A
sub  A1
h1    a
h3    1
h5    1
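
For readers who want to try these selections without the CSV step, here is a minimal sketch. It assumes a recent pandas and builds `df2` directly (the construction and the variable names are illustrative, not taken from the PR). Whether a full-key selection comes back as a Series or as a one-column DataFrame may differ from the 2013 output above, so the sketch does not rely on it.

```python
import pandas as pd

# Same labels as df2 above, already sorted along the columns;
# ("A", "B2") appears twice, which is the duplicate case from GH4145/4146.
columns = pd.MultiIndex.from_tuples(
    [("A", "A1"), ("A", "B2"), ("A", "B2"),
     ("B", "A1"), ("B", "A2"), ("B", "B1")],
    names=["main", "sub"],
)
df2 = pd.DataFrame(
    [["a", "d", "e", "c", "f", "b"],
     [1, 4, 5, 3, 6, 2],
     [1, 4, 5, 3, 6, 2]],
    index=["h1", "h3", "h5"],
    columns=columns,
)

# Full key: one (main, sub) pair, via .loc and via plain [].
by_loc = df2.loc[:, ("A", "A1")]
by_getitem = df2[("A", "A1")]

# Chained selection: top level first, then the second level.
chained_unique = df2["A"]["A1"]
chained_dup = df2["A"]["B2"]  # duplicated label, so both columns come back

# A list of tuples selects full keys and keeps both column levels.
by_tuple_list = df2[[("A", "A1")]]

# A flat list is matched against the top level only, so "A1" is missing.
flat_list_error = None
try:
    df2[["A", "A1"]]
except KeyError as err:
    flat_list_error = err

print(chained_dup)
print(repr(flat_list_error))
```

The last case is the one the PR description calls out: `'A'` alone would be found in the top level, but `'A1'` only exists in the second level, so the lookup raises.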

@jreback
Contributor Author

jreback commented Jul 6, 2013

@hayd cc @jtratner

tell me if you think this covers it....

@hayd
Contributor

hayd commented Jul 6, 2013

Looks good to me. Is df2['A']['B2'] behaviour already tested elsewhere?

@jreback
Contributor Author

jreback commented Jul 6, 2013

here are the cases

2 level index
unique - unique
unique - non_unique

select top level
select top level and 2nd level unique
select top level and 2nd level non_unique

maybe I should put in a comprehensive test
pretty sure these are all tested already - maybe move them all to a single test method?
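
A single check along these lines might look like the sketch below. It is not the test that was merged; `make_frame` and `check_selection` are hypothetical helpers, and reading "unique - non_unique" as "a second-level label duplicated under one top-level value" is an assumption. The check only compares how many columns come back, since the return type (Series versus DataFrame) can vary by pandas version.

```python
import pandas as pd


def make_frame(sub_labels):
    """Build a 3-row frame with columns ("A", sub) for each sub label,
    plus one ("B", "X") column so the top level has two values."""
    tuples = [("A", s) for s in sub_labels] + [("B", "X")]
    columns = pd.MultiIndex.from_tuples(tuples, names=["main", "sub"])
    data = [[row * 10 + col for col in range(len(tuples))] for row in range(3)]
    return pd.DataFrame(data, columns=columns)


def check_selection(df, sub, expected_width):
    # Select the top level: every column under "A", keyed by the second level.
    top = df["A"]
    assert top.shape[1] == len(df.columns) - 1
    assert top.columns.name == "sub"

    # Select the top level and a second-level label, in three spellings.
    for picked in (df[("A", sub)], df.loc[:, ("A", sub)], df["A"][sub]):
        width = 1 if picked.ndim == 1 else picked.shape[1]
        assert width == expected_width


# unique - unique
check_selection(make_frame(["A1", "B2"]), "A1", expected_width=1)

# unique - non_unique, picking a label that is not duplicated
dup = make_frame(["A1", "B2", "B2"])
check_selection(dup, "A1", expected_width=1)

# unique - non_unique, picking the duplicated label
check_selection(dup, "B2", expected_width=2)
```

Keeping the three spellings in one loop makes it harder for `[]`, `.loc`, and chained selection to drift apart on the duplicate case.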

jreback added a commit that referenced this pull request Jul 6, 2013
BUG: (GH4145/4146) Fixed bugs in multi-index selection with column multi index duplicates
@jreback jreback merged commit 565ee0c into pandas-dev:master Jul 6, 2013
Development

Successfully merging this pull request may close these issues.

DataFrame MultiIndex column access (and pop)