Skip to content

Support resetting index with tuple name #16165

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

toobaz
Copy link
Member

@toobaz toobaz commented Apr 28, 2017

Notice the first commit is pure refactoring, and the bug is actually fixed by the one-liner second commit.

@toobaz toobaz changed the title Reix col name Support resetting index with tuple name Apr 28, 2017
@codecov
Copy link

codecov bot commented Apr 28, 2017

Codecov Report

Merging #16165 into master will increase coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16165      +/-   ##
==========================================
+ Coverage   90.87%   90.88%   +0.01%     
==========================================
  Files         162      162              
  Lines       50816    50806      -10     
==========================================
- Hits        46178    46174       -4     
+ Misses       4638     4632       -6
Flag Coverage Δ
#multiple 88.66% <100%> (ø) ⬆️
#single 40.33% <68.42%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/frame.py 97.58% <100%> (-0.02%) ⬇️
pandas/core/generic.py 91.63% <0%> (+0.32%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 075eca1...e12bca1. Read the comment docs.

@codecov
Copy link

codecov bot commented Apr 28, 2017

Codecov Report

Merging #16165 into master will increase coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16165      +/-   ##
==========================================
+ Coverage   90.85%   90.86%   +0.01%     
==========================================
  Files         162      162              
  Lines       50826    50819       -7     
==========================================
- Hits        46177    46176       -1     
+ Misses       4649     4643       -6
Flag Coverage Δ
#multiple 88.64% <100%> (+0.01%) ⬆️
#single 40.33% <63.63%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/frame.py 97.58% <100%> (-0.01%) ⬇️
pandas/core/generic.py 91.63% <0%> (+0.32%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b6f65eb...9e1bdba. Read the comment docs.

@jreback jreback added this to the 0.20.0 milestone Apr 28, 2017
@jreback
Copy link
Contributor

jreback commented Apr 28, 2017

this looks good. can you add a test which where you have another level already (and check the resetting), e.g.

In [3]: df = pd.DataFrame([[1, 2]], columns=pd.MultiIndex.from_product([['A'], ['a', 'b']]))
   ...: 
   ...: reind = df.set_index([('A', 'a')], append=True)

it works, just want to cover the bases. otherwise lgtm.

@jorisvandenbossche
Copy link
Member

Can you check some more pathological cases?
Eg when the length of the tuple does not match the number of levels? Not sure what the result should be of that (informative error or insert the tuple in first level), though. But good to at least assert its behaviour.

@toobaz
Copy link
Member Author

toobaz commented Apr 28, 2017

Not sure what the result should be of that (informative error or insert the tuple in first level), though.

The current error is a quite informative "*** ValueError: Item must have length equal to number of levels.", and I think inserting tuple in first level would break the principle of least surprise.

However, it probably makes sense to use fill_value to fill keys which are shorter than the number of levels, rather than raising, so I added code making this happen.

@jreback
Copy link
Contributor

jreback commented Apr 28, 2017

thanks!

@toobaz toobaz deleted the reix_col_name branch April 28, 2017 21:39
@jorisvandenbossche
Copy link
Member

@toobaz Probably a not so important corner case, but could you also check how this interacts with the col_level argument?

pcluo pushed a commit to pcluo/pandas that referenced this pull request May 22, 2017
closes pandas-dev#16164

Author: Pietro Battiston <[email protected]>

Closes pandas-dev#16165 from toobaz/reix_col_name and squashes the following commits:

9e1bdba [Pietro Battiston] REF: reorganize reinsertion code
3b0bb1f [Pietro Battiston] ENH: Handle tuples shorter than nlevels gracefully
c958de7 [Pietro Battiston] TST: additional test for reset_index with tuple-named index level
e12bca1 [Pietro Battiston] ENH: allow tuple index names to be interpreted as full column keys
6315d07 [Pietro Battiston] REF: Avoid duplication in reset_index() when reinsering index columns
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

df.reset_index() fails if df.index.name is a tuple
3 participants