fixed header=list (to create a MultiIndex) for read_excel #9637

Closed
wants to merge 7 commits into from

Conversation

gdementen
Contributor

closes #4679

This is a very simple fix to allow passing a list to the header argument of read_excel so that it creates a MultiIndex for the columns. Given what is said in issue #4679, I am not sure this was ever supposed to work, but with this fix it works for my files. One thing discussed in #4679 that I did not address (and that therefore probably still does not work) is merged header cells. In my opinion that should be a separate issue, as it is already useful to be able to read files where the header labels are duplicated (the files I receive are usually like this). If this PR is accepted (at least in spirit), the docstring for read_excel should probably be updated to reflect the "new" capability, and the changelog updated as well, but I wanted to check first what you think.

As a side note, I am not entirely sure that _trim_excel_header is a good thing to do anyway, but that is another debate entirely.
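To make the distinction between duplicated and merged header cells concrete, here is a minimal sketch in plain Python. It is not the code from this PR: `combine_header_rows` and its `fill_merged` flag are hypothetical names used only for illustration, and the sketch assumes that a merged cell shows up as an empty value in every column after the first one it spans.

```python
def combine_header_rows(rows, fill_merged=False):
    """Hypothetical helper: build one tuple key per column from header rows."""
    cleaned = []
    for row in rows:
        out, last = [], None
        for cell in row:
            if cell in (None, "") and fill_merged:
                # Assumption: an empty cell continues a merged cell to its left.
                cell = last
            out.append(cell)
            last = cell
        cleaned.append(out)
    # Transpose so each column gets a (level0, level1, ...) tuple.
    return list(zip(*cleaned))


# Header labels repeated in every column (the case described above).
duplicated = [["a", "a", "b"], ["x", "y", "x"]]
# Same header, but "a" is a merged cell spanning two columns.
merged = [["a", None, "b"], ["x", "y", "x"]]

dup_keys = combine_header_rows(duplicated)
merged_raw = combine_header_rows(merged)
merged_filled = combine_header_rows(merged, fill_merged=True)
```

With duplicated labels the tuples are usable as they are. With merged cells, the second column loses its top-level label unless the empty cells are filled from the left, which is the part this PR leaves out.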

@jreback
Contributor

jreback commented Mar 13, 2015

needs tests

@jreback jreback added the IO Excel read_excel, to_excel label Mar 13, 2015
@jreback jreback added this to the Next Major Release milestone Mar 13, 2015
@jreback
Contributor

jreback commented May 9, 2015

@gdementen can you rebase this? What are all of the changes in index.pyx for?

@gdementen
Contributor Author

First, sorry for letting this rot for so long.

I have since realized that, for my use cases, the code this PR changes (_trim_excel_header) should be removed altogether rather than changed. I have postponed proposing that because the code was probably put there for a reason, so removing it will break backward compatibility for someone somewhere, even though removing it seems like the more sensible way to do things (it makes the behavior the same as for CSV files). If I knew what use case this code is meant to handle, I would probably add an option for it, but I don't.
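For reference, the CSV behavior mentioned above can be sketched as follows. This is only an illustration with made-up inline data, not code from this PR; it relies on `pandas.read_csv` accepting a list for `header` and building MultiIndex columns from those rows.

```python
import io

import pandas as pd

# Two header rows with duplicated top-level labels, then two data rows.
csv_text = "a,a,b\nx,y,x\n1,2,3\n4,5,6\n"

# header=[0, 1] asks read_csv to use both rows as column levels.
df = pd.read_csv(io.StringIO(csv_text), header=[0, 1])

# Columns are addressed by (level0, level1) tuples.
second_column = df[("a", "y")].tolist()
```

The goal discussed in this PR is for read_excel to accept the same `header=[0, 1]` style of argument and produce equivalent columns.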

The hashtable.pyx changes are completely unrelated. I simply forgot to create a branch for my changes before sending the pull request... I will close this PR and create two separate ones.

@gdementen gdementen closed this May 12, 2015
Labels
IO Excel read_excel, to_excel MultiIndex
Development

Successfully merging this pull request may close these issues.

ENH: Excel - allow for multiple rows to be treated as hierarchical columns
2 participants