
New Excel functionality #2478


Closed
ghost opened this issue Dec 10, 2012 · 8 comments
Labels
Enhancement; Ideas (Long-Term Enhancement Discussions); Indexing (Related to indexing on series/frames, not to indexes themselves); IO Excel (read_excel, to_excel)
@ghost

ghost commented Dec 10, 2012

The initial implementation was #2370 by @locojay, which was very nice but suffered from
several corner cases and ambiguity when the files were read back in.
f0aa065 was a stopgap measure to roll back the functionality so 0.10 was not delayed.
The enhancements should be brought back in 0.11, extended with parsing of a
MultiIndex on both axes, support for index names everywhere, decisions on best
defaults, arguments and so on.

@locojay, are you still interested in completing this after a bit more discussion?

@jassinm

jassinm commented Dec 11, 2012

Sure, I will look into it.

The ExcelReader will need to be decoupled from the TextReader....

If we agree that everything that gets dumped should be read back in the same fashion in a round trip, we are good to go.

For example, dumping a DataFrame with MultiIndex columns will result in merged cells, probably on multiple levels. These merged cells must be present for the DataFrame to be read back.
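
To illustrate the merged-cell point, here is a minimal sketch of how a reader could recover multi-level column labels from a merged header row. It assumes the Excel reader reports only the first cell of a merged range as filled and the remaining cells as empty; `expand_merged_header` is a hypothetical helper, not part of pandas or of the PR discussed here.

```python
def expand_merged_header(row):
    """Forward-fill empty cells so every column under a merged label gets it.

    Assumption: non-anchor cells of a merged range arrive as None or "".
    """
    filled = []
    last = None
    for cell in row:
        if cell is None or str(cell).strip() == "":
            # Empty cell: reuse the label of the merged cell to its left.
            filled.append(last)
        else:
            last = cell
            filled.append(cell)
    return filled


# Two header rows as they might come back from a sheet with merged top-level labels.
top = ["A", None, "B", None]
bottom = ["x", "y", "x", "y"]

# Pair the levels column by column to get MultiIndex-style tuples.
columns = list(zip(expand_merged_header(top), bottom))
```

The resulting tuples could then be handed to something like `pandas.MultiIndex.from_tuples`. Without the merged cells (or an equivalent convention), the reader cannot tell which empty cells belong to which top-level label.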

@ghost
Author

ghost commented Dec 11, 2012

@locojay, agreed that as much as possible should be recreated on read: names, labels, index structure.

Rather than relying on style information, I suggest using the following convention, which should
allow automagically inferring what information is present in the file:

Example 1
|   | x |   | x | x |
|   | x |   | x | x |
| x | x |   | x | x |
|   |   |   |   |   |
| x | x |   | x | x |
| x | x |   | x | x |
| x | x |   | x | x |

Example 2
|   |   |   | x | x |
|   |   |   | x | x |
| x | x |   | x | x |
|   |   |   |   |   |
| x | x |   | x | x |
| x | x |   | x | x |
| x | x |   | x | x |

Example 3
|   |   |   | x | x |
|   |   |   |   |   |
| x | x |   | x | x |
| x | x |   | x | x |
| x | x |   | x | x |

Example 4
| x |   | x | x |
|   |   |   |   |
| x |   | x | x |
| x |   | x | x |
| x |   | x | x |

Example 7
| x |   | x | x |
| x |   | x | x |
| x |   | x | x |

Example 8
| x | x |
|   |   |
| x | x |
| x | x |
| x | x |

Example 9
| x |   |   |   |
|   |   |   |   |
| x |   | x | x |
| x |   | x | x |
| x |   | x | x |

Example 10
| x |   | x | x |
|   |   |   |   |
|   |   | x | x |
|   |   | x | x |
|   |   | x | x |

Example 11
| x | x | x |
| x | x | x |
| x | x | x |

Example 12
| x | x |

Example 13
| x |
| x |

Example 14
| x |
| x |


If this works, you (the parser) should be able to figure out the right thing to do in all cases, can you?
Examples 9 and 10 are pretty ugly, and maybe we should just drop the name in those cases.
Examples 12, 13 and 14 are ambiguous, so we need to decide on sane defaults.

This is pretty close to what you've already done, and also uses @changhiskhan's idea of
sniffing things out. Most of the work would be in the parser like you mentioned.
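
As a rough sketch of the sniffing idea, the layout could be inferred from the first fully blank row and the first fully blank column in the grid. This is one possible reading of the examples above (blank row separates header rows from data, blank column separates index columns from data), not the parser that was actually implemented; `sniff_layout` and its helpers are hypothetical names.

```python
def is_blank(cell):
    """Treat None and whitespace-only strings as empty cells."""
    return cell is None or str(cell).strip() == ""


def first_blank_row(grid):
    """Index of the first row whose cells are all blank, or None."""
    for i, row in enumerate(grid):
        if all(is_blank(c) for c in row):
            return i
    return None


def first_blank_col(grid):
    """Index of the first column that is blank in every row, or None."""
    if not grid:
        return None
    width = max(len(row) for row in grid)
    for j in range(width):
        if all(j >= len(row) or is_blank(row[j]) for row in grid):
            return j
    return None


def sniff_layout(grid):
    """Guess how many leading rows are header and how many leading columns are index.

    Assumption: a missing separator means that part is absent (count 0).
    """
    r = first_blank_row(grid)
    c = first_blank_col(grid)
    return {
        "header_rows": r if r is not None else 0,
        "index_cols": c if c is not None else 0,
    }


# Example 1 from above: three header rows, two index columns.
example_1 = [
    ["", "x", "", "x", "x"],
    ["", "x", "", "x", "x"],
    ["x", "x", "", "x", "x"],
    ["", "", "", "", ""],
    ["x", "x", "", "x", "x"],
    ["x", "x", "", "x", "x"],
    ["x", "x", "", "x", "x"],
]
layout = sniff_layout(example_1)
```

A sketch like this says nothing about the ambiguous cases (Examples 12 to 14), where no separator is present and a default has to be chosen.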

Once the parser can handle all parts being present or missing, it's fine to add
args to control [row,col]-index on/off and names on/off.

Should integer ("default") indexes be dumped to the file? no? optional?

I would like merged cells to be optional (default on is fine by me), in case
they expose some unforeseen issues in people's Excel workflows (exporting to CSV, VBA, etc.).

There's a lot of work in making this work consistently.... diminishing returns?

@ghost
Author

ghost commented Dec 12, 2012

related #2088

@ghost
Author

ghost commented Dec 24, 2012

@locojay, are you claiming this or shall I slog through this?

@jassinm

jassinm commented Dec 30, 2012

@y-p: I am not claiming it. If you have time to slog through it, that would be great, as I am busy at the moment and do not use the Excel reading at all.

@jreback
Contributor

jreback commented Dec 18, 2013

@jtratner can this be closed? It seems to be covered by various other issues.

@jtratner
Contributor

This isn't completely closed at this point: we don't have a way to put multiple DataFrames in the same sheet. We have covered most of the MI round-tripping, I believe.

@ghost
Author

ghost commented Jan 10, 2014

I'm comfortable closing this. The MI header expansion stuff has been added since, as @jtratner
noted. The round-trip issue is nice to have on general principle (and for testing); that's
why I hamstrung that part of the changes in the original PR way back.

Closing, since it's stale. If a specific feature request is in order, open a fresh one.

@ghost ghost closed this as completed Jan 10, 2014