New Excel functionality #2478

ghost · 2012-12-10T18:54:38Z

The initial implementation was #2370 by @locojay, which was very nice but suffered from
several corner cases and ambiguity when the files were read back in.
f0aa065 was a stopgap measure to roll back the functionality so 0.10 was not delayed.
The enhancements should be brought back in 0.11, enhanced with parsing of
MultiIndex on both axes, support for index names everywhere, decisions on the best
defaults, arguments and so on.

@locojay, are you still interested in completing this after a bit more discussion?

jassinm · 2012-12-11T00:28:00Z

Sure, I will look into it.

The ExcelReader will need to be decoupled from the TextReader...

If we agree that everything that gets dumped should be read back in the same fashion in a round trip, we are good to go.

For example, dumping a DataFrame with MultiIndex columns will result in merged cells, probably on multiple levels. These merged cells must be present for the DataFrame to be read back.
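
To make the merged-cell point concrete, here is a minimal pure-Python sketch of rebuilding column tuples from header rows. It assumes the reader hands back a merged cell as its value in the left-most position followed by `None` for the cells it spans; that representation, and the helper name `expand_merged_headers`, are illustrative assumptions and not part of the ExcelReader.

```python
def expand_merged_headers(header_rows):
    """Rebuild one tuple per column from header rows containing merged cells.

    Assumption: a merged cell shows up as its value once, followed by None
    for every further column it spans, so None always means "same label as
    the cell to the left".
    """
    filled = []
    for row in header_rows:
        last = None
        out = []
        for cell in row:
            if cell is not None:
                last = cell      # a new label starts here
            out.append(last)     # otherwise repeat the label to the left
        filled.append(out)
    # Transpose: one tuple per column, one entry per header level.
    return list(zip(*filled))


header = [
    ["a", None, "b", None],  # top level, each label merged over two columns
    ["x", "y", "x", "y"],    # bottom level, no merging
]
print(expand_merged_headers(header))
# [('a', 'x'), ('a', 'y'), ('b', 'x'), ('b', 'y')]
```

If the merged cells were flattened away on write, the `None` gaps would be indistinguishable from genuinely empty labels, which is why they need to survive for the round trip.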

ghost · 2012-12-11T05:03:16Z

@locojay, agreed that as much as possible should be recreated on read: names, labels, index structure.

Rather than relying on style information, I suggest using the following convention, which should
allow automagically inferring what information is present in the file:

Example 1
|   | x |   | x | x |
|   | x |   | x | x |
| x | x |   | x | x |
|   |   |   |   |   |
| x | x |   | x | x |
| x | x |   | x | x |
| x | x |   | x | x |

Example 2
|   |   |   | x | x |
|   |   |   | x | x |
| x | x |   | x | x |
|   |   |   |   |   |
| x | x |   | x | x |
| x | x |   | x | x |
| x | x |   | x | x |

Example 3
|   |   |   | x | x |
|   |   |   |   |   |
| x | x |   | x | x |
| x | x |   | x | x |
| x | x |   | x | x |

Example 4
| x |   | x | x |
|   |   |   |   |
| x |   | x | x |
| x |   | x | x |
| x |   | x | x |

Example 7
| x |   | x | x |
| x |   | x | x |
| x |   | x | x |

Example 8
| x | x |
|   |   |
| x | x |
| x | x |
| x | x |

Example 9
| x |   |   |   |
|   |   |   |   |
| x |   | x | x |
| x |   | x | x |
| x |   | x | x |

Example 10
| x |   | x | x |
|   |   |   |   |
|   |   | x | x |
|   |   | x | x |
|   |   | x | x |

Example 11
| x | x | x |
| x | x | x |
| x | x | x |

Example 12
| x | x |

Example 13
| x |
| x |

Example 14
| x |
| x |

If this works, you (playing the parser) should be able to figure out the right thing to do in all cases. Can you?
Examples 9 and 10 are pretty ugly, and maybe we should just drop the name in that case.
Examples 12, 13 and 14 are ambiguous, so we need to decide on sane defaults.

This is pretty close to what you've already done, and also uses @changhiskhan's idea of
sniffing things out. Most of the work would be in the parser, as you mentioned.
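
As a rough illustration of the sniffing, here is a pure-Python sketch based on one reading of the grids above: a fully blank row separates the header block from the body, and a fully blank column separates the index block from the data. The function name `sniff_layout` and the list-of-lists grid are hypothetical stand-ins, not existing parser code.

```python
def is_blank(cell):
    """Treat None and whitespace-only strings as empty cells."""
    return cell is None or (isinstance(cell, str) and cell.strip() == "")


def sniff_layout(grid):
    """Guess (n_header_rows, n_index_cols) from blank separator lines.

    Assumption: the first fully blank row ends the header block and the
    first fully blank column ends the index block. Each count is 0 when
    no separator is found.
    """
    if not grid:
        return 0, 0
    width = max(len(row) for row in grid)
    # Pad ragged rows so every row has the same number of cells.
    padded = [list(row) + [None] * (width - len(row)) for row in grid]

    n_header_rows = 0
    for i, row in enumerate(padded):
        if all(is_blank(c) for c in row):
            n_header_rows = i
            break

    n_index_cols = 0
    for j in range(width):
        if all(is_blank(row[j]) for row in padded):
            n_index_cols = j
            break
    return n_header_rows, n_index_cols


# Example 1 from above, with "x" for a filled cell and None for an empty one.
example_1 = [
    [None, "x", None, "x", "x"],
    [None, "x", None, "x", "x"],
    ["x", "x", None, "x", "x"],
    [None, None, None, None, None],
    ["x", "x", None, "x", "x"],
    ["x", "x", None, "x", "x"],
    ["x", "x", None, "x", "x"],
]
print(sniff_layout(example_1))  # (3, 2)
```

Grids without any separator, such as Examples 11 to 14, come back as `(0, 0)` under this reading, which is exactly where defaults have to be chosen.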

Once the parser can handle all parts being present or missing, it's fine to add
args to control [row,col]-index on/off and names on/off.

Should integer ("default") indexes be dumped to the file? no? optional?

I would like merged cells to be optional (default on is fine by me), in case
they expose some unforeseen issues in people's Excel workflows (exporting to CSV, VBA, etc.).

There's a lot of work in making this work consistently.... diminishing returns?

ghost · 2012-12-12T14:50:54Z

related #2088

ghost · 2012-12-24T16:46:11Z

@locojay, are you claiming this or shall I slog through this?

jassinm · 2012-12-30T19:40:30Z

@y-p: I am not claiming it. If you have time to slog through it, that would be great, as I am busy at the moment and do not use the Excel reading at all...

jreback · 2013-12-18T20:06:44Z

@jtratner can this be closed? It seems to be covered by various other issues.

jtratner · 2013-12-18T21:01:58Z

This isn't completely closed at this point: we don't have a way to put multiple DataFrames in the same sheet. We have covered most of the MI round-tripping, I believe.

ghost · 2014-01-10T12:13:03Z

I'm comfortable closing this. The MI header expansion has been added since, as @jtratner
noted. The round-trip issue is nice to have on general principle (and for testing), which is
why I hamstrung that part of the changes in the original PR way back.

Closing, since it's stale. If a specific feature request is in order, open a fresh one.

ghost mentioned this issue Dec 10, 2012

New Excel changes cause an extra line to be generated in the Excel file #2396

Closed

ghost mentioned this issue Mar 22, 2013

multiindex column in to_excel #2701

Closed

ghost mentioned this issue May 18, 2013

ENH: allow to_csv to write multi-index columns, read_csv to read with header=list arg #3575

Merged

ghost closed this as completed Jan 10, 2014
