-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Pass through Engine kwargs in ExcelWriter #43445
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor request below. Also some CI failures related to the docstrings.
I had 6 times in >>> with pd.ExcelWriter(
... "path_to_file.xlsx",
... engine="openpyxl",
... mode="a",
... engine_kwargs={"keep_vba": True}
... ) as writer:
... df.to_excel(writer) had that |
Thanks @joeperdefloep and apologies for the delay. Docs look good (addition of
|
@joeperdefloep can you merge master and fixup according to @rhshadrach comments |
@joeperdefloep @alimcmaster1 I am willing to finish this one up, if you guys don't have the time, since we need this in some of our projects. |
Hello everyone. Apologies for the delays That's funny... It passed on my machine.... This is the error:
I just changed the tests. Should be good now. ARM seems to be failing, but that is quite the black box to me... |
I now get a lot of the following errors:
seems to be related to this issue: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, some minor requests on the tests.
@pytest.mark.parametrize( | ||
"engine_kwargs", [{"data_only": True}, {"keep_vba": True}, {"keep_links": False}] | ||
) | ||
def test_engine_kwargs_append(ext, engine_kwargs): | ||
# GH 43445 | ||
# only read_only=True will give something easy that can be verified (maybe | ||
# keep_links or data_only could also work, but that'd be more complicated) | ||
# arguments are passed through to load_workbook() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think read_only would be sufficient. Need to verify the kwarg gets through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought testing for all available parameters would be nice, since then we will also be warned when API changes on openpyxl's side...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, read-only will not work here, and give different errors on windows than linux, this is why the previous test failed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding the test correctly, the passing of kwargs through to openpyxl could be entirely removed from pandas and this test would pass. Let me know if you're not sure of a way to improve that (or if my assessment is wrong).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, that's true. That's why I have the other test: test_engine_kwargs_append_invalid
that tests for an invalid keyword argument in load_workbook
. Also the other test test_engine_kwargs_append_data_only
tests for correct use of openpyxl's data_only
being passed through. That would make this test redundant indeed? It just felt logical to first test if passing an engine kwarg does not raise an error normally, then test if it does raise with an invalid argument and then test if the passing of engine kwargs works well together with openpyxl. Then in that order, people will know more precisely where the error lies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tell me if you want it removed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the summary, very helpful. Yes, I think this test can be removed as it's redundant with test_engine_kwargs_append_data_only
. In general, we'd rather not have to change tests if the API of a third party engine changes and it doesn't negatively impact pandas. It is okay to do so when necessary, as is in the data_only test.
@jreback friendly ping :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, can you resolve conflict.
thanks @joeperdefloep very nice! |
engine_kwargs
not passed through to other engines than xlsxwriter #42292 #42214 #43440 #43442engine_kwargs
#42292 #42214 #43440This is the revised version of #42214. In fact, in xlsxwriter, all kwargs are passed like so:
which then becomes internally:
so my initial fix was wrong in that sense, because I made
engine_kwargs
able to be passed in like this:and the resulted internals:
so #42214 was actually wrong (we want users to actually pass in kwargs.
However, there was still the issue of #43442, where the following engines do have some options that were silently ignored:
iso_dates
andwrite_only
style_compression
(not sure why anybody would want this, but people may)options
so I added tests for that in
engine_kwargs
.