Skip to content

ENH: Adding engine_kwargs to DataFrame.to_excel #53220

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 25 commits into from
May 20, 2023
Merged

ENH: Adding engine_kwargs to DataFrame.to_excel #53220

merged 25 commits into from
May 20, 2023

Conversation

rmhowe425
Copy link
Contributor

@rmhowe425 rmhowe425 commented May 14, 2023

@rmhowe425
Copy link
Contributor Author

rmhowe425 commented May 14, 2023

@rhshadrach Pinging for review since the CI tests aren't all green. Looks like one test failed, and the failure was not related to my enhancement or unit tests.

@@ -3785,6 +3785,15 @@ one can pass an :class:`~pandas.io.excel.ExcelWriter`.

.. _io.excel_writing_buffer:

When using the ``engine_kwargs`` parameter, pandas will pass these arguments to the
engine.For this, it is important to know which function pandas is using internally.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
engine.For this, it is important to know which function pandas is using internally.
engine. For this, it is important to know which function pandas is using internally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fixed with most recent push

@@ -83,7 +83,7 @@ Other enhancements
- :meth:`arrays.DatetimeArray.map`, :meth:`arrays.TimedeltaArray.map` and :meth:`arrays.PeriodArray.map` can now take a ``na_action`` argument (:issue:`51644`)
- :meth:`arrays.SparseArray.map` now supports ``na_action`` (:issue:`52096`).
- Add :meth:`diff()` and :meth:`round()` for :class:`Index` (:issue:`19708`)
- Add dtype of categories to ``repr`` information of :class:`CategoricalDtype` (:issue:`52179`)
- Add dtype of categories to ``repr`` information of :class:`CategoricalDtype` (:issue:`52179`)l
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Add dtype of categories to ``repr`` information of :class:`CategoricalDtype` (:issue:`52179`)l
- Add dtype of categories to ``repr`` information of :class:`CategoricalDtype` (:issue:`52179`)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fixed with most recent push

@@ -97,8 +97,10 @@ Other enhancements
- Let :meth:`DataFrame.to_feather` accept a non-default :class:`Index` and non-string column names (:issue:`51787`)
- Performance improvement in :func:`read_csv` (:issue:`52632`) with ``engine="c"``
- :meth:`Categorical.from_codes` has gotten a ``validate`` parameter (:issue:`50975`)
- Adding ``engine_kwargs`` parameter to :meth:`DataFrame.to_excel` (:issue:`53220`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Adding ``engine_kwargs`` parameter to :meth:`DataFrame.to_excel` (:issue:`53220`)
- Added ``engine_kwargs`` parameter to :meth:`DataFrame.to_excel` (:issue:`53220`)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fixed with most recent push

self._book = Workbook(self._handles.handle, **engine_kwargs)
try:
self._book = Workbook(self._handles.handle, **engine_kwargs)
except:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
except:
except TypeError:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be fixed with most recent push

@mroeschke mroeschke added the IO Excel read_excel, to_excel label May 15, 2023
@rhshadrach rhshadrach added Enhancement API - Consistency Internal Consistency of API/Behavior and removed API - Consistency Internal Consistency of API/Behavior labels May 16, 2023
@rhshadrach rhshadrach added this to the 2.1 milestone May 16, 2023
@rmhowe425 rmhowe425 requested a review from rhshadrach May 17, 2023 00:14
@rmhowe425 rmhowe425 requested a review from mroeschke May 17, 2023 04:04
Comment on lines 1126 to 1131
"xlsxwriter": r"__init__() got an unexpected keyword argument 'foo'",
}

# Handle change in error message for openpyxl (write and append mode)
if engine == "openpyxl" and os.path.exists(path):
msgs["openpyxl"] = r"__init__() got an unexpected keyword argument 'foo'"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For xlsxwriter and the append mode of openpyxl, I would not expect this error message. It appears that foo is attempting to be passed to OpenpyxlWriter.__init__ rather than stored in engine_kwargs. Is that the case?

Copy link
Contributor Author

@rmhowe425 rmhowe425 May 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhshadrach For openpyxl.load_workbook, we are getting the correct error message in my unit test of load_workbook() got an unexpected keyword argument 'foo'

For xlsxwriter (there is no append mode) and the write mode of openpyxl, both methods are passed to <engine>.Workbook().

238488486-9a5a8058-aae1-4f2f-936d-fc04d01346f9

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhshadrach Yeah looking at the workbook class being instantiated in both _xlsxwriter.py and _openpyxl.py, the error message handling should be correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see - the __init__ that raises is part of the engine. Can you check the message for e.g. Workbook.__init__() to make sure we're not raising in the wrong place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the traceback was changed in Python 3.10; in Python 3.9 you only get __init__().... Can you import PY310 from pandas.compat._constants and check for Workbook.__init__ when true? Note this will be true whenever the Python version is 3.10 or greater.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rhshadrach! I was going down a rabbit hole last night troubleshooting this. I was beginning to think that the issue had something to do with how we were raising the Type error in _openpyxl.py and _xlsxwriter.py without an Exception clause.

@rmhowe425
Copy link
Contributor Author

rmhowe425 commented May 19, 2023

@rhshadrach @mroeschke I believe the PR should be good to go at this point. Looks like the test failure for test-arm is unrelated to my push. While writing this comment, it appears that the test is in a maintenance state.

@rmhowe425 rmhowe425 requested a review from rhshadrach May 19, 2023 15:50
Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Merge when ready @rhshadrach

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@rhshadrach rhshadrach merged commit df8b686 into pandas-dev:main May 20, 2023
@rhshadrach
Copy link
Member

Thanks @rmhowe425!

topper-123 pushed a commit to topper-123/pandas that referenced this pull request May 22, 2023
Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement IO Excel read_excel, to_excel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: add engine_kwargs to DataFrame.to_excel
3 participants