Backport PR #56543 on branch 2.2.x (DOC: Update docstring for read_excel) #56730

Merged

19 changes: 7 additions & 12 deletions doc/source/user_guide/io.rst
@@ -3471,20 +3471,15 @@ saving a ``DataFrame`` to Excel. Generally the semantics are
similar to working with :ref:`csv<io.read_csv_table>` data.
See the :ref:`cookbook<cookbook.excel>` for some advanced strategies.

.. warning::

The `xlrd <https://xlrd.readthedocs.io/en/latest/>`__ package is now only for reading
old-style ``.xls`` files.
.. note::

Before pandas 1.3.0, the default argument ``engine=None`` to :func:`~pandas.read_excel`
would result in using the ``xlrd`` engine in many cases, including new
Excel 2007+ (``.xlsx``) files. pandas will now default to using the
`openpyxl <https://openpyxl.readthedocs.io/en/stable/>`__ engine.
When ``engine=None``, the following logic will be used to determine the engine:

It is strongly encouraged to install ``openpyxl`` to read Excel 2007+
(``.xlsx``) files.
**Please do not report issues when using ``xlrd`` to read ``.xlsx`` files.**
This is no longer supported, switch to using ``openpyxl`` instead.
- If ``path_or_buffer`` is an OpenDocument format (.odf, .ods, .odt),
then `odf <https://pypi.org/project/odfpy/>`_ will be used.
- Otherwise if ``path_or_buffer`` is an xls format, ``xlrd`` will be used.
- Otherwise if ``path_or_buffer`` is in xlsb format, ``pyxlsb`` will be used.
- Otherwise ``openpyxl`` will be used.

.. _io.excel_reader:

32 changes: 10 additions & 22 deletions pandas/io/excel/_base.py
@@ -160,36 +160,24 @@
If converters are specified, they will be applied INSTEAD
of dtype conversion.
If you use ``None``, it will infer the dtype of each column based on the data.
engine : str, default None
engine : {{'openpyxl', 'calamine', 'odf', 'pyxlsb', 'xlrd'}}, default None
If io is not a buffer or path, this must be set to identify io.
Supported engines: "xlrd", "openpyxl", "odf", "pyxlsb", "calamine".
Engine compatibility :

- ``xlr`` supports old-style Excel files (.xls).
- ``openpyxl`` supports newer Excel file formats.
- ``odf`` supports OpenDocument file formats (.odf, .ods, .odt).
- ``pyxlsb`` supports Binary Excel files.
- ``calamine`` supports Excel (.xls, .xlsx, .xlsm, .xlsb)
and OpenDocument (.ods) file formats.
- ``odf`` supports OpenDocument file formats (.odf, .ods, .odt).
- ``pyxlsb`` supports Binary Excel files.
- ``xlrd`` supports old-style Excel files (.xls).

.. versionchanged:: 1.2.0
The engine `xlrd <https://xlrd.readthedocs.io/en/latest/>`_
now only supports old-style ``.xls`` files.
When ``engine=None``, the following logic will be
used to determine the engine:

- If ``path_or_buffer`` is an OpenDocument format (.odf, .ods, .odt),
then `odf <https://pypi.org/project/odfpy/>`_ will be used.
- Otherwise if ``path_or_buffer`` is an xls format,
``xlrd`` will be used.
- Otherwise if ``path_or_buffer`` is in xlsb format,
``pyxlsb`` will be used.

.. versionadded:: 1.3.0
- Otherwise ``openpyxl`` will be used.

.. versionchanged:: 1.3.0
When ``engine=None``, the following logic will be used to determine the engine:

- If ``path_or_buffer`` is an OpenDocument format (.odf, .ods, .odt),
then `odf <https://pypi.org/project/odfpy/>`_ will be used.
- Otherwise if ``path_or_buffer`` is an xls format, ``xlrd`` will be used.
- Otherwise if ``path_or_buffer`` is in xlsb format, ``pyxlsb`` will be used.
- Otherwise ``openpyxl`` will be used.
converters : dict, default None
Dict of functions for converting values in certain columns. Keys can
either be integers or column labels, values are functions that take one
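
The fallback order that this change documents for ``engine=None`` can be sketched in a few lines of Python. This is only an illustration, not the pandas implementation: the helper name ``guess_engine`` and the suffix table are hypothetical, and the sketch assumes the format can be read off the file extension, whereas the docstring only says that the engine depends on the format of ``path_or_buffer``.

```python
from pathlib import Path

# Hypothetical lookup table mirroring the documented order:
# OpenDocument -> odf, .xls -> xlrd, .xlsb -> pyxlsb.
_ENGINE_BY_SUFFIX = {
    ".odf": "odf",
    ".ods": "odf",
    ".odt": "odf",
    ".xls": "xlrd",
    ".xlsb": "pyxlsb",
}


def guess_engine(path):
    """Return the engine name the documented rules would pick for ``path``.

    Assumption: the decision is made on the (case-insensitive) file
    suffix alone. Anything not listed above falls through to openpyxl,
    matching the final "Otherwise" rule.
    """
    suffix = Path(path).suffix.lower()
    return _ENGINE_BY_SUFFIX.get(suffix, "openpyxl")


if __name__ == "__main__":
    for name in ("budget.ods", "legacy.xls", "big.xlsb", "report.xlsx"):
        print(name, guess_engine(name))
```

The result could then be passed explicitly, for example ``pd.read_excel(path, engine=guess_engine(path))``, which makes the engine choice visible at the call site instead of relying on the default.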