Commit bd3cd37

DOC: update wording about when xlrd engine can be used (#38456)
Co-authored-by: Joris Van den Bossche <[email protected]>
1 parent 936d125 commit bd3cd37

3 files changed: +56 -28 lines changed

doc/source/user_guide/io.rst

+28 -3
@@ -2834,15 +2834,40 @@ parse HTML tables in the top-level pandas io function ``read_html``.
 Excel files
 -----------
 
-The :func:`~pandas.read_excel` method can read Excel 2003 (``.xls``)
-files using the ``xlrd`` Python module. Excel 2007+ (``.xlsx``) files
-can be read using either ``xlrd`` or ``openpyxl``. Binary Excel (``.xlsb``)
+The :func:`~pandas.read_excel` method can read Excel 2007+ (``.xlsx``) files
+using the ``openpyxl`` Python module. Excel 2003 (``.xls``) files
+can be read using ``xlrd``. Binary Excel (``.xlsb``)
 files can be read using ``pyxlsb``.
 The :meth:`~DataFrame.to_excel` instance method is used for
 saving a ``DataFrame`` to Excel. Generally the semantics are
 similar to working with :ref:`csv<io.read_csv_table>` data.
 See the :ref:`cookbook<cookbook.excel>` for some advanced strategies.
 
+.. warning::
+
+The `xlwt <https://xlwt.readthedocs.io/en/latest/>`__ package for writing old-style ``.xls``
+excel files is no longer maintained.
+The `xlrd <https://xlrd.readthedocs.io/en/latest/>`__ package is now only for reading
+old-style ``.xls`` files.
+
+Previously, the default argument ``engine=None`` to :func:`~pandas.read_excel`
+would result in using the ``xlrd`` engine in many cases, including new
+Excel 2007+ (``.xlsx``) files.
+If `openpyxl <https://openpyxl.readthedocs.io/en/stable/>`__ is installed,
+many of these cases will now default to using the ``openpyxl`` engine.
+See the :func:`read_excel` documentation for more details.
+
+Thus, it is strongly encouraged to install ``openpyxl`` to read Excel 2007+
+(``.xlsx``) files.
+**Please do not report issues when using ``xlrd`` to read ``.xlsx`` files.**
+This is no longer supported, switch to using ``openpyxl`` instead.
+
+Attempting to use the the ``xlwt`` engine will raise a ``FutureWarning``
+unless the option :attr:`io.excel.xls.writer` is set to ``"xlwt"``.
+While this option is now deprecated and will also raise a ``FutureWarning``,
+it can be globally set and the warning suppressed. Users are recommended to
+write ``.xlsx`` files using the ``openpyxl`` engine instead.
+
 .. _io.excel_reader:
 
 Reading Excel files
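
As an illustration of the engine split described in this hunk, the mapping from file extension to reader engine can be sketched as follows. This is a minimal sketch, not pandas code: `ENGINE_BY_EXTENSION`, `pick_engine`, and `read_any_excel` are hypothetical names introduced here, and only `pandas.read_excel` with its `engine` argument comes from the documentation being changed. The OpenDocument entries follow the `odf` notes in the `_base.py` docstring from this same commit.

```python
from pathlib import Path

# Hypothetical lookup table mirroring the engine notes in this commit.
ENGINE_BY_EXTENSION = {
    ".xls": "xlrd",       # old-style Excel 2003 files
    ".xlsx": "openpyxl",  # Excel 2007+ files
    ".xlsb": "pyxlsb",    # binary Excel files
    ".odf": "odf",        # OpenDocument formats
    ".ods": "odf",
    ".odt": "odf",
}


def pick_engine(path):
    """Return the engine name for a path, or raise ValueError if unknown."""
    suffix = Path(path).suffix.lower()
    try:
        return ENGINE_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"no known Excel engine for {suffix!r}") from None


def read_any_excel(path, **kwargs):
    """Read a spreadsheet while naming the engine explicitly."""
    # Imported lazily; pandas and the chosen engine package must be installed.
    import pandas as pd

    return pd.read_excel(path, engine=pick_engine(path), **kwargs)
```

Naming the engine explicitly, as `read_any_excel` does, sidesteps the `engine=None` defaulting that the warning describes, so the result does not depend on which optional packages happen to be installed.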

doc/source/whatsnew/v1.2.0.rst

+15 -14
@@ -10,21 +10,22 @@ including other versions of pandas.
 
 .. warning::
 
-The packages `xlrd <https://xlrd.readthedocs.io/en/latest/>`_ for reading excel
-files and `xlwt <https://xlwt.readthedocs.io/en/latest/>`_ for
-writing excel files are no longer maintained. These are the only engines in pandas
-that support the xls format.
-
-Previously, the default argument ``engine=None`` to ``pd.read_excel``
-would result in using the ``xlrd`` engine in many cases. If
-`openpyxl <https://openpyxl.readthedocs.io/en/stable/>`_ is installed,
+The `xlwt <https://xlwt.readthedocs.io/en/latest/>`_ package for writing old-style ``.xls``
+excel files is no longer maintained.
+The `xlrd <https://xlrd.readthedocs.io/en/latest/>`_ package is now only for reading
+old-style ``.xls`` files.
+
+Previously, the default argument ``engine=None`` to :func:`~pandas.read_excel`
+would result in using the ``xlrd`` engine in many cases, including new
+Excel 2007+ (``.xlsx``) files.
+If `openpyxl <https://openpyxl.readthedocs.io/en/stable/>`_ is installed,
 many of these cases will now default to using the ``openpyxl`` engine.
-See the :func:`read_excel` documentation for more details. Attempting to read
-``.xls`` files or specifying ``engine="xlrd"`` to ``pd.read_excel`` will not
-raise a warning. However users should be aware that ``xlrd`` is already
-broken with certain package configurations, for example with Python 3.9
-when `defusedxml <https://github.com/tiran/defusedxml/>`_ is installed, and
-is anticipated to be unusable in the future.
+See the :func:`read_excel` documentation for more details.
+
+Thus, it is strongly encouraged to install ``openpyxl`` to read Excel 2007+
+(``.xlsx``) files.
+**Please do not report issues when using ``xlrd`` to read ``.xlsx`` files.**
+This is no longer supported, switch to using ``openpyxl`` instead.
 
 Attempting to use the the ``xlwt`` engine will raise a ``FutureWarning``
 unless the option :attr:`io.excel.xls.writer` is set to ``"xlwt"``.

pandas/io/excel/_base.py

+13 -11
@@ -105,16 +105,16 @@
 Supported engines: "xlrd", "openpyxl", "odf", "pyxlsb".
 Engine compatibility :
 
-- "xlrd" supports most old/new Excel file formats.
+- "xlrd" supports old-style Excel files (.xls).
 - "openpyxl" supports newer Excel file formats.
 - "odf" supports OpenDocument file formats (.odf, .ods, .odt).
 - "pyxlsb" supports Binary Excel files.
 
 .. versionchanged:: 1.2.0
 The engine `xlrd <https://xlrd.readthedocs.io/en/latest/>`_
-is no longer maintained, and is not supported with
-python >= 3.9. When ``engine=None``, the following logic will be
-used to determine the engine.
+now only supports old-style ``.xls`` files.
+When ``engine=None``, the following logic will be
+used to determine the engine:
 
 - If ``path_or_buffer`` is an OpenDocument format (.odf, .ods, .odt),
 then `odf <https://pypi.org/project/odfpy/>`_ will be used.
@@ -920,7 +920,7 @@ class ExcelFile:
 """
 Class for parsing tabular excel sheets into DataFrame objects.
 
-Uses xlrd engine by default. See read_excel for more documentation
+See read_excel for more documentation
 
 Parameters
 ----------
@@ -933,17 +933,17 @@ class ExcelFile:
 Supported engines: ``xlrd``, ``openpyxl``, ``odf``, ``pyxlsb``
 Engine compatibility :
 
-- ``xlrd`` supports most old/new Excel file formats.
+- ``xlrd`` supports old-style Excel files (.xls).
 - ``openpyxl`` supports newer Excel file formats.
 - ``odf`` supports OpenDocument file formats (.odf, .ods, .odt).
 - ``pyxlsb`` supports Binary Excel files.
 
 .. versionchanged:: 1.2.0
 
 The engine `xlrd <https://xlrd.readthedocs.io/en/latest/>`_
-is no longer maintained, and is not supported with
-python >= 3.9. When ``engine=None``, the following logic will be
-used to determine the engine.
+now only supports old-style ``.xls`` files.
+When ``engine=None``, the following logic will be
+used to determine the engine:
 
 - If ``path_or_buffer`` is an OpenDocument format (.odf, .ods, .odt),
 then `odf <https://pypi.org/project/odfpy/>`_ will be used.
@@ -954,8 +954,10 @@ class ExcelFile:
 then ``openpyxl`` will be used.
 - Otherwise ``xlrd`` will be used and a ``FutureWarning`` will be raised.
 
-Specifying ``engine="xlrd"`` will continue to be allowed for the
-indefinite future.
+.. warning::
+
+Please do not report issues when using ``xlrd`` to read ``.xlsx`` files.
+This is not supported, switch to using ``openpyxl`` instead.
 """
 
 from pandas.io.excel._odfreader import ODFReader
