DOC: Update docstring for read_excel #56543
@@ -165,31 +165,12 @@
Supported engines: "xlrd", "openpyxl", "odf", "pyxlsb", "calamine".
Engine compatibility :

- ``xlr`` supports old-style Excel files (.xls).
- ``openpyxl`` supports newer Excel file formats.
- ``odf`` supports OpenDocument file formats (.odf, .ods, .odt).
- ``pyxlsb`` supports Binary Excel files.
- ``calamine`` supports Excel (.xls, .xlsx, .xlsm, .xlsb)
  and OpenDocument (.ods) file formats.

.. versionchanged:: 1.2.0
    The engine `xlrd <https://xlrd.readthedocs.io/en/latest/>`_
    now only supports old-style ``.xls`` files.
    When ``engine=None``, the following logic will be
    used to determine the engine:
Comment on lines -178 to -179

I do think it's good to have this logic documented, can you just move it out of the versionchanged instead?

IIRC xlrd used to not only support xlsx files but at one point was even the default so we had to go through some lengths to document that transition as clearly as possible. We are a few years removed from that and since then all default read libraries have specialized in a given extension(s), so I think we can do without going into this detail in the docstring

Where are the default read libraries documented? E.g. both openpyxl and calamine can read xlsx, and both pyxlsb and calamine can read xlsb.

Calamine is never the default - you'd have to explicitly use that as an engine. Otherwise this is documented in the Excel section of the IO manual https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#excel-files

Granted that section could be rewritten to be a little clearer, but I think that is out of scope for what @phofl is doing here

This is true today, but there is an issue to make it the default for xlsb files. In any case, I don't believe it's documented that Calamine is never the default.

Updated and added back in without the version changed

- If ``path_or_buffer`` is an OpenDocument format (.odf, .ods, .odt),
  then `odf <https://pypi.org/project/odfpy/>`_ will be used.
- Otherwise if ``path_or_buffer`` is an xls format,
  ``xlrd`` will be used.
- Otherwise if ``path_or_buffer`` is in xlsb format,
  ``pyxlsb`` will be used.

.. versionadded:: 1.3.0
- Otherwise ``openpyxl`` will be used.

.. versionchanged:: 1.3.0

- ``odf`` supports OpenDocument file formats (.odf, .ods, .odt).
- ``pyxlsb`` supports Binary Excel files.
- ``xlrd`` supports old-style Excel files (.xls).
converters : dict, default None
Dict of functions for converting values in certain columns. Keys can
either be integers or column labels, values are functions that take one
Out of scope, but it appears to me this line is duplicative.
I removed this line
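
For readers following the thread on default engines, the fallback order described in the docstring for ``engine=None`` can be sketched in a few lines of Python. This is only an illustration: ``guess_default_engine`` is a hypothetical helper, not a pandas function, and it looks at the file extension alone, which is an assumption made to keep the sketch self-contained (the docstring does not say how the format is detected).

```python
from pathlib import PurePath

# Hypothetical helper, not part of pandas. It mirrors the order of checks
# listed in the docstring, using only the file extension (an assumption).
_OPENDOCUMENT_SUFFIXES = {".odf", ".ods", ".odt"}


def guess_default_engine(path: str) -> str:
    """Return the engine name the documented fallback order would pick."""
    suffix = PurePath(path).suffix.lower()
    if suffix in _OPENDOCUMENT_SUFFIXES:
        return "odf"
    if suffix == ".xls":
        return "xlrd"
    if suffix == ".xlsb":
        return "pyxlsb"
    # Everything else falls through to openpyxl; calamine is never chosen
    # here and would have to be requested explicitly via ``engine=``.
    return "openpyxl"


if __name__ == "__main__":
    for name in ("budget.ods", "legacy.xls", "big.xlsb", "report.xlsx"):
        print(name, "->", guess_default_engine(name))
```

Under this reading, an ``.xlsx`` or ``.xlsm`` file resolves to ``openpyxl``, and ``calamine`` is only used when passed explicitly, which matches the point made in the review comments above.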