ENH: Adding support for calamine as Excel reader engine #50395
Comments
This looks pretty cool, especially since it seems to be able to read all sorts of Excel formats (we might be able to kill off the other engines, like odfpy and pyxlsb). I think a few weeks back, I profiled the Excel code, and most of the time seemed to be spent in openpyxl, as opposed to the parsing, so this'll definitely be an improvement. PRs are welcome. |
I'd be happy to explore implementing this. Just to be sure, are we happy to introduce another dependency for this engine? |
Yes, ideally we'd be able to deprecate some of the other engines (e.g. pyxlsb, odfpy, xlrd), since calamine seems to support a lot more formats. |
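For anyone wanting to reproduce that kind of measurement, here is a minimal sketch using the standard library's `cProfile`. The helper name `profile_call` is hypothetical and not part of pandas; it simply runs a callable under the profiler and returns a text report sorted by cumulative time, which is where time spent inside an engine such as openpyxl would show up.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Render the statistics into a string instead of printing to stdout.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


# Hypothetical usage, assuming pandas and an engine are installed
# and "test.xlsx" exists:
#   import pandas as pd
#   df, report = profile_call(pd.read_excel, "test.xlsx", engine="openpyxl")
#   print(report)
```

Comparing the reports for two engines on the same file should give a rough picture of where the time goes, though results will vary with file size and content.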
Okay cool I'll have a look at 👍🏻 |
Co-author: Kostya Farber (see pandas-dev#50581)
With pandas 2.2.0 and pip install python-calamine, why do I get ValueError: Unknown engine: calamine? |
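One way to narrow down an error like this is to check which package versions the running interpreter actually sees, since a different environment may be active than the one `pip` installed into. The sketch below assumes that native calamine support starts with pandas 2.2.0; the helper names `installed_version` and `version_at_least` are hypothetical, and the version comparison is deliberately simplistic (leading numeric components only).

```python
import re
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_at_least(version, minimum=(2, 2)):
    """Compare only the leading numeric components of a version string."""
    numbers = []
    for piece in version.split(".")[: len(minimum)]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    # Pad short versions such as "3" so tuples compare component by component.
    numbers += [0] * (len(minimum) - len(numbers))
    return tuple(numbers) >= tuple(minimum)


if __name__ == "__main__":
    pandas_version = installed_version("pandas")
    calamine_version = installed_version("python-calamine")
    print("pandas:", pandas_version)
    print("python-calamine:", calamine_version)
    if pandas_version is not None:
        # Assumption: engine="calamine" is accepted from pandas 2.2.0 onward.
        print("pandas new enough:", version_at_least(pandas_version))
```

If the printed pandas version is older than expected, the interpreter is likely picking up a different installation than the one that was upgraded.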
Are there any plans for adding |
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
Reading Excel files in Pandas is considerably slower than in some alternative data frame tools; for example, the `readxl` package in R can read Excel files much faster. The Rust `calamine` library can read Excel files much faster than other engines supported by Pandas, and there is an existing Python binding to it, python-calamine. I would like to request that Pandas add official support for `calamine`, so that users can read an Excel file like: `pd.read_excel("test.xlsx", engine="calamine")`
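As a rough illustration of how calling code might opt in to the requested engine without hard-failing where it is missing, the sketch below selects `"calamine"` only when the `python_calamine` module can be found and otherwise passes `engine=None`, which leaves the choice to pandas. The helpers `pick_excel_engine` and `read_excel_fast` are hypothetical names, and the sketch assumes a pandas version that accepts `engine="calamine"`; on a version that does not, the call would still raise.

```python
from importlib import util


def pick_excel_engine(has_module=None):
    """Return "calamine" if python_calamine is importable, else None."""
    if has_module is None:
        # Default probe: ask the import system without importing the module.
        def has_module(name):
            return util.find_spec(name) is not None
    return "calamine" if has_module("python_calamine") else None


def read_excel_fast(path, **kwargs):
    """Read an Excel file, preferring the calamine engine when available."""
    import pandas as pd  # imported lazily so the helper above works without pandas

    # engine=None lets pandas fall back to its default engine for the file type.
    return pd.read_excel(path, engine=pick_excel_engine(), **kwargs)


# Hypothetical usage, assuming "test.xlsx" exists:
#   df = read_excel_fast("test.xlsx", sheet_name=0)
```

Keeping the probe injectable through `has_module` makes the selection logic easy to exercise without installing either package.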
Feature Description
The `python-calamine` package already implements code that enables the calamine engine in Pandas; see the examples using `pandas_monkeypatch()` at the bottom of their GitHub README. The code to enable this is here.
Although `python-calamine` already implements the necessary features to use the library with Pandas, I am unclear on how similar the behavior is between `calamine` and other engines that Pandas supports, like `openpyxl`. I am hoping that by bringing `calamine` in as an officially supported engine, Pandas unit tests will confirm consistent behavior across `calamine` and other engines.
Alternative Solutions
None
Additional Context
No response