-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Write Excel Round-Trippable Timedeltas Regression #21502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
This would have been a result of #19900 Are you saying you were getting those objects back as Timedelta objects or Datetimes before? I think the main motivator to get rid of this in the first place was that it not only was inconsistent across the engines but would implicitly convert a Timedelta to a Datetime in the round-trip process, neither I would think are desirable |
It looks like before they were object columns with datetime.time instances. These just got printed the same as timedeltas.
Is it possible to make the original timedelta recoverable? Found this to convert back, but it requires manually tracking the timedelta columns:
|
Yea I think that speaks to the danger that was present with the previous implementation - you could very easily mistake a Timedelta for a Time object and they are not the same. I'm going to close this under the assumption that a lossless conversion of Timedelta information to/from Excel simply is not possible because Excel does not support Timedeltas natively. If you disagree and can think of a way to do that feel free to reopen - investigation and PRs are always welcome! To answer your last question, you should be able to construct a Timedelta from a float value directly. For specific usage questions however, you are better asking on Stack Overflow both to get an answer and help others out in the future that may have the same question (we typically reserve this tracker for issues and enhancement requests) |
Ok, thank you. I didn't know if there was a way to use metadata to track which columns were timedeltas. |
Code Sample, a copy-pastable example if possible
Problem description
Writing a Dataframe with timedeltas to excel using the openpyxl engine has changed behaviour from pandas version 0.22 to 0.23.1. Previously, the excel file produced could be read in as a Dataframe again with values matching the original Dataframe. If the xlsxwriter engine or pandas 0.23.1 is used, the timedeltas become floats in the re-read Dataframe.
Expected Output
One the one hand, it's nice that pandas now has consistent behaviour between engines, however I think the desired behaviour would be for both engines to match the old openpyxl behaviour rather than the old xlsxwriter behaviour, as this would enable Dataframes written to excel to be used again when read back from excel.
Output of
pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8
pandas: 0.22.0 or 0.23.1
pytest: 3.6.1
pip: 10.0.1
setuptools: 39.1.0
Cython: None
numpy: 1.14.5
scipy: 1.1.0
pyarrow: None
xarray: 0.10.3
IPython: 6.4.0
sphinx: 1.7.4
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2018.4
blosc: None
bottleneck: None
tables: 3.4.3
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.4
xlrd: 1.1.0
xlwt: None
xlsxwriter: 1.0.5
lxml: None
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.2.8
pymysql: 0.8.1
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
The text was updated successfully, but these errors were encountered: