Skip to content

BUG: Roundtrip with openpyxl and datetime precision #38644

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
rhshadrach opened this issue Dec 22, 2020 · 3 comments
Closed

BUG: Roundtrip with openpyxl and datetime precision #38644

rhshadrach opened this issue Dec 22, 2020 · 3 comments
Labels
Closing Candidate May be closeable, needs more eyeballs IO Excel read_excel, to_excel Upstream issue Issue related to pandas dependency

Comments

@rhshadrach
Copy link
Member

The test tests.io.excel.test_writers.test_datetimes fails with openpyxl (now the default, xref #35029):

path = "test.xlsx"
datetimes = [
    datetime(2013, 1, 13, 1, 2, 3),
    datetime(2013, 1, 13, 2, 45, 56),
    datetime(2013, 1, 13, 4, 29, 49),  # <---
    datetime(2013, 1, 13, 6, 13, 42),
    datetime(2013, 1, 13, 7, 57, 35),  # <---
    datetime(2013, 1, 13, 9, 41, 28),
    datetime(2013, 1, 13, 11, 25, 21),
    datetime(2013, 1, 13, 13, 9, 14),
    datetime(2013, 1, 13, 14, 53, 7),
    datetime(2013, 1, 13, 16, 37, 0),
    datetime(2013, 1, 13, 18, 20, 52),
]
write_frame = DataFrame({"A": datetimes})
write_frame.to_excel(path, "Sheet1")
read_frame = pd.read_excel(path, sheet_name="Sheet1", header=0, engine="openpyxl")
tm.assert_series_equal(write_frame["A"], read_frame["A"])

The highlighted lines give

 2013-01-13 04:29:48.999999
 2013-01-13 07:57:35.000001

With these two lines removed, the results are of the form e.g. 2013-01-13 11:25:21 and the test passes.

@rhshadrach rhshadrach added Bug Needs Triage Issue that has not been reviewed by a pandas team member IO Excel read_excel, to_excel and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Dec 22, 2020
@rhshadrach rhshadrach added this to the Contributions Welcome milestone Dec 22, 2020
@rhshadrach
Copy link
Member Author

This is an excel issue - datetimes are stored as floats. It seems odf/xlrd have logic to handle precision issues encountered whereas openpyxl does not. We could implement something similar here, but it seems best to me leave as-is.

https://stackoverflow.com/questions/57530408/python-reads-inaccurately-from-excel-date-time-field

@rhshadrach rhshadrach added the Closing Candidate May be closeable, needs more eyeballs label Feb 13, 2021
@pgp
Copy link

pgp commented Mar 1, 2021

Related issue on openpyxl:
https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1630

@mroeschke
Copy link
Member

Agreed looks like an upstream issue and appears to have a related bug report. Closing

@mroeschke mroeschke added Upstream issue Issue related to pandas dependency and removed Bug labels Aug 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Closing Candidate May be closeable, needs more eyeballs IO Excel read_excel, to_excel Upstream issue Issue related to pandas dependency
Projects
None yet
Development

No branches or pull requests

3 participants