read_excel na_values replacement after parse_dates #26203

Closed
nylocx opened this issue Apr 24, 2019 · 3 comments · Fixed by #44620
Labels: Bug, IO Excel (read_excel, to_excel)

Comments

nylocx commented Apr 24, 2019

import pandas as pd

### Not working
pd.read_excel(
    "test.xlsx",
    na_values={"Test": ['#', 0]},
    parse_dates=["Test"],
    date_parser=lambda x: pd.to_datetime(x, format="%Y-%m-%d"),
)
### Working
pd.read_csv(
    "test.txt",
    na_values={"Test": ['#', 0]},
    parse_dates=["Test"],
    date_parser=lambda x: pd.to_datetime(x, format="%Y-%m-%d"),
)

read_excel behaves differently from read_csv when replacing NA values and parsing dates.

Test case:
The test.txt and test.xlsx contain the same data, just one column with header "Test" and 5 entries where 0 and # both represent NA values.

Test
2012-10-01
0
2015-05-15
#
2017-09-09

The first call (read_excel) fails while trying to parse the "#" character as a date.
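A possible workaround, sketched under assumptions rather than tested against the attached files: read the column without parse_dates, blank out the NA markers yourself, and only then convert to datetimes. The in-memory DataFrame below is a stand-in for what read_excel might return for this sheet (the real cell types could differ), and the names raw, na_markers, cleaned and parsed are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the result of reading test.xlsx without
# parse_dates; the actual cell types in the real file may differ.
raw = pd.DataFrame(
    {"Test": ["2012-10-01", 0, "2015-05-15", "#", "2017-09-09"]},
    dtype=object,
)

# The same markers that were passed as na_values in the report.
na_markers = ["#", 0]

# Step 1: replace the NA markers with missing values first.
cleaned = raw["Test"].mask(raw["Test"].isin(na_markers))

# Step 2: parse dates only after the markers are gone, so "#" never
# reaches the date parser. Missing values become NaT.
parsed = pd.to_datetime(cleaned, format="%Y-%m-%d")

print(parsed)
```

This applies the two steps in the order read_csv appears to use (NA replacement, then date parsing), which is the order the report expects from read_excel as well.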

WillAyd (Member) commented Apr 24, 2019

Can you provide the sample files?

WillAyd added the "IO Excel" (read_excel, to_excel) and "Needs Info" (Clarification about behavior needed to assess issue) labels on Apr 24, 2019

nylocx (Author) commented Apr 25, 2019

Sure, sorry that I forgot this.
I couldn't upload the CSV with a .csv extension because GitHub doesn't allow it.
(We don’t support that file type. Try again with a GIF, JPEG, JPG, PNG, DOCX, GZ, LOG, PDF, PPTX, TXT, XLSX or ZIP.)
So it's a test.txt instead. I will edit my report to reflect this.

test.txt
test.xlsx

WillAyd (Member) commented May 10, 2019

Sorry, just seeing this now, but thanks for providing the sample files. That does indeed look buggy.

Investigation and PRs are always welcome

WillAyd added the "Bug" label and removed the "Needs Info" (Clarification about behavior needed to assess issue) label on May 10, 2019
WillAyd added this to the Contributions Welcome milestone on May 10, 2019
jreback modified the milestones: Contributions Welcome, 1.4 on Nov 26, 2021
3 participants