Skip to content

BUG: read_csv interpreting NA value as comment when NA contains comment string #38392

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Dec 13, 2020

Conversation

phofl
Copy link
Member

@phofl phofl commented Dec 9, 2020

I looked into the c case, but this is not that trivial as here. Comments are removed way before checking for na values. Would have to move that part way down to be able to do this.

@phofl phofl added the IO CSV read_csv, to_csv label Dec 9, 2020
Copy link
Member

@ivanovmg ivanovmg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, though I suggest more variety in the input data (rather than all 1, all 2, etc).

Comment on lines 324 to 329
data = (
"# this is a comment\n"
"1,2,3,4\n"
"1,2,3,4#inline comment\n"
"1,2#,3,4\n"
"1,2,#N/A,4\n"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to have the first row (not the commented one) to be like ``col_1,col_2,col_3,col4\n" to avoid confusion with the values below it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done

@jreback
Copy link
Contributor

jreback commented Dec 12, 2020

looks fine here. can you merge master.

cc @gfyoung if any comments.

@jreback jreback added this to the 1.3 milestone Dec 12, 2020
@jreback
Copy link
Contributor

jreback commented Dec 13, 2020

can u merge master and ping on green

@phofl
Copy link
Member Author

phofl commented Dec 13, 2020

@jreback green

@jreback jreback merged commit 61704da into pandas-dev:master Dec 13, 2020
@jreback
Copy link
Contributor

jreback commented Dec 13, 2020

thanks @phofl

@phofl phofl deleted the 34002 branch December 14, 2020 08:21
luckyvs1 pushed a commit to luckyvs1/pandas that referenced this pull request Jan 20, 2021
…nt string (pandas-dev#38392)

* BUG: read_csv not converting to float for python engine with decimal sep, usecols and parse_dates

* Fix comment issues for python parser

* Add test

* Add whatsnew

* Revert "BUG: read_csv not converting to float for python engine with decimal sep, usecols and parse_dates"

This reverts commit 8c2e1ca

* Commit merge conflict

* Improve test

* Remove import

* Add c tests

* Remove function input

* Improve note

Co-authored-by: Jeff Reback <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants