to_csv writes wrong with NaN value #18676
Comments
this does look like a bug. @gfyoung ? |
Agreed. I can also replicate this on |
I am working on this. This does not seem to be an encoding issue.
df.to_csv("df.csv", header=None, index=None)
$ cat df.csv
""
1.0
2.0 |
Also, this returns the right result.
|
@gfyoung Seems that the bug or the default spec. in
Case 1
Input:
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp, dialect=csv.excel)
w.writerow(['1'])
w.writerow([''])
fp.close()
Output:
1
Case 2
Input:
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp, dialect=csv.excel)
w.writerow([''])
w.writerow(['1'])
fp.close()
Output:
""
1 |
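A minimal sketch of the two cases above, written against an in-memory buffer so the exact bytes are visible. The helper name `write_rows` is an assumption made for illustration; the result for Case 1 depends on the CPython version, since affected releases wrote a bare empty line instead of `""`.

```python
import csv
import io


def write_rows(rows):
    """Write rows with the excel dialect and return the raw CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, dialect=csv.excel)
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()


# Case 1: the empty single-field row comes second.
case_1 = write_rows([['1'], ['']])
# Case 2: the empty single-field row comes first.
case_2 = write_rows([[''], ['1']])

# repr() makes the quoting and the '\r\n' line terminators visible.
print(repr(case_1))
print(repr(case_2))
```

Comparing the two `repr` strings shows whether the interpreter in use quotes the empty field consistently regardless of row order.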
But this works.
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp, dialect=None)
w.writerow(['', '1'])
w.writerow(['3', '2'])
fp.close()
Output:
,1
3,2 |
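The same two-column check can be sketched in memory. This assumes the default dialect rather than `dialect=None`, and the variable name `two_column_text` is illustrative only.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)  # default dialect, minimal quoting
writer.writerow(['', '1'])
writer.writerow(['3', '2'])

two_column_text = buf.getvalue()
# An empty field next to another field is written without quotes.
print(repr(two_column_text))
```

With more than one field per row, the delimiter already marks the empty field, so no quoting is applied.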
The I don't know why this is needed. |
@Licht-T : Ah! That's very good to know. Okay, this means that this issue is out of the control of the
@jackasser : Looks like you may have hit upon a point of contention in Python's CSV library. I would raise this issue in their library by submitting an issue on their python.org website. |
Closing because this is out of |
@gfyoung this can be fixed on the pandas side; doesn't passing dialect=None work?
@jreback : Here's the code where we initialize our writer: https://github.com/pandas-dev/pandas/blob/master/pandas/io/formats/format.py#L1644-L1656
|
I suppose we could hack our way around this by checking for an empty first row before writing it to CSV and replacing it with a space, for example, though again, as I said, very hackish IMO. |
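A rough sketch of what such a hack might look like with the standard `csv` module. The helper `write_with_placeholder` and its `placeholder` parameter are hypothetical, not pandas code, and the sketch checks every single-field row rather than only the first one.

```python
import csv
import io


def write_with_placeholder(rows, placeholder=" "):
    """Hypothetical workaround: swap a lone empty field for a placeholder."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for row in rows:
        # Only a row consisting of exactly one empty field gets quoted as "".
        if len(row) == 1 and row[0] == "":
            row = [placeholder]
        writer.writerow(row)
    return buf.getvalue()


patched_text = write_with_placeholder([[""], ["1"]], placeholder="NA")
print(repr(patched_text))
```

The cost is that the written data no longer matches the input: a reader gets the placeholder back instead of an empty string, which is part of why the approach is hackish.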
try passing |
@jreback I already tried, but the result is the same.
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp, dialect=None)
w.writerow([''])
w.writerow(['1'])
fp.close()
Output:
""
1 |
The only parameter that makes some impact is
import csv
fp = open('test.csv', 'w')
w = csv.writer(fp, dialect=None, quoting=csv.QUOTE_NONE)
w.writerow([''])
w.writerow(['1'])
fp.close()
---------------------------------------------------------------------------
Error Traceback (most recent call last)
<ipython-input-56-97051671206d> in <module>()
2 fp = open('test.csv', 'w')
3 w = csv.writer(fp, dialect=csv.excel,quoting=csv.QUOTE_NONE)
----> 4 w.writerow([''])
5 w.writerow(['1'])
6 fp.close()
Error: single empty field record must be quoted |
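The error above can be observed without touching the filesystem. This is a small sketch; `try_write_unquoted` is a name chosen here for illustration.

```python
import csv
import io


def try_write_unquoted(row):
    """Return None on success, or the csv.Error message if the row is rejected."""
    writer = csv.writer(io.StringIO(), quoting=csv.QUOTE_NONE)
    try:
        writer.writerow(row)
    except csv.Error as exc:
        return str(exc)
    return None


error_for_empty = try_write_unquoted([''])   # a lone empty field
error_for_value = try_write_unquoted(['1'])  # an ordinary single field
print(error_for_empty)
print(error_for_value)
```

So with quoting disabled the writer refuses to emit the row at all instead of writing an empty line.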
Well..., this |
@Licht-T you can certainly file a bug report there. OK, I guess there is no easy way to fix this here; however, maybe we should add a small note in the code about this? |
@jreback Created the issue on CPython. I'll add the small note on pandas. |
@Licht-T thank you for creating the issue! I think the note should go here (Line 1522 in ba3a442) and on this page: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html?highlight=to_csv#pandas.DataFrame.to_csv |
@jackasser @jreback @gfyoung Actually, the double-quoted blank field is the default spec. when writing a single-column CSV. IOW, the behavior reported in this issue is correct, while "Case 1" in #18676 (comment) was the wrong behavior. This is now fixed in CPython and the patch has been backported to CPython 3.6. Please note that this bug does not exist in CPython 2.7. |
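One way to see why the quoted form matters is to read both variants back with `csv.reader`. This is a sketch for illustration; `read_rows` is a hypothetical helper name.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text and return the rows as lists of strings."""
    return list(csv.reader(io.StringIO(text, newline="")))


# A quoted empty field reads back as a row holding one empty string.
quoted_rows = read_rows('""\r\n1\r\n')
# A bare empty line reads back as a row with no fields at all.
blank_line_rows = read_rows('\r\n1\r\n')

print(quoted_rows)
print(blank_line_rows)
```

Without the quotes, a single-column row containing an empty value cannot be told apart from a blank line when the file is read back.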
@Licht-T ok, can you add a test for >= 3.6 only, and xfail it for now (as I'm not sure which release it's in, though maybe it's actually out?). |
@jreback Okay! (That fix is not released yet, will be included in the next release of CPython 3.6.) |
to_csv with a NaN value in the top row writes an unexpected "" to the CSV file
Versions: