DataFrame.to_csv not using correct line terminator value #20353
Comments
Can you provide a reproducible example? I suspect that this is a Windows problem of unnecessarily inserting carriage returns, but I can't confirm yet.
Same problem for me on Windows 10,
I found that this problem is still in v0.23.
Code Sample, a copy-pastable example if possible
initialization
import pandas as pd
data = pd.DataFrame({
"integer":[1,2,3],
"string_with_lf":["abc","d\nef","g\nh\n\ni"],
"char":["X","Y","Z"]
})
print(data)
Method 1
data.to_csv("test.csv", sep=",", float_format='%.2f',index=False, line_terminator='\n',encoding='utf-8')
print(pd.read_csv("test.csv"))
print("-------")
with open("test.csv", mode='rb') as f:
    print(f.read())
print(data)
Method 2
with open("test2.csv", mode='w', newline='\n') as f:
    data.to_csv(f, sep=",", float_format='%.2f',index=False, line_terminator='\n',encoding='utf-8')
print(pd.read_csv("test2.csv"))
print("-------")
with open("test2.csv", mode='rb') as f:
    print(f.read())
Problem description
As seen in "Method 1" sample, when using
Expected Output
Expect "Method 1" to output the csv described in "Method 2" sample.
Output of
So this is definitely a Windows thing. I ran your samples on a Linux machine, and they produce the same output as your Method 2 in both cases. Thus, it would be the case that Windows is quietly inserting carriage returns in the file.
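To confirm which line endings actually ended up in a file, it can help to count them in the raw bytes rather than eyeball the `print(f.read())` output. The following is a minimal sketch; the helper name `count_line_endings` and the sample bytes are hypothetical and not taken from the reporter's output.

```python
def count_line_endings(raw: bytes) -> dict:
    """Count CRLF, bare LF, and bare CR sequences in raw file bytes."""
    crlf = raw.count(b"\r\n")
    return {
        "crlf": crlf,
        "lf": raw.count(b"\n") - crlf,  # LF not preceded by CR
        "cr": raw.count(b"\r") - crlf,  # CR not followed by LF
    }


# Hypothetical sample: CRLF row terminators, one bare LF inside a quoted field.
sample = b'a,b\r\n1,"x\ny"\r\n'
print(count_line_endings(sample))
```

Passing the bytes read from `test.csv` and `test2.csv` (opened with `mode='rb'`) to the same helper would show whether the two methods differ on a given platform.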
this is a duplicate of #17365 - though these examples are slightly better so can close the other issue further i suspect we have this issue recorders elsewhere - can u search
Sorry for late reply.
@deflatSOCO : Don't worry about that. That was more addressed to me. Actually, he most likely meant "recorded" instead of "recorders". @jreback : There are similar-sounding issues, but they are all either closed or insufficiently overlapping.
@gfyoung Thanks for reply.
@deflatSOCO : Go for it!
* related issue: pandas-dev#20353
Late to comment, but why is this considered an issue? From a Python perspective, isn't Method 1 using universal newlines mode so
@WillAyd : The issue is that if you specify
Hmm OK. So is the expectation that line_terminator also disables universal newline support? Just want to be clear on expectations as if you take pandas out of the picture the below would be equivalent to the original issue (correct me if I am wrong):
# Matches method 1, i.e. would still write \r\n
with open('somefile.txt', 'w') as fn:
    fn.write("foo" + "\n")
# Matches method 2
with open('somefile.txt', 'w', newline="\n") as fn:
    fn.write("foo" + "\n")
Just wondering if it's not preferable to add
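The comparison above can be reproduced on any platform without touching the file system. This is a sketch under one assumption: the Windows default (`newline=None` with `os.linesep == '\r\n'`) is simulated by passing `newline='\r\n'` explicitly, which per the `io.TextIOWrapper` documentation applies the same translation on write. The helper name `written_bytes` is hypothetical.

```python
import io


def written_bytes(text: str, newline) -> bytes:
    """Write text through a text-mode wrapper and return the raw bytes produced."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # leave the underlying buffer open when the wrapper goes away
    return data


# newline='\n' or '' : no translation, the text is written as given.
print(written_bytes("foo\n", "\n"))
print(written_bytes("foo\n", ""))
# newline='\r\n' : every '\n' written becomes '\r\n' (stand-in for the Windows default).
print(written_bytes("foo\n", "\r\n"))
# A terminator that is already '\r\n' is translated again, giving '\r\r\n'.
print(written_bytes("foo\r\n", "\r\n"))
```

The last case shows why a caller-supplied line terminator and text-mode translation interact badly: the translation is applied to every `'\n'`, including ones that are part of an explicit `'\r\n'`.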
The current flow of
I observed that the string buffer in step 1 uses '\n' as expected, but these are changed to '\r\n' after steps 2 and 3. Again, I'm new to this community, so please correct me if I have misunderstood anything. If we need more discussion, I suppose I should close my PR until we decide how to resolve this issue. Is that OK?
@deflatSOCO : Don't need to close. We're just busy people, so sometimes, issues / PR's go dark for a little. Thanks for pinging us again!
* re-defined testcases that suits conversations in PR pandas-dev#21406
* changed default value of line_terminator to os.linesep
* changed API document of DataFrame.to_csv
* changed "newline" value of "open()" from '\n' to ''
* Updated whatsnew document
related pages:
* Issue pandas-dev#20353
* PR pandas-dev#21406
* Updates:
  * Updated expected values for some tests about 'to_csv()' method, to deal with new default value of 'line_terminator' arg.
* Related Issue:
  * Issue pandas-dev#20353
  * PR pandas-dev#21406
* Added test for new test util `convert_rows_list_to_csv_str`
* Edited what's new
Related Issue: pandas-dev#20353
* Use OS line terminator if none is provided
* Enforce line terminator selection if one is
Originally authored by @deflatSOCO, but reapplied by @gfyoung due to enormous merge conflicts.
Closes gh-20353.
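The behavior described in the commit notes above (fall back to `os.linesep` when no terminator is given, write exactly the requested terminator otherwise, and open the file with `newline=''`) can be illustrated with the standard-library `csv` module. This is a sketch of the idea only, not the pandas implementation; `rows_to_csv_bytes` is a hypothetical helper, and an in-memory buffer stands in for a real file.

```python
import csv
import io
import os


def rows_to_csv_bytes(rows, line_terminator=None) -> bytes:
    """Serialize rows to CSV bytes with an exact, untranslated line terminator."""
    if line_terminator is None:
        # Assumption mirroring the commit notes: use the OS terminator by default.
        line_terminator = os.linesep
    raw = io.BytesIO()
    # newline='' turns off text-mode translation, so the terminator is written as-is.
    text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
    writer = csv.writer(text, lineterminator=line_terminator)
    writer.writerows(rows)
    text.flush()
    data = raw.getvalue()
    text.detach()
    return data


rows = [["integer", "string_with_lf"], [1, "abc"], [2, "d\nef"]]
print(rows_to_csv_bytes(rows, "\n"))
print(rows_to_csv_bytes(rows, "\r\n"))
```

With `newline=''`, the line feed embedded in the quoted field `"d\nef"` is left untouched in both calls, and only the row terminator changes with the argument.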
Method 1:
Method 2:
Problem description
I noticed strange behavior when using the pandas.DataFrame.to_csv method on Windows (pandas version 0.20.3). When calling the method as in Method 1 with a file path, it creates a new file using the \r line terminator; I had to use Method 2 to make it work.