Skip to content

BUG: Fix line_terminator option #23608

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from
Closed
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion pandas/io/formats/csvs.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ class CSVFormatter(object):
def __init__(self, obj, path_or_buf=None, sep=",", na_rep='',
float_format=None, cols=None, header=True, index=True,
index_label=None, mode='w', nanRep=None, encoding=None,
compression='infer', quoting=None, line_terminator='\n',
compression='infer', quoting=None, line_terminator=None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is going to need a doc-string versionchanged, and an update

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On closer look, #24290 covered this. In fact, the same PR may have corrected the entire issue as long as the CSVFormatter class is always called with line_terminator=None.

I made a minor docs update to reflect that PR, let me know if you had something else in mind.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reference issue is off? in any event, your change looks fine.

chunksize=None, tupleize_cols=False, quotechar='"',
date_format=None, doublequote=True, escapechar=None,
decimal='.'):
Expand Down
17 changes: 17 additions & 0 deletions pandas/tests/io/formats/test_to_csv.py
Original file line number Diff line number Diff line change
Expand Up @@ -563,3 +563,20 @@ def test_to_csv_compression(self, compression_only,
result = pd.read_csv(path, index_col=0,
compression=read_compression)
tm.assert_frame_equal(result, df)


def test_csv_formatter_line_terminator_default(monkeypatch):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than this, can you just modify the appropriate tests to pass in ['\n', os.linesep, None] as parameters (we might already have this)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modify the appropriate tests to pass

@jreback There isn't an existing test of the CSVFormatter directly.

The intent of the PR is to ensure that if no line_terminator is specified, the default is os.linesep. Currently, CSVFormatter must be called with line_terminator=None or else \n will be used.

When I opened this PR, I was looking at v0.24.3, so I didn't realize the bug had been fixed in #21406 by calling the CSVFormatter class with line_terminator=None. I thought I might as well continue the PR so future devs don't have to remember to call CSVFormatter(line_terminator=None) if they expect the platform specific os.linesep behavior. Since most devs use \n platforms, it would be easy to miss this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think this test is necessary and slightly confusing, can you remove.

# see GH #23608
# ensure default line_terminator used is os.linesep
from pandas.io.formats.csvs import CSVFormatter
df = DataFrame()

with monkeypatch.context() as m:
m.setattr(os, 'linesep', 'foo')
assert os.linesep == 'foo'

formatter_with_default = CSVFormatter(df)
assert formatter_with_default.line_terminator == os.linesep

formatter_with_argument = CSVFormatter(df, line_terminator='bar')
assert formatter_with_argument.line_terminator == 'bar'