BUG: read_csv not raising when \n as sep #44899

Merged 1 commit on Dec 16, 2021
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.4.0.rst
@@ -749,6 +749,7 @@ I/O
- Bug in :func:`read_csv` raising ``ValueError`` when names was longer than header but equal to data rows for ``engine="python"`` (:issue:`38453`)
- Bug in :class:`ExcelWriter`, where ``engine_kwargs`` were not passed through to all engines (:issue:`43442`)
- Bug in :func:`read_csv` raising ``ValueError`` when ``parse_dates`` was used with ``MultiIndex`` columns (:issue:`8991`)
- Bug in :func:`read_csv` not raising a ``ValueError`` when ``\n`` was specified as ``delimiter`` or ``sep`` which conflicts with ``lineterminator`` (:issue:`43528`)
- Bug in :func:`read_csv` converting columns to numeric after date parsing failed (:issue:`11019`)
- Bug in :func:`read_csv` not replacing ``NaN`` values with ``np.nan`` before attempting date conversion (:issue:`26203`)
- Bug in :func:`read_csv` raising ``AttributeError`` when attempting to read a .csv file and infer index column dtype from a nullable integer type (:issue:`44079`)
7 changes: 7 additions & 0 deletions pandas/io/parsers/readers.py
@@ -1460,6 +1460,13 @@ def _refine_defaults_read(
            "delim_whitespace=True; you can only specify one."
        )

    if delimiter == "\n":
        raise ValueError(
            r"Specified \n as separator or delimiter. This forces the python engine "
            "which does not accept a line terminator. Hence it is not allowed to use "
            "the line terminator as separator.",
        )

    if delimiter is lib.no_default:
        # assign default separator value
        kwds["delimiter"] = delim_default
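
For readers who want to see the ordering of the checks in isolation, here is a minimal stdlib-only sketch. It is not the pandas function: `refine_delimiter` and `NO_DEFAULT` are hypothetical names standing in for `_refine_defaults_read` and `lib.no_default`, and everything unrelated to the delimiter is left out.

```python
# Hypothetical stand-in for pandas' `lib.no_default` sentinel.
NO_DEFAULT = object()


def refine_delimiter(delimiter=NO_DEFAULT, delim_default=","):
    """Illustrative reduction of the new check; not the pandas implementation."""
    # Reject the line terminator before any default is filled in.
    if delimiter == "\n":
        raise ValueError(
            r"Specified \n as separator or delimiter. This forces the python engine "
            "which does not accept a line terminator. Hence it is not allowed to use "
            "the line terminator as separator."
        )

    # No delimiter given: fall back to the default separator.
    if delimiter is NO_DEFAULT:
        return delim_default

    return delimiter
```

Because the comparison runs before the default is assigned, an omitted delimiter still falls through to `delim_default`, and only an explicit `"\n"` is rejected.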
16 changes: 16 additions & 0 deletions pandas/tests/io/parser/common/test_common_basic.py
@@ -788,6 +788,22 @@ def test_read_csv_delimiter_and_sep_no_default(all_parsers):
        parser.read_csv(f, sep=" ", delimiter=".")


@pytest.mark.parametrize("kwargs", [{"delimiter": "\n"}, {"sep": "\n"}])
def test_read_csv_line_break_as_separator(kwargs, all_parsers):
    # GH#43528
    parser = all_parsers
    data = """a,b,c
1,2,3
"""
    msg = (
        r"Specified \\n as separator or delimiter. This forces the python engine "
        r"which does not accept a line terminator. Hence it is not allowed to use "
        r"the line terminator as separator."
    )
    with pytest.raises(ValueError, match=msg):
        parser.read_csv(StringIO(data), **kwargs)


def test_read_csv_posargs_deprecation(all_parsers):
    # GH 41485
    f = StringIO("a,b\n1,2")
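
A note on why the test's `msg` uses `\\n` inside raw strings while the error in `readers.py` uses `\n`: `pytest.raises(..., match=...)` treats the pattern as a regular expression and applies `re.search` to the exception text. The sketch below reproduces that with the standard library only; the variable names are illustrative and not part of the PR.

```python
import re

# The message as raised in readers.py. The first fragment is a raw string,
# so it contains a literal backslash followed by "n", not a newline.
raised = (
    r"Specified \n as separator or delimiter. This forces the python engine "
    "which does not accept a line terminator. Hence it is not allowed to use "
    "the line terminator as separator."
)

# The pattern as written in the test. In a raw string, "\\n" is the regex
# for a literal backslash followed by "n".
pattern = (
    r"Specified \\n as separator or delimiter. This forces the python engine "
    r"which does not accept a line terminator. Hence it is not allowed to use "
    r"the line terminator as separator."
)

matches_escaped = re.search(pattern, raised) is not None

# With a single backslash, the regex "\n" looks for a newline character,
# which the raised message does not contain.
matches_unescaped = re.search(r"Specified \n as separator", raised) is not None
```

Here `matches_escaped` is `True` and `matches_unescaped` is `False`, which is why the doubled backslash is needed in the test.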