Raise error in read_csv when arguments header and prefix both are not None #31383

Merged
merged 17 commits into from
Feb 3, 2020
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.1.0.rst
@@ -162,7 +162,7 @@ MultiIndex
I/O
^^^
- Bug in :meth:`read_json` where integer overflow was occuring when json contains big number strings. (:issue:`30320`)
-
- `read_csv` will now raise a ``ValueError`` when arguments `header` and `prefix` both are not None. (:issue:`27394`)
-

Plotting
4 changes: 4 additions & 0 deletions pandas/io/parsers.py
@@ -1888,6 +1888,10 @@ def __init__(self, src, **kwds):
        if self._reader.header is None:
            self.names = None
        else:
            if self.prefix:
                raise ValueError(
                    "Argument prefix must be None if argument header is not None"
                )
            if len(self._reader.header) > 1:
                # we have a multi index in the columns
                (
10 changes: 9 additions & 1 deletion pandas/tests/io/parser/test_common.py
@@ -16,7 +16,7 @@
from pandas._libs.tslib import Timestamp
from pandas.errors import DtypeWarning, EmptyDataError, ParserError

from pandas import DataFrame, Index, MultiIndex, Series, compat, concat
from pandas import DataFrame, Index, MultiIndex, Series, compat, concat, read_csv
import pandas._testing as tm

from pandas.io.parsers import CParserWrapper, TextFileReader, TextParser
@@ -2040,6 +2040,14 @@ def test_read_csv_memory_growth_chunksize(all_parsers):
pass


def test_read_csv_raises_on_header_prefix():
Contributor

can you use the all_parsers fixture here

Contributor Author

I first tried to do that in c053a8f, but it looks like parsers.read_csv is not going through CParserWrapper, which is what we want to test.

Contributor

sure it is, what exactly was failing? this is the standard pattern for parser tests.

Contributor Author

The test checks whether the error is raised, and the error is raised from CParserWrapper, so the whole test fails here.

Member

@gfyoung Feb 2, 2020

You need to apply your fix to all engines. You just applied to CParserWrapper it seems.

I think the fix should actually be in ParserBase so that all engines get the fix.

c053a8f is the correct test.
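
The suggestion above (validate in the shared base class so every engine inherits the check) can be illustrated with a minimal, self-contained sketch. This is not the pandas implementation: `ParserBase` here is a simplified stand-in, and `CEngine`, `PythonEngine`, and `check_all_engines` are hypothetical names introduced only for illustration. The sketch also assumes an explicit `is not None` comparison, whereas the diff in this PR tests `self.prefix` for truthiness.

```python
class ParserBase:
    """Simplified stand-in for a shared parser base class (illustrative only)."""

    def __init__(self, header=None, prefix=None):
        # Validate once in the base class so every engine subclass
        # inherits the check instead of each engine repeating it.
        if header is not None and prefix is not None:
            raise ValueError(
                "Argument prefix must be None if argument header is not None"
            )
        self.header = header
        self.prefix = prefix


class CEngine(ParserBase):
    """Hypothetical C-engine wrapper; relies on the inherited validation."""


class PythonEngine(ParserBase):
    """Hypothetical Python-engine wrapper; relies on the inherited validation."""


def check_all_engines(header, prefix):
    """Map each engine name to the error message it raised, or None."""
    results = {}
    for engine in (CEngine, PythonEngine):
        try:
            engine(header=header, prefix=prefix)
            results[engine.__name__] = None
        except ValueError as err:
            results[engine.__name__] = str(err)
    return results
```

With the check in the base class, a test that loops over every engine (as the `all_parsers` fixture does in the pandas test suite) sees the same error from each one, which is why the fixture-based test would pass once the fix moves out of the C-engine wrapper.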

    # gh-27394
    msg = "Argument prefix must be None if argument header is not None"
    s = StringIO("0,1\n2,3")
    with pytest.raises(ValueError, match=msg):
        read_csv(s, header=0, prefix="_X")


def test_read_table_equivalency_to_read_csv(all_parsers):
    # see gh-21948
    # As of 0.25.0, read_table is undeprecated