Raise an error in read_csv when names and prefix both are not None #39123

Closed
malinkallen opened this issue Jan 12, 2021 · 0 comments · Fixed by #41446
Labels
Enhancement Error Reporting Incorrect or improved errors from pandas IO CSV read_csv, to_csv

@malinkallen

Problem description

As a result of the discussion in issue #27394, read_csv was changed so that it raises an error when both header and prefix are different from None. A user had misunderstood how to (not) use header and prefix together. I think that the usage of names and prefix can be misunderstood in a similar way.

It could also be that a user accidentally provides both arguments and expects prefix to take effect. Right now, it seems that prefix is silently ignored when names is provided.

Describe the solution you'd like

Raise a ValueError when the read_csv arguments prefix and names both differ from None, in accordance with issue #27394 and pull request #31383.
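
A minimal sketch of what such a check could look like. The helper name `check_prefix_conflicts` and the wording of the second error message are assumptions made for illustration; this is not the actual pandas implementation. The first message is copied from the existing header/prefix check shown in the traceback under "Additional context".

```python
def check_prefix_conflicts(names=None, prefix=None, header=None):
    """Hypothetical stand-in for the argument validation done by the parser."""
    if prefix is not None and header is not None:
        # Existing behavior: message taken from the traceback in this issue.
        raise ValueError(
            "Argument prefix must be None if argument header is not None"
        )
    if prefix is not None and names is not None:
        # Proposed behavior: the message wording is an assumption.
        raise ValueError(
            "Argument prefix must be None if argument names is not None"
        )


# Passing only one of the two arguments is still accepted.
check_prefix_conflicts(names=["a", "b", "c"])
check_prefix_conflicts(prefix="XZ")
```

With this check in place, the combination of prefix="XZ" and names=["a", "b", "c"] would fail loudly instead of silently dropping prefix.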

API breaking implications

This will "break" code that passes values (!=None) for both prefix and names, but since it was an accepted solution for issue #27394, I think it could be used here as well.

Describe alternatives you've considered

Another possibility is to issue a warning instead of an error, but that would be inconsistent with the behavior when prefix and header are both not None.
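
For comparison, the warning-based alternative could be sketched as follows. The helper name `warn_on_ignored_prefix`, the warning category, and the message text are all assumptions for illustration, not existing pandas code.

```python
import warnings


def warn_on_ignored_prefix(names=None, prefix=None):
    """Hypothetical helper: warn instead of raising when both are given."""
    if prefix is not None and names is not None:
        # The category and wording are assumptions; stacklevel=2 points the
        # warning at the caller rather than at this helper.
        warnings.warn(
            "prefix is ignored when names is provided",
            UserWarning,
            stacklevel=2,
        )
```

The call would still succeed and prefix would still be ignored, so the inconsistency with the header/prefix case remains.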

Additional context

Examples below are run in pandas version 1.3.0.dev0+210.g9f1a41dee.

When I run

pandas.read_csv("my_data.csv", prefix="XZ", header=0)

I get the following output:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 605, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 457, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 814, in __init__
    self._engine = self._make_engine(self.engine)
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 1045, in _make_engine
    return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 1853, in __init__
    ParserBase.__init__(self, kwds)
  File "/home/malin/workspace/py/pandas/pandas/io/parsers.py", line 1334, in __init__
    raise ValueError(
ValueError: Argument prefix must be None if argument header is not None

but running

pandas.read_csv("my_data.csv", prefix="XZ", names=["a", "b", "c"])

works fine, except that prefix is silently ignored.

@malinkallen malinkallen added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 12, 2021
@alimcmaster1 alimcmaster1 added IO CSV read_csv, to_csv Error Reporting Incorrect or improved errors from pandas and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Jan 14, 2021
@jreback jreback added this to the 1.3 milestone May 14, 2021