Raise error in read_csv when arguments header and prefix both are not None #31383

rushabh-v · 2020-01-28T09:40:27Z

closes pd.read_csv prefix parameter seems do not works #27394
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

simonjayhawkins

Thanks @rushabh-v. Can you add a whatsnew note? What's the current behaviour? Does this need a deprecation cycle?

pandas/tests/io/parser/test_header.py

rushabh-v · 2020-01-28T15:08:26Z

All checks successful.
No, in my view it doesn't need a deprecation cycle.

And I have made all the requested changes @simonjayhawkins

simonjayhawkins · 2020-01-28T15:42:37Z

No, it doesn't need a deprecation cycle.

IIUC, prefix is currently ignored if header is passed.

>>> from io import StringIO
>>> s = StringIO("0,1\n2,3")
>>>
>>> import pandas as pd
>>> pd.__version__
'1.0.0rc0+228.g4edcc5541'
>>>
>>> pd.read_csv(s, header=0, prefix="_X")
   0  1
0  2  3
>>>

This would now raise a ValueError.

IMO this is a breaking change. Let's see what others think.
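A minimal sketch of the check under discussion, in plain Python with no pandas dependency. The function name `check_header_prefix` is hypothetical and the message wording is an assumption, not taken from the pandas source; the point is only that the combination from the session above stops being silently accepted.

```python
def check_header_prefix(header, prefix):
    """Reject the combination that was previously accepted silently.

    Hypothetical helper: the name and the message text are illustrative.
    """
    if header is not None and prefix is not None:
        raise ValueError("prefix must be None when header is not None")


# The call from the session above, reduced to the two arguments involved.
try:
    check_header_prefix(header=0, prefix="_X")
except ValueError as err:
    outcome = f"raised: {err}"
else:
    outcome = "accepted"

print(outcome)
```

Code that passed both arguments and relied on prefix being ignored would start failing, which is why the question of a breaking change comes up.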

rushabh-v · 2020-01-28T15:54:55Z

IMO this is a breaking change

oh, I think you are right.

WillAyd

lgtm. The intention here is to raise a ValueError (it's a bug, and silently ignoring the argument is confusing)

@TomAugspurger

pandas/io/parsers.py

WillAyd

lgtm @TomAugspurger

TomAugspurger

Changes look OK, though the test is likely in the wrong file. Should be in pandas/tests/io/parser/test_common.py probably.

TomAugspurger · 2020-01-31T22:48:11Z

pandas/tests/frame/test_to_csv.py

@@ -575,6 +575,13 @@ def test_to_csv_headers(self):
            recons.reset_index(inplace=True)
            tm.assert_frame_equal(to_df, recons)

+    def test_to_csv_raises_on_header_prefix(self):


Is this testing to_csv or read_csv? Looks like it's in the to_csv file, but is testing read_csv?

It's testing read_csv; the name is a mistake.

I actually tried putting this test in pandas/tests/io/parser/test_common.py first. The problem is that all the tests there use io.parsers.read_csv, and io.parsers.read_csv doesn't seem to go through CParserWrapper, so I put the test here instead. I am now moving it back to test_common.py and will import pandas.read_csv there.

Can you take a look, please?

jreback · 2020-02-01T15:01:32Z

pandas/tests/io/parser/test_common.py

@@ -2040,6 +2040,14 @@ def test_read_csv_memory_growth_chunksize(all_parsers):
            pass


+def test_read_csv_raises_on_header_prefix():


Can you use the all_parsers fixture here?

I first tried to do that in c053a8f, but it looks like parsers.read_csv is not going through CParserWrapper, which is what we want to test.

Sure it is. What exactly was failing? This is the standard pattern for parser tests.

The test checks whether the error is raised, and the error is raised from CParserWrapper, so the whole test fails here.

You need to apply your fix to all engines. It seems you applied it only to CParserWrapper.

I think the fix should actually be in ParserBase so that all engines get the fix.

c053a8f is the correct test.
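To illustrate why the check belongs in the shared base class, here is a small sketch in plain Python. The class names (`ParserBaseSketch`, `CEngineSketch`, `PythonEngineSketch`) and the message text are hypothetical stand-ins, not the pandas classes; the sketch only shows that validation placed in the base constructor is inherited by every engine, so a test that runs against all engines sees the same error.

```python
class ParserBaseSketch:
    """Hypothetical stand-in for a shared parser base class."""

    def __init__(self, header=None, prefix=None):
        # Validation lives here so that every engine subclass inherits it.
        if header is not None and prefix is not None:
            raise ValueError("prefix must be None when header is not None")
        self.header = header
        self.prefix = prefix


class CEngineSketch(ParserBaseSketch):
    """Hypothetical C-engine wrapper; adds nothing for this illustration."""


class PythonEngineSketch(ParserBaseSketch):
    """Hypothetical Python-engine parser; adds nothing for this illustration."""


def engines_that_raise(header, prefix):
    """Return the names of the engine classes that reject the arguments."""
    raised = []
    for engine in (CEngineSketch, PythonEngineSketch):
        try:
            engine(header=header, prefix=prefix)
        except ValueError:
            raised.append(engine.__name__)
    return raised


print(engines_that_raise(header=0, prefix="_X"))
print(engines_that_raise(header=None, prefix="_X"))
```

Had the check been placed in only one subclass, the first list would contain a single name, which is the situation the failing all-engines test exposed.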

gfyoung

@rushabh-v : You're on the right track, but I think you tried too hard to adjust the tests to your fixes, when it really should be the other way around. See my earlier comments.

pep8speaks · 2020-02-02T05:37:38Z

Hello @rushabh-v! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-02-03 09:25:48 UTC

pandas/tests/io/parser/test_common.py

pandas/io/parsers.py

gfyoung

These changes look good. Just want to make sure that tests pass.

If they fail, we can update those accordingly.

rushabh-v · 2020-02-02T08:16:14Z

@gfyoung Two tests are failing, and I don't see any obvious connection between them and this PR.
Can you take a look, please? https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=27452&view=logs&j=a3a13ea8-7cf0-5bdb-71bb-6ac8830ae35c&t=add65f64-6c25-5783-8fd6-d9aa1b63d9d4&l=119

gfyoung · 2020-02-02T09:13:49Z

Indeed, those test failures look unrelated. I just restarted the build.

rushabh-v · 2020-02-02T09:42:24Z

Indeed, those test failures look unrelated. I just restarted the build.

same test failure again.

gfyoung · 2020-02-02T09:48:01Z

You may need to rebase or merge. Unclear why these warnings are popping up.

cc @pandas-dev/pandas-core : is there something with matplotlib that changed recently?

rushabh-v · 2020-02-02T10:41:04Z

Fails after merge too

rushabh-v · 2020-02-03T07:55:16Z

All green now @gfyoung

doc/source/whatsnew/v1.1.0.rst

pandas/tests/io/parser/test_common.py

gfyoung · 2020-02-03T15:52:36Z

Thanks @rushabh-v !

Pandas 1.1 has broken Records Mover's usage of the read_csv() function by adding error checking in cases where a certain argument would be unused.

Details of the Pandas change:
* pandas-dev/pandas#27394
* pandas-dev/pandas#31383

See Records Mover test failures here:
* https://app.circleci.com/pipelines/github/bluelabsio/records-mover/1089/workflows/e62f1cf0-f8d0-4e22-9652-112df72b02b8/jobs/9439
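One way a caller could adapt to the stricter check is to stop forwarding prefix whenever a header row is requested. The sketch below is an assumption about how such an adjustment might look, not Records Mover's actual fix; the helper name `read_csv_kwargs` is hypothetical.

```python
def read_csv_kwargs(header, prefix):
    """Build keyword arguments, dropping prefix when a header row is requested.

    Hypothetical helper for illustration only.
    """
    kwargs = {"header": header}
    # prefix only has an effect when there is no header row, so forward it
    # only in that case.
    if prefix is not None and header is None:
        kwargs["prefix"] = prefix
    return kwargs


print(read_csv_kwargs(header=0, prefix="_X"))     # {'header': 0}
print(read_csv_kwargs(header=None, prefix="_X"))  # {'header': None, 'prefix': '_X'}
```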

rushabh-v added 3 commits January 28, 2020 15:00

add error msg and test

c053a8f

run black pandas

071ba5b

remove extra quotes

ed8278e

simonjayhawkins added the Error Reporting (incorrect or improved errors from pandas) and IO CSV (read_csv, to_csv) labels Jan 28, 2020

simonjayhawkins reviewed Jan 28, 2020

pandas/tests/io/parser/test_header.py (outdated)

pandas/tests/io/parser/test_header.py (outdated)

rushabh-v added 3 commits January 28, 2020 19:35

change location of the test

7231f52

add whatsnew

a5fb107

add gh

13d8644

rushabh-v requested review from jreback, WillAyd and TomAugspurger January 28, 2020 15:58

WillAyd approved these changes Jan 31, 2020

WillAyd requested changes Jan 31, 2020

pandas/io/parsers.py (outdated)

change position of the raise

a003c51

rushabh-v force-pushed the pre-head branch from 089079b to a003c51 Compare January 31, 2020 17:34

WillAyd approved these changes Jan 31, 2020

TomAugspurger reviewed Jan 31, 2020

rushabh-v added 2 commits February 1, 2020 13:10

change the location of the test

77c8ee0

sort imports

39cd6d5

rushabh-v force-pushed the pre-head branch from 7385d94 to 39cd6d5 Compare February 1, 2020 10:51

use isort

a2cb6a4

jreback requested changes Feb 1, 2020

jreback requested a review from gfyoung February 1, 2020 15:01

gfyoung requested changes Feb 2, 2020

add the check in ParserBase

204dbc2

add a None constraint

1bc350b

gfyoung reviewed Feb 2, 2020

pandas/tests/io/parser/test_common.py (outdated)

gfyoung reviewed Feb 2, 2020

pandas/tests/io/parser/test_common.py (outdated)

remove import read_csv

a03d5b5

gfyoung reviewed Feb 2, 2020

pandas/io/parsers.py

make requested changes

cc1f6f8

gfyoung reviewed Feb 2, 2020

Merge branch 'master' of https://github.com/pandas-dev/pandas into pr…

5fc40bd

…e-head

gfyoung reviewed Feb 3, 2020

doc/source/whatsnew/v1.1.0.rst (outdated)

gfyoung reviewed Feb 3, 2020

pandas/tests/io/parser/test_common.py

make requested changes

d962941

rushabh-v force-pushed the pre-head branch from 9451999 to d962941 Compare February 3, 2020 09:24

Merge branch 'master' into pre-head

1f35f6c

rushabh-v requested a review from gfyoung February 3, 2020 11:07

gfyoung approved these changes Feb 3, 2020

gfyoung merged commit a2721fd into pandas-dev:master Feb 3, 2020

gfyoung added this to the 1.1 milestone Feb 3, 2020

rushabh-v deleted the pre-head branch February 3, 2020 16:13

vinceatbluelabs mentioned this pull request Aug 11, 2020

Adjust to Pandas breaking change, bump Python version bluelabsio/records-mover#101

Merged

malinkallen mentioned this pull request Jan 12, 2021

Raise an error in read_csv when names and prefix both are not None #39123

Closed
