BUG: Help python csv engine read binary buffers #27925

fiendish · 2019-08-15T04:15:34Z

The file buffer given to read_csv could have been opened in
binary mode, but the python csv reader errors on binary buffers.
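
As a hedged illustration of the underlying behavior (standard library only, not the pandas code touched by this PR): `csv.reader` expects an iterator of `str`, so a binary buffer is rejected, while the same bytes wrapped in `io.TextIOWrapper` parse normally. The sample data and variable names below are made up for the sketch.

```python
import csv
import io

data = b"a,b\n1,2\n"  # hypothetical sample CSV content

# csv.reader expects each line to be a str. Iterating a binary buffer
# yields bytes, which the csv module rejects with csv.Error.
error_message = None
try:
    rows_from_bytes = list(csv.reader(io.BytesIO(data)))
except csv.Error as exc:
    rows_from_bytes = None
    error_message = str(exc)

# Decoding through TextIOWrapper gives the reader str lines instead.
# newline="" leaves line endings untouched, as the csv docs recommend.
text_buffer = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline="")
rows_from_text = list(csv.reader(text_buffer))

print(error_message)
print(rows_from_text)
```

The pure-Python parsing path builds on the `csv` module, which is why a buffer opened in binary mode has to be decoded before it reaches the reader.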

- closes #23779 (read_csv c engine accepts binary mode data and python engine rejects it)
- tests added / passed
- passes black pandas
- passes git diff upstream/master -u -- "*.py" | flake8 --diff
- whatsnew entry

fiendish · 2019-08-15T04:55:38Z

What does Linux py37_np_dev know that I don't?

jreback · 2019-08-15T12:56:14Z

doc/source/whatsnew/v0.25.1.rst

@@ -105,7 +105,7 @@ I/O
 ^^^

 - Avoid calling ``S3File.s3`` when reading parquet, as this was removed in s3fs version 0.3.0 (:issue:`27756`)
-
+- read_csv now accepts binary mode file buffers when using the Python csv engine (:issue:`23779`)


use :meth:`read_csv`, and move this note to the 1.0 whatsnew

jreback · 2019-08-15T12:56:44Z

pandas/tests/io/parser/test_python_parser_only.py

@@ -296,3 +296,10 @@ def test_malformed_skipfooter(python_parser_only):
    msg = "Expected 3 fields in line 4, saw 5"
    with pytest.raises(ParserError, match=msg):
        parser.read_csv(StringIO(data), header=1, comment="#", skipfooter=1)
+
+
+def test_binary_buffer(python_parser_only, csv1):


run this for both csv engines

jreback · 2019-08-15T12:56:57Z

pandas/tests/io/parser/test_python_parser_only.py

+    # see gh-23779
+    parser = python_parser_only
+    with open(csv1, "rb") as f:
+        parser.read_csv(f)


assert the result contents

move this test to parser/test_common.py; see if you can co-locate it with the other buffer-type tests.

test both csv engines, assert equality between ascii and binary modes, colocate with other buffer tests
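
A minimal sketch of what the requested test could look like, assuming a pandas version that includes this fix (the reviewer asks for the note to go into 1.0). It is not the test that was merged; the file contents, variable names, and the use of a temporary file instead of the `csv1` fixture are assumptions made for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data written to a temporary file (stand-in for csv1).
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline=""
) as tmp:
    tmp.write("a,b\n1,2\n3,4\n")
    path = tmp.name

results = {}
try:
    for engine in ("c", "python"):
        # Baseline: let pandas open the file itself from the path.
        expected = pd.read_csv(path, engine=engine)

        # Same file passed as an already-open binary buffer.
        with open(path, "rb") as f:
            from_binary = pd.read_csv(f, engine=engine)

        # Same file passed as an already-open text buffer.
        with open(path, "r", newline="") as f:
            from_text = pd.read_csv(f, engine=engine)

        # Assert the contents, not just the absence of an exception.
        pd.testing.assert_frame_equal(from_binary, expected)
        pd.testing.assert_frame_equal(from_text, expected)
        results[engine] = from_binary
finally:
    os.remove(path)
```

Comparing against a read from the path keeps the check independent of how the buffer was opened, which is the point of the review comments above.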

fiendish · 2019-08-15T16:45:46Z

Made the requested changes. Do you want a rebase to squash?

TomAugspurger

No need to squash.

fiendish · 2019-08-15T18:10:18Z

Is it normal for random unrelated errors to fail builds?

TomAugspurger · 2019-08-15T18:14:27Z

Yes, we're debugging it still. Restarted azure.

jreback · 2019-08-16T12:20:52Z

pandas/io/common.py

-        f = TextIOWrapper(f, encoding=encoding, newline="")
-        handles.append(f)
+        g = TextIOWrapper(f, encoding=encoding, newline="")
+        if not isinstance(f, no_close):


this is hard to follow, can you just use BufferedIOBase here rather than no_close?

fiendish · 2019-08-16T15:28:30Z

> this is hard to follow, can you just use BufferedIOBase here rather than no_close?

I figured this way was more general in case there were future issues discovered beyond just BufferedIOBase, but consider it done.
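
For context on why the wrapper needs care here, a hedged standard-library sketch (not the code in pandas/io/common.py): closing an `io.TextIOWrapper` also closes the binary buffer it wraps, so a caller-supplied buffer would be closed along with it, whereas `detach()` releases the buffer without closing it. It also shows that `io.BytesIO` passes an `isinstance` check against `io.BufferedIOBase`. Variable names and data are illustrative.

```python
import io

payload = b"a,b\n1,2\n"  # hypothetical sample data

# BytesIO (like the object returned by open(..., "rb")) is a BufferedIOBase.
raw = io.BytesIO(payload)
is_buffered = isinstance(raw, io.BufferedIOBase)

# Closing the text wrapper closes the underlying binary buffer too.
wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline="")
text = wrapper.read()
wrapper.close()
closed_after_wrapper_close = raw.closed

# Detaching instead leaves the caller's buffer open and reusable.
raw2 = io.BytesIO(payload)
wrapper2 = io.TextIOWrapper(raw2, encoding="utf-8", newline="")
text2 = wrapper2.read()
wrapper2.detach()
still_open = not raw2.closed

print(is_buffered, closed_after_wrapper_close, still_open)
```

Checking against `BufferedIOBase` is narrower than a custom tuple of types such as `no_close`, but it is easier to read and covers both `BytesIO` and files opened in binary mode.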

TomAugspurger · 2019-08-19T16:23:34Z

Thanks @fiendish!

* BUG: Help python csv engine read binary buffers The file buffer given to read_csv could have been opened in binary mode, but the python csv reader errors on binary buffers. closes pandas-dev#23779

fiendish added 2 commits August 15, 2019 00:03

BUG: Help python csv engine read binary buffers

5e3c3e3

The file buffer given to read_csv could have been opened in binary mode, but the python csv reader errors on binary buffers. closes #23779

whatsnew entry

0ac1bce

jreback added the IO CSV (read_csv, to_csv) label Aug 15, 2019

jreback requested changes Aug 15, 2019

fiendish added 4 commits August 15, 2019 12:04

satisfy gh-14418 for binary mode files

7e270d2

update tests per requested changes

b3b9d2f

test both csv engines, assert equality between ascii and binary modes, colocate with other buffer tests

move whatsnew note to 1.0

0a60af2

black formatting

9d73aaa

TomAugspurger reviewed Aug 15, 2019

pandas/io/common.py (outdated, resolved)

fiendish added 2 commits August 15, 2019 13:30

BytesIO is a subclass of BufferedIOBase

03bceb1

black formatting

ab0878d

jreback requested changes Aug 16, 2019

fiendish added 2 commits August 16, 2019 12:03

remove no_close and compare with non-open read

d662532

Merge branch 'master' into patch-1

01741dc

TomAugspurger approved these changes Aug 19, 2019

TomAugspurger merged commit 6e0ab71 into pandas-dev:master Aug 19, 2019
