Skip to content

BUG: windows with TemporaryFile an read_csv #13398 #13481

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.18.2.txt
Original file line number Diff line number Diff line change
Expand Up @@ -493,6 +493,7 @@ Bug Fixes
- Bug in ``pd.read_csv()`` in which the ``nrows`` argument was not properly validated for both engines (:issue:`10476`)
- Bug in ``pd.read_csv()`` with ``engine='python'`` in which infinities of mixed-case forms were not being interpreted properly (:issue:`13274`)
- Bug in ``pd.read_csv()`` with ``engine='python'`` in which trailing ``NaN`` values were not being parsed (:issue:`13320`)
- Bug in ``pd.read_csv()`` with ``engine='python'`` when reading from a tempfile.TemporaryFile on Windows with Python 3, separator expressed as a regex (:issue:`13398`)
- Bug in ``pd.read_csv()`` that prevents ``usecols`` kwarg from accepting single-byte unicode strings (:issue:`13219`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some grammar issues in that sentence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, changed

- Bug in ``pd.read_csv()`` that prevents ``usecols`` from being an empty set (:issue:`13402`)
- Bug in ``pd.read_csv()`` with ``engine=='c'`` in which null ``quotechar`` was not accepted even though ``quoting`` was specified as ``None`` (:issue:`13411`)
Expand Down
2 changes: 1 addition & 1 deletion pandas/io/parsers.py
Original file line number Diff line number Diff line change
Expand Up @@ -1868,7 +1868,7 @@ class MyDialect(csv.Dialect):

else:
def _read():
line = next(f)
line = f.readline()
pat = re.compile(sep)
yield pat.split(line.strip())
for line in f:
Expand Down
14 changes: 14 additions & 0 deletions pandas/io/tests/parser/python_parser_only.py
Original file line number Diff line number Diff line change
Expand Up @@ -171,3 +171,17 @@ def test_read_table_buglet_4x_multiindex(self):
columns=list('abcABC'), index=list('abc'))
actual = self.read_table(StringIO(data), sep='\s+')
tm.assert_frame_equal(actual, expected)

def test_temporary_file(self):
# GH13398
data1 = "0 0"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to do this? What's wrong with data = "0 0"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing. It's late Sunday evening with a crappy game on tv :p

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Join the club! 😄

from tempfile import TemporaryFile
new_file = TemporaryFile("w+")
new_file.write(data1)
new_file.flush()
new_file.seek(0)

result = self.read_csv(new_file, sep=r"\s*", header=None)
expected = DataFrame([[0, 0]])
tm.assert_frame_equal(result, expected)