BUG: Windows with TemporaryFile and read_csv #13398 #13481

Closed
wants to merge 11 commits into from
2 changes: 1 addition & 1 deletion pandas/io/parsers.py
@@ -1868,7 +1868,7 @@ class MyDialect(csv.Dialect):

         else:
             def _read():
-                line = next(f)
+                line = f.readline()
                 pat = re.compile(sep)
                 yield pat.split(line.strip())
                 for line in f:
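
The PR does not spell out why `next(f)` fails. One plausible reading, assumed here, is that the object returned by `TemporaryFile` on Windows is a wrapper that supports `readline()` and iteration but is not itself an iterator. The sketch below reproduces that situation with a hypothetical `FileWrapper` class (not part of pandas or `tempfile`) so that it behaves the same on every platform:

```python
import io


class FileWrapper:
    """Hypothetical stand-in for a wrapper that delegates to a real file object."""

    def __init__(self, file):
        self.file = file

    def __getattr__(self, name):
        # Only called for attributes not found normally, e.g. readline.
        return getattr(self.file, name)

    def __iter__(self):
        # Iterable, but not an iterator: the class defines no __next__.
        for line in self.file:
            yield line


wrapped = FileWrapper(io.StringIO("a b\n1 2\n"))

# next() looks for __next__ on the type, so delegation via __getattr__ does not help.
try:
    next(wrapped)
    next_failed = False
except TypeError:
    next_failed = True

# readline() is found through __getattr__, and the for-loop uses __iter__.
first = wrapped.readline()
rest = [line for line in wrapped]
```

Under that assumption, swapping `next(f)` for `f.readline()` keeps the first-line handling the same while relying only on methods the wrapper actually exposes.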
22 changes: 22 additions & 0 deletions pandas/io/tests/parser/python_parser_only.py
@@ -171,3 +171,25 @@ def test_read_table_buglet_4x_multiindex(self):
             columns=list('abcABC'), index=list('abc'))
         actual = self.read_table(StringIO(data), sep='\s+')
         tm.assert_frame_equal(actual, expected)
+
+    def test_temporary_file(self):
+        # GH13398
+        data1 = """index,A,B,C,D
+foo,2,3,4,5
+bar,7,8,9,10
+baz,12,13,14,15
+qux,12,13,14,15
+foo2,12,13,14,15
+bar2,12,13,14,15
+"""
+        data2 = data1.replace(",", " ")

Member

Why do we need to do this? What's wrong with data = "0 0"?

Contributor Author

Nothing. It's late Sunday evening with a crappy game on tv :p

Member

Join the club! 😄

+        from tempfile import TemporaryFile
+        new_file = TemporaryFile("w+")
+        new_file.write(data2)
+        new_file.flush()
+        new_file.seek(0)
+
+        result = self.read_csv(new_file, sep=r"\s+", engine="python")

Member

You don't need to specify the engine here because it's already specified as Python only. General rule of thumb for these tests: you shouldn't need to explicitly specify the engine.
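
A minimal sketch of the pattern behind this rule of thumb, assuming the engine is injected by a per-engine harness class that wraps `read_csv`. All names here (`read_csv_stub`, `PythonParserOnlyTests`, `PythonEngineHarness`) are hypothetical and only illustrate the idea; they are not the actual pandas test classes:

```python
def read_csv_stub(source, **kwargs):
    """Stand-in for a real reader; returns the keyword arguments it received."""
    return kwargs


class PythonParserOnlyTests:
    """Tests that only make sense for the Python engine."""

    def check_regex_separator(self):
        # No engine argument here: the harness supplies it.
        return self.read_csv("ignored", sep=r"\s+")


class PythonEngineHarness(PythonParserOnlyTests):
    engine = "python"

    def read_csv(self, *args, **kwargs):
        # Inject the engine once, so individual tests never repeat it.
        kwargs = dict(kwargs, engine=self.engine)
        return read_csv_stub(*args, **kwargs)


received = PythonEngineHarness().check_regex_separator()
print(received)
```

With this layout, a test that hard-codes `engine="python"` only duplicates what the harness already does.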

+        expected = self.read_csv(StringIO(data1))

Member

Better practice is to explicitly define your expected output by constructing the DataFrame itself. If you want, you can simplify your data input to make this construction easier to write out (does your data need to be that long for the error to be triggered - why not that nice simple data you used in your initial commit?)

Contributor Author

I was trying to be too clever: I reused the data from test_common and should have reverted it.
Should be good now.

+        tm.assert_frame_equal(result, expected)
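
Taking the review suggestions together (the simple `"0 0"` input and an explicitly constructed expected frame), the test could be reduced along the following lines. This is a standalone sketch, not the code that was merged: the function name `check_temporary_file` is hypothetical, `header=None` is an assumption needed because the one-line input has no header row, and `engine="python"` is spelled out only because the sketch runs outside the test suite. It uses the public `pd.testing.assert_frame_equal` in place of the suite's `tm` helper.

```python
from tempfile import TemporaryFile

import pandas as pd


def check_temporary_file():
    data = "0 0"
    with TemporaryFile("w+") as new_file:
        new_file.write(data)
        new_file.flush()
        new_file.seek(0)
        # A regex separator sends the Python engine through the code path touched by this PR.
        result = pd.read_csv(new_file, sep=r"\s+", header=None, engine="python")

    # Expected output is written out by hand rather than produced by another read_csv call.
    expected = pd.DataFrame([[0, 0]])
    pd.testing.assert_frame_equal(result, expected)
    return result


result = check_temporary_file()
```

Building `expected` directly keeps the test from passing by accident if `read_csv` were wrong in the same way on both sides of the comparison.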