Skip to content

BUG: Fix file descriptor leak #32598

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

roberthdevries
Copy link
Contributor

@gfyoung gfyoung added API - Consistency Internal Consistency of API/Behavior Bug IO CSV read_csv, to_csv and removed API - Consistency Internal Consistency of API/Behavior labels Mar 10, 2020
@@ -314,3 +316,15 @@ def test_malformed_skipfooter(python_parser_only):
msg = "Expected 3 fields in line 4, saw 5"
with pytest.raises(ParserError, match=msg):
parser.read_csv(StringIO(data), header=1, comment="#", skipfooter=1)


def test_file_descriptor_leak(python_parser_only):
Copy link
Member

@gfyoung gfyoung Mar 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we also test this against the C parser (and use the all_parsers fixture instead)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, done.

def test_file_descriptor_leak(python_parser_only):
# GH 31488
parser = python_parser_only
with open("empty.csv", "w"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we try using our ensure_clean utility instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

proc = psutil.Process()
with pytest.raises(EmptyDataError):
parser.read_csv("empty.csv")
assert not proc.open_files()
Copy link
Member

@gfyoung gfyoung Mar 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would check that the list of open files does not contain our file, not that it's empty (more vulnerable to failure if the testing environment happens to have other files open unrelated to our test).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now changed the test to a before/after check that triggers when the results differ.

pass

proc = psutil.Process()
with pytest.raises(EmptyDataError):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the error message as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@datapythonista datapythonista changed the title Fix file descriptor leak BUG: Fix file descriptor leak Mar 11, 2020
@roberthdevries roberthdevries force-pushed the fix-31488-read_csv-file-leak-on-EmptyDataError branch 2 times, most recently from 0e5012f to 830ed1d Compare March 12, 2020 21:41
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good just some standardization on the test that can be improved

@roberthdevries roberthdevries requested a review from WillAyd March 14, 2020 11:22
# GH 31488
import psutil

proc = psutil.Process()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment here very similar to what you posted above (e.g. why using psutil open files to check)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@jreback jreback added this to the 1.1 milestone Mar 14, 2020
@roberthdevries roberthdevries force-pushed the fix-31488-read_csv-file-leak-on-EmptyDataError branch from 830ed1d to bc238a0 Compare March 14, 2020 16:44
@roberthdevries roberthdevries force-pushed the fix-31488-read_csv-file-leak-on-EmptyDataError branch from bc238a0 to d08bc10 Compare March 14, 2020 22:59
@jreback jreback merged commit 2e114ce into pandas-dev:master Mar 15, 2020
@jreback
Copy link
Contributor

jreback commented Mar 15, 2020

thanks @roberthdevries

@roberthdevries roberthdevries deleted the fix-31488-read_csv-file-leak-on-EmptyDataError branch March 15, 2020 07:14
SeeminSyed pushed a commit to CSCD01-team01/pandas that referenced this pull request Mar 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unclosed file on EmptyDataError
4 participants