BUG: Fix file descriptor leak #32598

roberthdevries · 2020-03-10T21:47:38Z

closes #31488 (Unclosed file on EmptyDataError)
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry
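
For context, the kind of leak described in the linked issue can be sketched in a few lines. This is a simplified illustration and not the pandas code: `LineReader`, `EmptyFileError`, `header_or_none`, and the message text are hypothetical stand-ins. The idea is that a reader which opens its file in the constructor has to close it again if header parsing fails, because the caller never receives an object to call `close()` on.

```python
import os
import tempfile


class EmptyFileError(ValueError):
    """Hypothetical stand-in for an 'empty data' parser error."""


class LineReader:
    """Toy reader that opens its file in the constructor (illustration only)."""

    def __init__(self, path):
        self.handle = open(path)
        try:
            self.header = self._read_header()
        except (TypeError, ValueError):
            # Without this close(), the handle would stay open after the
            # exception, since the caller never gets a reader to close.
            self.close()
            raise

    def _read_header(self):
        line = self.handle.readline()
        if not line:
            raise EmptyFileError("no columns to parse")
        return line.rstrip("\n").split(",")

    def close(self):
        self.handle.close()


def header_or_none(path):
    """Return the header fields, or None if the file is empty."""
    try:
        reader = LineReader(path)
    except EmptyFileError:
        return None
    try:
        return reader.header
    finally:
        reader.close()


# Usage: an empty file yields None and leaves no handle open.
fd, empty_path = tempfile.mkstemp(suffix=".csv")
os.close(fd)
print(header_or_none(empty_path))
os.remove(empty_path)
```

Catching a narrow set of exception types and re-raising keeps the original error visible to the caller while still releasing the handle.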

gfyoung · 2020-03-10T22:06:27Z

pandas/tests/io/parser/test_python_parser_only.py

@@ -314,3 +316,15 @@ def test_malformed_skipfooter(python_parser_only):
    msg = "Expected 3 fields in line 4, saw 5"
    with pytest.raises(ParserError, match=msg):
        parser.read_csv(StringIO(data), header=1, comment="#", skipfooter=1)
+
+
+def test_file_descriptor_leak(python_parser_only):


Can't we also test this against the C parser (and use the all_parsers fixture instead)?

No problem, done.

gfyoung · 2020-03-10T22:07:14Z

pandas/tests/io/parser/test_python_parser_only.py

+def test_file_descriptor_leak(python_parser_only):
+    # GH 31488
+    parser = python_parser_only
+    with open("empty.csv", "w"):


Can we try using our ensure_clean utility instead?
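
The motivation for a helper like this is that the test should not leave `empty.csv` behind in the working directory. `ensure_clean` is pandas' own testing utility and its implementation is not shown in this thread; the following is only a rough standard-library sketch of the same idea, with `temp_path` as a hypothetical name.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_path(suffix=""):
    """Yield the path of a fresh, empty temporary file and delete it on exit.

    Hypothetical helper for illustration; not pandas' ensure_clean.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # close our own descriptor so the helper itself leaks nothing
    try:
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)


# Usage: the file exists only for the duration of the block.
with temp_path(".csv") as path:
    with open(path, "w"):
        pass  # leave the file empty, as the test in this PR does
```

Cleanup runs in a `finally` clause, so the file is removed even when the body of the `with` block raises.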

gfyoung · 2020-03-10T22:08:54Z

pandas/tests/io/parser/test_python_parser_only.py

+    proc = psutil.Process()
+    with pytest.raises(EmptyDataError):
+        parser.read_csv("empty.csv")
+    assert not proc.open_files()


I would check that the list of open files does not contain our file, not that it's empty (more vulnerable to failure if the testing environment happens to have other files open unrelated to our test).

I have now changed the test to a before/after check that triggers when the results differ.
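
A before/after check of this kind might look roughly like the sketch below. The helper names `assert_no_new_open_files` and `psutil_snapshot` are hypothetical and this is not the code from the PR; the only psutil call used is `psutil.Process().open_files()`, the same one the test above relies on. The snapshot function is passed in so that the comparison logic does not depend on psutil being installed.

```python
from contextlib import contextmanager


@contextmanager
def assert_no_new_open_files(snapshot):
    """Fail if files are open after the block that were not open before it.

    `snapshot` is any zero-argument callable returning an iterable of paths.
    """
    before = set(snapshot())
    yield
    leaked = set(snapshot()) - before
    assert not leaked, f"files left open: {sorted(leaked)}"


def psutil_snapshot():
    """Snapshot based on psutil (third-party; assumed to be installed)."""
    import psutil

    return [f.path for f in psutil.Process().open_files()]
```

In a test, the `pytest.raises` block would sit inside `with assert_no_new_open_files(psutil_snapshot):`. Because only the difference between the two snapshots is inspected, files that the test environment already had open do not cause a failure.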

gfyoung · 2020-03-10T22:09:15Z

pandas/tests/io/parser/test_python_parser_only.py

+        pass
+
+    proc = psutil.Process()
+    with pytest.raises(EmptyDataError):


Check the error message as well.

WillAyd

Thanks! Looks good; just some standardization of the test that could be improved.

pandas/tests/io/parser/test_common.py

pandas/io/parsers.py

jreback · 2020-03-14T15:50:39Z

pandas/tests/io/parser/test_common.py

+    # GH 31488
+    import psutil
+
+    proc = psutil.Process()


Can you add a comment here, very similar to what you posted above (e.g. why psutil's open files are used for the check)?

jreback · 2020-03-15T00:37:46Z

thanks @roberthdevries

gfyoung added the API - Consistency (Internal Consistency of API/Behavior), Bug, and IO CSV (read_csv, to_csv) labels and removed the API - Consistency (Internal Consistency of API/Behavior) label on Mar 10, 2020

gfyoung reviewed Mar 10, 2020

datapythonista changed the title from "Fix file descriptor leak" to "BUG: Fix file descriptor leak" on Mar 11, 2020

roberthdevries force-pushed the fix-31488-read_csv-file-leak-on-EmptyDataError branch 2 times, most recently from 0e5012f to 830ed1d, March 12, 2020 21:41

WillAyd requested changes Mar 13, 2020

roberthdevries requested a review from WillAyd March 14, 2020 11:22

jreback requested changes Mar 14, 2020

jreback added this to the 1.1 milestone Mar 14, 2020

roberthdevries force-pushed the fix-31488-read_csv-file-leak-on-EmptyDataError branch from 830ed1d to bc238a0, March 14, 2020 16:44

WillAyd requested changes Mar 14, 2020

roberthdevries requested review from WillAyd and jreback March 14, 2020 22:17

roberthdevries added 11 commits March 14, 2020 23:51

Fix file descriptor leak

1cd0704

Add whatsnew entry

8454c3f

Use ensure_clean

3bc072c

Add the expected exception message

bf96995

isort fixes

e23f02e

Made the test non-python parser specific

c284787

import psutil fails on several travis runs due to missing psutil

e7f07df

Fix import order

2203c71

Fix for black formatter

996d474

Add TypeError and ValueError to the list of caught exceptions

8fa751e

Also add a comment why combining tm.ensure_clean and td.check_file_leaks don't play along nicely together

Use td.check_file_leaks() in a way that works in combination with tm.…

d08bc10

…ensure_clean

roberthdevries force-pushed the fix-31488-read_csv-file-leak-on-EmptyDataError branch from bc238a0 to d08bc10, March 14, 2020 22:59

jreback approved these changes Mar 15, 2020

jreback merged commit 2e114ce into pandas-dev:master Mar 15, 2020

roberthdevries deleted the fix-31488-read_csv-file-leak-on-EmptyDataError branch March 15, 2020 07:14

SeeminSyed pushed a commit to CSCD01-team01/pandas that referenced this pull request Mar 22, 2020

BUG: Fix file descriptor leak (pandas-dev#32598)

d45f5af
