BLD/IO: Elusive Travis ResourceWarning & closing files by default #22675

Closed
mroeschke opened this issue Sep 12, 2018 · 14 comments · Fixed by #22679
Labels
CI (Continuous Integration), Unreliable Test (Unit tests that occasionally fail)

Comments

@mroeschke
Member

mroeschke commented Sep 12, 2018

Every once in a while, Travis fails a test with a ResourceWarning:

____________________ TestPythonParser.test_no_header_prefix ____________________
[gw0] darwin -- Python 3.5.6 /Users/travis/miniconda3/envs/pandas/bin/python
self = <pandas.tests.io.parser.test_parsers.TestPythonParser object at 0x11d030da0>
    def test_no_header_prefix(self):
        data = """1,2,3,4,5
    6,7,8,9,10
    11,12,13,14,15
    """
        df_pref = self.read_table(StringIO(data), sep=',', prefix='Field',
>                                 header=None)
pandas/tests/io/parser/header.py:48:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pandas/tests/io/parser/test_parsers.py:111: in read_table
    df = read_table(*args, **kwds)
../../../miniconda3/envs/pandas/lib/python3.5/contextlib.py:66: in __exit__
    next(self.gen)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
E           AssertionError: Caused unexpected warning(s): [('ResourceWarning', ResourceWarning("unclosed file <_io.BufferedReader name='/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg80000gn/T/tmpp8reunuatest_file.zip'>",), '/Users/travis/miniconda3/envs/pandas/lib/python3.5/site-packages/py/_vendored_packages/apipkg.py', 146)].

And currently our file closing mechanism for parser functions is like so:

pandas/pandas/io/parsers.py

Lines 459 to 463 in 73dd6ec

if should_close:
    try:
        filepath_or_buffer.close()
    except:  # noqa: flake8
        pass

It seems pointless to have a generic try/except within an if statement (why have the if statement at all, then?). More broadly, should we attempt to close all file-like objects anyway, even when closing isn't strictly needed (like StringIO in this Travis case)? That might help alleviate this Travis error.
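To make the question concrete, here is a minimal sketch of a narrower version of that block. The helper name `close_if_owned` and the choice of caught exceptions are assumptions for illustration, not what pandas does or should necessarily do.

```python
def close_if_owned(handle, should_close):
    """Close `handle` only when the caller opened it.

    Returns True if the handle was closed here, False otherwise.
    Catching only OSError and ValueError is an assumption; a bare
    `except:` would also hide bugs such as AttributeError.
    """
    if not should_close:
        # A buffer passed in by the user stays open for them to reuse.
        return False
    try:
        handle.close()
    except (OSError, ValueError):
        return False
    return True
```

With this shape the `if` still carries meaning: it decides ownership, while the `try` only guards against a failing `close()`.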

Some examples:

@mroeschke changed the title from "BLD/IO: Elusive ResourceWarning & closing files by default" to "BLD/IO: Elusive Travis ResourceWarning & closing files by default" Sep 12, 2018
@TomAugspurger
Contributor

I started to look into this yesterday, and will try again today. If anyone is able to reproduce locally, let me know.

The interesting part is that the errors always reference zip files like tmpp8reunuatest_file.zip, even though the failing test isn't a compression test. So an unrelated test is seemingly leaking a warning, and it's failing here?

@TomAugspurger added the CI and Unreliable Test labels Sep 12, 2018
@h-vetinari
Contributor

Kept having lots of those as well, and commented about it in the closest-seeming issue I found (#13962 (comment)), including some links to failed builds (July 2018).

@jbrockmendel
Member

> So an unrelated test is seemingly leaking a warning

Is it the warning that's coming from the unrelated test or just the unclosed file?

@TomAugspurger
Contributor

I was able to reproduce locally with pytest -v -n 2 -m "not single" -r xX pandas/tests/io --count=5 . So that's something....

> Is it the warning that's coming from the unrelated test or just the unclosed file?

It seems like the unclosed file warning has to be from a separate test right? In e.g. https://travis-ci.org/pandas-dev/pandas/jobs/427395352#L2258 we shouldn't be opening / closing a zip file.

@jbrockmendel
Member

> It seems like the unclosed file warning has to be from a separate test right?

Yah, I'm just trying to rule out the extra-weird case where the ResourceWarning is being issued during the compression test but only showing up in the parser test, as opposed to the less-weird case where the compression test leaves a file open but that file only leads to a ResourceWarning later on.

Could check psutil.Process().open_files() at the end of the compression test?
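The less-weird case can be reproduced in isolation. The sketch below assumes CPython, where an unclosed file only warns when it is finalized; `Holder`, `leaky_test`, `later_test` and `demo` are hypothetical names, and the reference cycle is just a way to delay finalization until the garbage collector runs.

```python
import gc
import os
import tempfile
import warnings


class Holder:
    """Hypothetical object that keeps a file handle alive in a reference cycle."""

    def __init__(self, handle):
        self.handle = handle
        self.me = self  # cycle: only the garbage collector can free this object


def leaky_test(path):
    """Stands in for a test that opens a file and never closes it."""
    Holder(open(path, "rb"))


def later_test():
    """Stands in for an unrelated test that is running when collection happens."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        gc.collect()  # the leaked handle is finalized here, not in leaky_test
    return [str(w.message) for w in caught
            if issubclass(w.category, ResourceWarning)]


def demo():
    fd, path = tempfile.mkstemp(suffix="test_file.zip")
    os.close(fd)
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collection from firing before later_test
    try:
        leaky_test(path)
        return later_test()
    finally:
        if was_enabled:
            gc.enable()
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The warning is attributed to whatever code is executing when the handle is collected, which would be consistent with the unrelated file and line numbers in the reports here.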

@TomAugspurger
Contributor

Gotcha, I suspect (hope) that the less-weird case is what's happening.

@TomAugspurger
Contributor

Hmm, this looks fishy: https://github.com/pandas-dev/pandas/blob/master/pandas/tests/io/test_pickle.py#L345-L348 We're creating a zipfile and not closing that. Let's give that a shot.

@TomAugspurger
Contributor

TomAugspurger commented Sep 13, 2018

This is seemingly not fixed by #22679.

https://circleci.com/gh/pandas-dev/pandas/20081?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link

__________________ TestPythonParser.test_dtype_with_converter __________________

self = <pandas.tests.io.parser.test_parsers.TestPythonParser object at 0x7f85c29ee2e8>

    def test_dtype_with_converter(self):
        data = """a,b
    1.1,2.2
    1.2,2.3"""
        # dtype spec ignored if converted specified
        with tm.assert_produces_warning(ParserWarning):
            result = self.read_csv(StringIO(data), dtype={'a': 'i8'},
>                                  converters={'a': lambda x: str(x)})

pandas/tests/io/parser/dtypes.py:340: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/opt/conda/envs/pandas/lib/python3.5/contextlib.py:66: in __exit__
    next(self.gen)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

expected_warning = <class 'pandas.errors.ParserWarning'>
filter_level = 'always', clear = None, check_stacklevel = True

    @contextmanager
    def assert_produces_warning(expected_warning=Warning, filter_level="always",
                                clear=None, check_stacklevel=True):
...
            if expected_warning:
                msg = "Did not see expected warning of class {name!r}.".format(
                    name=expected_warning.__name__)
                assert saw_warning, msg
            assert not extra_warnings, ("Caused unexpected warning(s): {extra!r}."
>                                       ).format(extra=extra_warnings)
E           AssertionError: Caused unexpected warning(s): [('ResourceWarning', ResourceWarning("unclosed file <_io.BufferedReader name='/tmp/tmp4hjpmwyntest_file.zip'>",), '/opt/conda/envs/pandas/lib/python3.5/importlib/_bootstrap.py', 192)].
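The traceback shows why a stray warning fails this test: the context manager records every warning raised while the block runs, not only the expected one. A simplified stand-in follows; it is not the pandas implementation, and `assert_only_warning` is a hypothetical name.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def assert_only_warning(expected_warning):
    """Fail unless `expected_warning` is seen and nothing else is."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record everything, even ResourceWarning
        yield caught
    saw_expected = any(issubclass(w.category, expected_warning) for w in caught)
    extra = [(w.category.__name__, str(w.message)) for w in caught
             if not issubclass(w.category, expected_warning)]
    assert saw_expected, "Did not see expected warning of class {!r}.".format(
        expected_warning.__name__)
    assert not extra, "Caused unexpected warning(s): {!r}.".format(extra)
```

Any ResourceWarning that happens to be emitted inside the `with` block, including one from a file leaked by an earlier test, ends up in `extra` and fails the assertion.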

@TomAugspurger reopened this Sep 13, 2018
@mroeschke
Member Author

Interesting, this error appears on a different test.

Does zip_file here need to be closed as well?

    elif compression == 'zip':
        import zipfile
        zip_file = zipfile.ZipFile(path)
        zip_names = zip_file.namelist()
        if len(zip_names) == 1:
            f = zip_file.open(zip_names.pop())
        else:
            raise ValueError('ZIP file {} error. Only one file per ZIP.'
                             .format(path))
    else:
        msg = 'Unrecognized compression type: {}'.format(compression)
        raise ValueError(msg)
    yield f
    f.close()
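If the archive does need closing, one way to guarantee it is to nest both handles in `with` blocks, so they are released even when the consumer raises. This is a minimal sketch, not the pandas helper: `open_single_member_zip` is a hypothetical name, and whether closing the archive actually affects the warning is exactly the open question.

```python
import contextlib
import zipfile


@contextlib.contextmanager
def open_single_member_zip(path):
    """Yield the only member of a ZIP archive as a binary file object.

    Both the member and the archive are closed on exit, including when
    the body of the caller's `with` block raises.
    """
    with zipfile.ZipFile(path) as zip_file:
        zip_names = zip_file.namelist()
        if len(zip_names) != 1:
            raise ValueError('ZIP file {} error. Only one file per ZIP.'
                             .format(path))
        with zip_file.open(zip_names[0]) as f:
            yield f
```

By contrast, the `yield f` followed by `f.close()` in the snippet above never reaches `f.close()` if the code using the yielded file raises.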

@TomAugspurger
Contributor

Hmm, that may be it. Let me play with that.

@TomAugspurger
Contributor

TomAugspurger commented Sep 13, 2018

No luck on that, I think. Closing the archive doesn't seem to make a difference as long as the file opened from it is closed: the warning only shows up in the session where f is left open.

$ python -W "error::ResourceWarning"
Python 3.6.5 (default, Mar 30 2018, 06:41:53)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import zipfile
>>> zf = zipfile.ZipFile("test.zip")
>>> f = zf.open(zf.namelist()[0])
<CTRL-D>
Exception ignored in: <_io.FileIO name='test.zip' mode='rb' closefd=True>
ResourceWarning: unclosed file <_io.BufferedReader name='test.zip'>
$ python -W "error::ResourceWarning"
Python 3.6.5 (default, Mar 30 2018, 06:41:53)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import zipfile
>>> zf = zipfile.ZipFile("test.zip")
>>> f = zf.open(zf.namelist()[0])
>>> f.close()
<CTRL-D>

@TomAugspurger
Contributor

Regardless, #22699 should make this trivial to debug. The warning should fail the test it's actually popping up in.

@h-vetinari
Contributor

h-vetinari commented Sep 15, 2018

Logging another case of this:

https://travis-ci.org/pandas-dev/pandas/jobs/428987214

=================================== FAILURES ===================================
_______________ TestPythonParser.test_categorical_dtype_encoding _______________
[gw0] linux -- Python 3.7.0 /home/travis/miniconda3/envs/pandas/bin/python
self = <pandas.tests.io.parser.test_parsers.TestPythonParser object at 0x7fe1e0d04b00>
datapath = <function datapath.<locals>.deco at 0x7fe1d33a4488>
    def test_categorical_dtype_encoding(self, datapath):
        # GH 10153
        pth = datapath('io', 'parser', 'data', 'unicode_series.csv')
        encoding = 'latin-1'
        expected = self.read_csv(pth, header=None, encoding=encoding)
        expected[1] = Categorical(expected[1])
        actual = self.read_csv(pth, header=None, encoding=encoding,
                               dtype={1: 'category'})
        tm.assert_frame_equal(actual, expected)
        pth = datapath('io', 'parser', 'data', 'utf16_ex.txt')
        encoding = 'utf-16'
        expected = self.read_table(pth, encoding=encoding)
        expected = expected.apply(Categorical)
>       actual = self.read_table(pth, encoding=encoding, dtype='category')
[...]
E           AssertionError: Caused unexpected warning(s): [('ResourceWarning', ResourceWarning("unclosed file <_io.BufferedReader name='/tmp/tmpevb83mjxtest_file.zip'>"), '/home/travis/miniconda3/envs/pandas/lib/python3.7/site-packages/numpy/core/numeric.py', 501)].

edit: and again... (edit2: seems a maintainer restarted this build just as I triggered another run, so now the log is gone)
https://travis-ci.org/pandas-dev/pandas/jobs/429005130

=================================== FAILURES ===================================
_______________ TestPythonParser.test_categorical_dtype_encoding _______________
[gw1] darwin -- Python 3.5.6 /Users/travis/miniconda3/envs/pandas/bin/python
self = <pandas.tests.io.parser.test_parsers.TestPythonParser object at 0x118c7b550>
datapath = <function datapath.<locals>.deco at 0x1182f4488>
    def test_categorical_dtype_encoding(self, datapath):
        # GH 10153
        pth = datapath('io', 'parser', 'data', 'unicode_series.csv')
        encoding = 'latin-1'
        expected = self.read_csv(pth, header=None, encoding=encoding)
        expected[1] = Categorical(expected[1])
        actual = self.read_csv(pth, header=None, encoding=encoding,
                               dtype={1: 'category'})
        tm.assert_frame_equal(actual, expected)
        pth = datapath('io', 'parser', 'data', 'utf16_ex.txt')
        encoding = 'utf-16'
        expected = self.read_table(pth, encoding=encoding)
        expected = expected.apply(Categorical)
>       actual = self.read_table(pth, encoding=encoding, dtype='category')
[...]
E           AssertionError: Caused unexpected warning(s): [('ResourceWarning', ResourceWarning("unclosed file <_io.BufferedReader name='/var/folders/nz/vv4_9tw56nv9k3tkvyszvwg80000gn/T/tmp6wkui26ytest_file.zip'>",), '/Users/travis/build/pandas-dev/pandas/pandas/core/indexes/base.py', 933)].

edit3: https://travis-ci.org/pandas-dev/pandas/jobs/429023439

@mroeschke
Member Author

Closed by #22699 I believe.
