BUG: Fix file descriptor leak #32598

1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.1.0.rst
@@ -337,6 +337,7 @@ I/O
 - Bug in :meth:`read_csv` was raising `TypeError` when `sep=None` was used in combination with `comment` keyword (:issue:`31396`)
 - Bug in :class:`HDFStore` that caused it to set to ``int64`` the dtype of a ``datetime64`` column when reading a DataFrame in Python 3 from fixed format written in Python 2 (:issue:`31750`)
 - Bug in :meth:`read_excel` where a UTF-8 string with a high surrogate would cause a segmentation violation (:issue:`23809`)
+- Bug in :meth:`read_csv` was causing a file descriptor leak on an empty file (:issue:`31488`)


Plotting
14 changes: 9 additions & 5 deletions pandas/io/parsers.py
@@ -2273,11 +2273,15 @@ def __init__(self, f, **kwds):
         # Get columns in two steps: infer from data, then
         # infer column indices from self.usecols if it is specified.
         self._col_indices = None
-        (
-            self.columns,
-            self.num_original_columns,
-            self.unnamed_cols,
-        ) = self._infer_columns()
+        try:
+            (
+                self.columns,
+                self.num_original_columns,
+                self.unnamed_cols,
+            ) = self._infer_columns()
+        except (TypeError, ValueError):
+            self.close()
+            raise

         # Now self.columns has the set of columns that we will process.
         # The original set is stored in self.original_columns.
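
Note on the change above: when a constructor raises after it has opened a file, the caller never receives an object to call `close()` on, so the constructor has to release the handle itself before re-raising. `EmptyDataError` subclasses `ValueError`, which is why the `except (TypeError, ValueError)` clause covers the empty-file case. The following is a minimal standalone sketch of that pattern, not the pandas implementation; `TinyParser`, its `_infer_columns`, and `demo` are hypothetical names used only for illustration.

```python
import os
import tempfile


class TinyParser:
    """Hypothetical parser that opens a file and reads its header row."""

    def __init__(self, path):
        self.handle = open(path, "r")
        try:
            self.columns = self._infer_columns()
        except (TypeError, ValueError):
            # The constructor is about to fail, so the caller will never
            # receive an object to close. Release the handle here, then
            # re-raise the original exception unchanged.
            self.close()
            raise

    def _infer_columns(self):
        header = self.handle.readline()
        if not header:
            raise ValueError("No columns to parse from file")
        return header.rstrip("\n").split(",")

    def close(self):
        self.handle.close()


def demo():
    """Parse an empty temp file; report (handle closed?, error message)."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)  # keep only the path; the file stays empty
    # Allocate the instance separately so the handle can still be
    # inspected after __init__ raises (for illustration only).
    parser = TinyParser.__new__(TinyParser)
    try:
        parser.__init__(path)
    except ValueError as err:
        # The tuple is built before the finally block runs, so the flag
        # reflects what __init__ did, not the cleanup below.
        return parser.handle.closed, str(err)
    finally:
        parser.close()
        os.remove(path)
    return parser.handle.closed, None
```

Without the `try`/`except` in `__init__`, the first element returned by `demo()` would be `False`: the handle would stay open until garbage collection.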
14 changes: 14 additions & 0 deletions pandas/tests/io/parser/test_common.py
@@ -15,6 +15,7 @@

 from pandas._libs.tslib import Timestamp
 from pandas.errors import DtypeWarning, EmptyDataError, ParserError
+import pandas.util._test_decorators as td

 from pandas import DataFrame, Index, MultiIndex, Series, compat, concat
 import pandas._testing as tm
@@ -2079,3 +2080,16 @@ def test_integer_precision(all_parsers):
     result = parser.read_csv(StringIO(s), header=None)[4]
     expected = Series([4321583677327450765, 4321113141090630389], name=4)
     tm.assert_series_equal(result, expected)
+
+
+def test_file_descriptor_leak(all_parsers):
+    # GH 31488
+
+    parser = all_parsers
+    with tm.ensure_clean() as path:
+
+        def test():
+            with pytest.raises(EmptyDataError, match="No columns to parse from file"):
+                parser.read_csv(path)
+
+        td.check_file_leaks(test)()
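
Note on the test above: it wraps the failing `read_csv` call in a leak checker, so the test fails if the call leaves a descriptor open. The sketch below shows one way such a check could work using only the standard library. It is an assumption made for illustration and not how `td.check_file_leaks` is implemented; `open_fd_count`, `check_fd_leaks`, `leaky`, `tidy`, and `LEAKED` are hypothetical names.

```python
import functools
import os

LEAKED = []  # descriptors deliberately left open by leaky()


def open_fd_count(limit=1024):
    """Count descriptors below `limit` that os.fstat accepts as open."""
    count = 0
    for fd in range(limit):
        try:
            os.fstat(fd)
        except OSError:
            continue
        count += 1
    return count


def check_fd_leaks(func):
    """Hypothetical decorator: fail if `func` leaves descriptors open."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        before = open_fd_count()
        result = func(*args, **kwargs)
        after = open_fd_count()
        if after != before:
            raise AssertionError(
                f"descriptor count went from {before} to {after}"
            )
        return result

    return wrapper


def leaky():
    # os.open returns a raw descriptor that stays open until os.close.
    LEAKED.append(os.open(os.devnull, os.O_RDONLY))


def tidy():
    fd = os.open(os.devnull, os.O_RDONLY)
    os.close(fd)
```

Here `check_fd_leaks(tidy)()` returns normally, while `check_fd_leaks(leaky)()` raises `AssertionError`. The scan only sees descriptors numbered below `limit`, so this is a rough check rather than a complete one.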