BUG: read_csv file handle not closed after error #58131

Open
2 of 3 tasks
ravcio opened this issue Apr 3, 2024 · 6 comments
Labels: Bug; IO Data (IO issues that don't fit into a more specific label)

ravcio commented Apr 3, 2024

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
pd.read_csv('2022_01_k.zip')
# ValueError: Multiple files found in ZIP file. Only one file per ZIP: ['k_d_01_2022.csv', 'k_d_t_01_2022.csv']

Issue Description

Handle to 2022_01_k.zip is left open after failed read attempt (resource leak).

Expected Behavior

Handle to file 2022_01_k.zip is closed. File can be renamed/deleted on Windows.
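
For anyone without the original archive, here is a rough, self-contained way to rebuild the scenario. The helper names `make_multi_member_zip` and `read_then_delete` and the CSV contents are made up for illustration; only the member names and the `pd.read_csv` call come from the report. Whether the delete actually fails depends on the platform and the pandas version, so treat this as a sketch rather than a guaranteed reproducer.

```python
import os
import zipfile


def make_multi_member_zip(path):
    """Write a ZIP archive holding two small CSV members (stand-in data)."""
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("k_d_01_2022.csv", "a,b\n1,2\n")
        zf.writestr("k_d_t_01_2022.csv", "a,b\n3,4\n")


def read_then_delete(path):
    """Trigger the failing read, then try to delete the archive.

    Returns (error_message, delete_error); delete_error is None when the
    file could be removed.
    """
    import pandas as pd  # imported lazily so the helper above needs only the stdlib

    error_message = None
    try:
        pd.read_csv(path)
    except ValueError as exc:
        error_message = str(exc)

    delete_error = None
    try:
        os.remove(path)
    except OSError as exc:
        # On Windows this is typically a PermissionError while a handle is open.
        delete_error = exc
    return error_message, delete_error
```

Calling `make_multi_member_zip("2022_01_k.zip")` followed by `read_then_delete("2022_01_k.zip")` should return the `ValueError` text from the report; a non-`None` second element would indicate the archive is still held open.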

Installed Versions

INSTALLED VERSIONS

commit : bdc79c1
python : 3.11.5.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 85 Stepping 4, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en
LOCALE : Polish_Poland.1250

pandas : 2.2.1
numpy : 1.26.3
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 69.2.0
pip : 24.0
Cython : None
pytest : None
hypothesis : None
sphinx : 5.0.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 5.1.0
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.1.3
IPython : 8.20.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.2
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.8.0
numba : None
numexpr : 2.8.7
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 15.0.1
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : 0.9.0
xarray : None
xlrd : None
zstandard : 0.19.0
tzdata : 2023.3
qtpy : 2.4.1
pyqt5 : None

ravcio added the Bug and Needs Triage (issue that has not been reviewed by a pandas team member) labels Apr 3, 2024
ravcio changed the title from "BUG: read_csv file handle not closed after" to "BUG: read_csv file handle not closed after error" Apr 3, 2024
@jsjeon-um

take

twoertwein added the IO Data (IO issues that don't fit into a more specific label) label and removed the Needs Triage label Apr 4, 2024
@twoertwein
Member

@jsjeon-um Feel free to ping me if you have questions or when you have a PR!

@abeltavares
Contributor

@jsjeon-um are you still working on it?
Can I take it?

@tev-dixon
Contributor

I'll take this.

@tev-dixon
Contributor

I was not able to recreate this bug. I tried on both Windows 10/11 and Linux systems with multiple zip files. Looking over the relevant code, I didn't see anything glaringly incorrect. Can anyone else reproduce this?

@twoertwein
Member

twoertwein commented Dec 8, 2024

> I was not able to recreate this bug. I tried on both Windows 10/11 and Linux systems with multiple zip files. Looking over the relevant code, I didn't see anything glaringly incorrect. Can anyone else reproduce this?

The bug should still exist, as we still raise without first closing the opened file handles:

raise ValueError(f"Zero files found in ZIP file {path_or_buf}")
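
A minimal sketch of the cleanup-before-raise pattern implied by the raise above, using only the standard-library `zipfile` module. `open_single_member` is a hypothetical helper, not the pandas implementation; the error messages are copied from this thread, and the real fix in pandas may well look different.

```python
import zipfile


def open_single_member(path_or_buf):
    """Return (archive, member_handle) for the only member of a ZIP archive.

    Hypothetical helper, not pandas code: the archive is closed before any
    exception propagates, so a failed call does not leak the handle.
    The caller is responsible for closing both returned objects.
    """
    archive = zipfile.ZipFile(path_or_buf)
    try:
        names = archive.namelist()
        if len(names) == 0:
            raise ValueError(f"Zero files found in ZIP file {path_or_buf}")
        if len(names) > 1:
            raise ValueError(
                "Multiple files found in ZIP file. "
                f"Only one file per ZIP: {names}"
            )
        return archive, archive.open(names[0])
    except Exception:
        archive.close()  # release the handle on every error path
        raise
```

The `try`/`except` is used instead of a `with` block because the archive has to stay open on the success path while the caller reads from the member handle.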

You probably need to enable ResourceWarning, for example with python -W default
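
For anyone trying to reproduce, here is a small stdlib-only sketch of how an unclosed handle shows up once `ResourceWarning` is enabled. It does not call pandas; `leaked_handle_warnings` is a made-up helper that leaks a plain file handle on purpose. The in-code filter plays the role that `-W default` plays on the command line, and the behaviour described assumes CPython.

```python
import gc
import warnings


def leaked_handle_warnings(path):
    """Open `path` without closing it and return the ResourceWarning messages."""
    with warnings.catch_warnings(record=True) as caught:
        # ResourceWarning is ignored by default; turn it on for this block.
        warnings.simplefilter("always", ResourceWarning)
        handle = open(path, "rb")
        handle.read(1)
        del handle    # drop the last reference without calling close()
        gc.collect()  # force finalization on interpreters without refcounting
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, ResourceWarning)
    ]
```

If the returned list is non-empty, the interpreter noticed a handle that was finalized without being closed; the same kind of message would be expected around the failing `read_csv` call if the archive handle is leaked.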
