BUG: Pandas should not swallow exceptions when close()ing a write handle #47136

Closed
3 tasks done
batterseapower opened this issue May 27, 2022 · 3 comments · Fixed by #47165
Labels
Bug, Error Reporting, IO Data

@batterseapower
Contributor

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
pd.to_pickle(large_python_object, '/mnt/some_nfs_mount')

Issue Description

IOHandles.close (https://github.com/pandas-dev/pandas/blob/v1.4.2/pandas/io/common.py#L119) throws away any exceptions raised by calling close() on the handles it is managing. When writing to certain file systems (in particular NFS mounts), write errors caused by, e.g., running out of disk space may not be raised at the point you actually call write() but are instead deferred to close(). Therefore, by ignoring exceptions raised by close(), it is possible for a pandas IO utility function to silently write a truncated file.
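To make the failure mode concrete, here is a minimal sketch. It does not use pandas: `DeferredErrorHandle` and `close_swallowing` are hypothetical stand-ins, assumed only to imitate a handle that reports a write error at close() and a close helper that discards such errors.

```python
import errno


class DeferredErrorHandle:
    """Hypothetical handle: write() succeeds, the error surfaces only in close()."""

    def __init__(self) -> None:
        self.chunks = []
        self.closed = False

    def write(self, data: bytes) -> int:
        # Pretend the data was buffered successfully.
        self.chunks.append(data)
        return len(data)

    def close(self) -> None:
        # Imitate a file system that reports "no space left" at close time.
        self.closed = True
        raise OSError(errno.ENOSPC, "No space left on device")


def close_swallowing(handles) -> None:
    """Close handles and discard close() errors (the pattern this issue objects to)."""
    try:
        for handle in handles:
            handle.close()
    except (OSError, ValueError):
        pass


handle = DeferredErrorHandle()
handle.write(b"important data")  # succeeds
close_swallowing([handle])       # the OSError never reaches the caller
```

In this sketch the caller has no way to learn that the data was not fully written, which is the silent-truncation scenario described above.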

Expected Behavior

Exceptions raised by close() should be propagated to the user for appropriate handling. You may optionally want to continue to ignore exceptions raised by handles opened in read mode.
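One possible shape for that behavior, as a sketch only: `close_handles` and `FailingCloseHandle` are hypothetical names, not pandas APIs, and the choice to re-raise only the first error from a writable handle is an assumption made for illustration.

```python
import errno


class FailingCloseHandle:
    """Hypothetical handle whose close() always raises OSError."""

    def __init__(self, writable: bool) -> None:
        self._writable = writable

    def writable(self) -> bool:
        return self._writable

    def close(self) -> None:
        raise OSError(errno.EIO, "close failed")


def close_handles(handles) -> None:
    """Close every handle; re-raise close() errors for writable handles only."""
    first_error = None
    for handle in handles:
        # Query writability before closing: real file objects raise
        # ValueError from writable() once they are closed.
        is_writer = handle.writable()
        try:
            handle.close()
        except OSError as err:
            # Errors from read-only handles are ignored (the optional part).
            if is_writer and first_error is None:
                first_error = err
    if first_error is not None:
        raise first_error
```

With this sketch, `close_handles([FailingCloseHandle(writable=False)])` returns normally, while passing a writable handle raises the `OSError` so the caller can react to a possibly truncated file. All handles are still closed before the error is raised.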

Installed Versions

INSTALLED VERSIONS
------------------
commit : 66e3805
python : 3.8.12.final.0
python-bits : 64
OS : Linux
OS-release : 4.18.0-305.19.1.el8_4.x86_64
Version : #1 SMP Wed Sep 15 15:39:39 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 1.3.5
numpy : 1.20.3
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.0.4
Cython : None
pytest : 6.2.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.5.1
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.0.2
IPython : 7.31.1
pandas_datareader: None
bs4 : 4.10.0
bottleneck : 1.3.2
fsspec : 2022.01.0
fastparquet : None
gcsfs : None
matplotlib : 3.5.0
numexpr : 2.8.1
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 4.0.1
pyxlsb : None
s3fs : 2022.01.0
scipy : 1.7.3
sqlalchemy : None
tables : None
tabulate : None
xarray : 0.20.1
xlrd : 2.0.1
xlwt : None
numba : 0.54.1

@batterseapower added the Bug and Needs Triage labels May 27, 2022
@twoertwein added the IO Data label May 27, 2022
@twoertwein
Member

I'm to blame for this try-except. Before most of the file-opening code was shared across the read_*/to_* functions, some of them wrapped it in a try-except (not sure why). I like the idea of removing the try-except (even for reading).

@mroeschke added the Error Reporting label and removed the Needs Triage label May 27, 2022
@simonjayhawkins
Member

> I'm to blame for this try-except.

Recently? Is this a regression? Should this be milestoned 1.4.3? (Trivial bug fixes can also be included in patch releases.)

@twoertwein
Member

That specific try-except has been there since 1.2, but to_json, read_csv, to_pickle, read_pickle, read_sas (and maybe more) had a try-except of their own before the file opening/closing code was shared.
