CI/BUG: pyarrow read_csv deadlock #43650
Comments
Can we increase the minimum required version of pyarrow for the CSV reading functionality? It might also be good to make a reproducible example to report to Arrow. Although I suppose it is fixed now (given that it only happens on older pyarrow versions), it could still be useful to add it as a test over there.
Makes sense - from initial testing I know the issue is at least present for
I'll try to test this some more. It is also possible that pyarrow is getting stuck because of the TextIOWrapper that we use on our side to force pyarrow to read StringIO objects, which would be a bug on our side.
@lithomas1 @mzeitlin11 did this ever get sorted out?
Now that our minimum version is 6.0, I believe we shouldn't hit this issue anymore. IIRC I was experiencing this with pyarrow 2.0 and had skipped those versions in the CI due to the deadlock. Closing since we haven't seen this in a while, but we can reopen if this shows up again.
xref #43611, #43643
When trying to figure out azure timeout issues, the deadlock appeared to be occurring in parser code, so `pyarrow` makes sense as the culprit. Seems like tests with weird input cause issues, for example some of the `parse_dates` tests, or for a specific reproducer the test: `pandas/tests/io/parser/common/test_ints.py::test_outside_int64_uint64_range`

On current `pyarrow` I can't reproduce, but azure uses 0.17.0, with which I can reproduce a deadlock (just running `pandas/tests/io/parser/common/test_ints.py::test_outside_int64_uint64_range`) on macOS. It doesn't happen consistently, but it will deadlock (to the point that I need to sigkill to stop it, which explains why `pytest-timeout` didn't catch it).

cc @lithomas1 if any thoughts here
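Since the hang only stops on a sigkill, one way to keep a CI job from stalling is to run the suspect test in a child process and enforce the timeout from outside. The sketch below is only an illustration, not something pandas CI does: `run_with_hard_timeout` is a hypothetical helper name, and the 300 second limit in the usage comment is an arbitrary assumption. It relies on the documented behavior of `subprocess.run`, which kills the child when `timeout` expires.

```python
import subprocess
import sys


def run_with_hard_timeout(cmd, timeout_s):
    """Run cmd in a child process with a timeout enforced by the parent.

    Returns the child's exit code, or None if the timeout expired and
    the child was killed. (Hypothetical helper, for illustration only.)
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed the child and waited for it
        # before re-raising, so nothing is left to clean up here.
        return None
    return completed.returncode


# Example usage (assumed timeout of 300 s; adjust as needed):
# test_id = "pandas/tests/io/parser/common/test_ints.py::test_outside_int64_uint64_range"
# code = run_with_hard_timeout([sys.executable, "-m", "pytest", test_id], timeout_s=300)
# print("hung and was killed" if code is None else f"exit code {code}")
```

Note that only the direct child is killed; any grandchildren (for example, parallel test workers) would need separate handling.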