CI: azure timeouts #43643

Merged
merged 8 commits into from
Sep 21, 2021
Conversation

mzeitlin11
Member

@mzeitlin11 mzeitlin11 commented Sep 18, 2021

Based on the logging in #43611, in both timeout cases the last test gw0 ran was the one before the hypothesis test in test_parse_dates. That is a giant test: 56 parameterizations, and hypothesis runs 100 examples by default.

Not sure why this would be the cause. I couldn't find any issues that might explain a deadlock or something like that, but maybe this will help? Regardless of whether it fixes the timeouts, it makes sense for these tests to be marked slow.
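To put the size of that test in numbers, here is a minimal arithmetic sketch. The function name is hypothetical, and the reduced budget of 20 examples is an assumption for comparison only, not a value proposed in this PR. The product is a rough estimate: hypothesis can run fewer examples when the search space is small.

```python
def total_example_runs(parameterizations: int, examples_per_case: int) -> int:
    """Rough count of test-body executions: one per example, per parameterization."""
    return parameterizations * examples_per_case


# Figures quoted in the comment above: 56 parameterizations, 100 examples each.
default_runs = total_example_runs(56, 100)

# Hypothetical smaller budget (assumption, for comparison only).
reduced_runs = total_example_runs(56, 20)

print(default_runs, reduced_runs)  # 5600 1120
```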

@mzeitlin11 mzeitlin11 added Testing pandas testing functions or related to the test suite CI Continuous Integration labels Sep 18, 2021
@mzeitlin11
Member Author

Maybe something similar to HypothesisWorks/hypothesis#2340?

@mzeitlin11
Member Author

Hmm, this didn't work. New guess: the logging seems to stop around the parser tests, so maybe using the same parser object across tests can cause a deadlock (or the issue is that some mutation is happening?)

@mzeitlin11 mzeitlin11 marked this pull request as draft September 18, 2021 19:03
@mzeitlin11 mzeitlin11 marked this pull request as ready for review September 18, 2021 22:56
@pandas-dev pandas-dev deleted 6 comments from azure-pipelines bot Sep 19, 2021
@mzeitlin11
Member Author

mzeitlin11 commented Sep 19, 2021

We're at ~5 in a row on azure not timing out. Will keep running azure, but this is ready for review from my side. Summary of changes:

  1. Mark hypothesis tests as slow
  2. For parser fixtures, ensure we generate new objects instead of sharing them
  3. Skip pyarrow tests which can deadlock, xref "CI/BUG: pyarrow read_csv deadlock" #43650
    (EDIT: just noticed more timeout cases caused by deadlock in "CI: debug azure timeouts" #43611. Another option might be to just replace all pyarrow xfails with skips)
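Item 2 in the list above can be illustrated with a small sketch. This is not the pandas fixture code: `DemoParser` and the helper functions are hypothetical stand-ins that only show why handing out a fresh object per test avoids state leaking between tests. In a pytest suite, the body of `fresh_parser` would sit inside a fixture.

```python
class DemoParser:
    """Hypothetical stand-in for a parser wrapper; not a pandas class."""

    def __init__(self):
        self.options = {}


# Pattern 1: instances built once at import time and handed to every test.
SHARED_PARSERS = [DemoParser()]

# Pattern 2: only classes are stored; each request builds a new instance.
PARSER_CLASSES = [DemoParser]


def shared_parser(index=0):
    """Return the same object on every call (state can leak between tests)."""
    return SHARED_PARSERS[index]


def fresh_parser(index=0):
    """Return a brand-new object on every call (no state carried over)."""
    return PARSER_CLASSES[index]()


def leaks_state(get_parser):
    """Mutate the parser in one 'test', then check what the next one sees."""
    get_parser().options["engine"] = "mutated"
    return "engine" in get_parser().options


if __name__ == "__main__":
    print(leaks_state(shared_parser))  # True
    print(leaks_state(fresh_parser))   # False
```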

@mzeitlin11 mzeitlin11 changed the title CI: mark hypothesis tests as slow CI: azure timeouts Sep 19, 2021
@jreback
Contributor

jreback commented Sep 20, 2021

> Mark hypothesis tests as slow

is there some option that we can set on hypothesis instead? the point of these tests is to find holes in our tests, which are almost all fast, so this just removes them entirely (which may be ok). but then we should just do that.

@@ -16,6 +16,7 @@
import pandas._testing as tm

xfail_pyarrow = pytest.mark.usefixtures("pyarrow_xfail")
skip_pyarrow = pytest.mark.usefixtures("pyarrow_skip")
Contributor

can you add some comments here on when to xfail vs skip

Member Author

Yep, will do
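One way to frame the xfail vs skip distinction raised in this thread: an expected-failure marker still executes the test body (pytest's `xfail` does so by default), so a test that deadlocks will still hang, while a skip never runs the body at all. The sketch below uses the standard library's `unittest` analogues (`expectedFailure` and `skip`) rather than the pandas fixtures; the test class and its names are hypothetical.

```python
import io
import unittest

# Records which test bodies actually executed.
executed = []


class PyarrowLikeTests(unittest.TestCase):
    """Hypothetical tests standing in for cases an engine does not support."""

    @unittest.expectedFailure
    def test_expected_failure(self):
        # The body runs; the failure is tolerated and reported as expected.
        executed.append("xfail")
        raise AssertionError("unsupported option")

    @unittest.skip("could deadlock, so the body must never run")
    def test_skipped(self):
        # Never reached: a skip prevents execution entirely.
        executed.append("skip")


def run_demo():
    """Run both tests quietly and return the unittest result object."""
    executed.clear()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PyarrowLikeTests)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


if __name__ == "__main__":
    result = run_demo()
    print(executed)  # ['xfail']
    print(len(result.expectedFailures), len(result.skipped))  # 1 1
```

Under this framing, xfail suits cases that fail quickly and predictably, and skip suits cases where running the body at all is the problem.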

@mzeitlin11
Member Author

> is there some option that we can set on hypothesis instead? the point of these tests is to find holes in our tests, which are almost all fast, so this just removes them entirely (which may be ok). but then we should just do that.

We could set fewer examples to run (but at some point that defeats the purpose of hypothesis). I thought running them only on slow builds was a good compromise, since they would still run on some builds, just fewer.

Regardless, the important change here is skipping pyarrow for the hypothesis parse_dates test. I'll just remove the slow markers for now, since it turned out pyarrow was the cause of the timeout, not the hypothesis tests sometimes running extremely slowly.

@jreback jreback added this to the 1.4 milestone Sep 20, 2021
@jreback
Contributor

jreback commented Sep 20, 2021

kk lgtm. ping when ready to merge (or just go ahead)

@jreback
Contributor

jreback commented Sep 20, 2021

note that I think we are seeing something similar on 1.3.4, but we didn't merge the pyarrow csv reader so prob something else.

@mzeitlin11
Member Author

> note that I think we are seeing something similar on 1.3.4, but we didn't merge the pyarrow csv reader so prob something else.

Good to know - it's certainly possible there are other timeout-causing issues unrelated to pyarrow. Will keep running azure pipelines on #43611 to see if anything else comes up. Hopefully this should at least make timeouts less frequent.

@jreback jreback merged commit f9b6290 into pandas-dev:master Sep 21, 2021
@jreback
Contributor

jreback commented Sep 21, 2021

thanks @mzeitlin11 nice improvement here. Yeah let's keep an eye on 1.3.4

@mzeitlin11 mzeitlin11 deleted the mark_hypothesis_slow branch September 21, 2021 16:51