Refactor CI #46493


Closed · wants to merge 4 commits

Conversation

jonashaag (Contributor)

This PR addresses item 1 from this list: #46346 (comment)

cc @mroeschke

  • closes #xxxx (Replace xxxx with the Github issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

@simonjayhawkins added the CI (Continuous Integration) label on Mar 24, 2022
@mroeschke (Member) commented Mar 24, 2022

Looks like only 1 check ran

- Drop Azure (move to GHA).
- Use mamba-org/provision-with-micromamba to set up Conda envs (see the sketch after this list).
- Cache Conda envs.
- Use hendrikmuhs/ccache-action to cache C compilation.
- Add Windows 32 bit build.
- Drop Ubuntu 32 bit build.
- Factor out common GHA code into reusable actions.
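
For context, a minimal sketch of what the micromamba and ccache items could look like in a workflow; the action versions, environment file path, cache key, and build command are assumptions, not the final pandas configuration:

name: Unit tests (sketch)
on: [push, pull_request]

jobs:
  tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      # Create the Conda env with micromamba instead of a miniconda installer.
      - uses: mamba-org/provision-with-micromamba@main
        with:
          environment-file: environment.yml  # assumed path

      # Cache C compilation so rebuilding the extensions is cheap on re-runs.
      - uses: hendrikmuhs/ccache-action@v1
        with:
          key: ${{ runner.os }}-ccache  # assumed cache key layout

      - name: Build pandas
        shell: bash -el {0}
        run: CC="ccache gcc" python -m pip install -e . --no-build-isolation  # illustrative build step

Caching the Conda env itself (the third item) could then key off a hash of environment.yml.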
@jonashaag (Contributor, Author)

GHA has serious problems right now

@lithomas1 (Member)

Can we hold off on removing Azure Pipelines for now? Right now, jobs queue for very long times, >2 hours (sometimes up to 4 hours), when lots of PRs are merged. I'm trying to clear out some redundant builds right now, but it will probably take me a while to get the builds down to a reasonable number (~20 jobs per build).

build:
  if: ${{ github.event.label.name == 'Build' || contains(github.event.pull_request.labels.*.name, 'Build') || github.event_name == 'push' }}
sdist:
  #if: ${{ github.event.label.name == 'Build' || contains(github.event.pull_request.labels.*.name, 'Build') || github.event_name == 'push' }}
Member

Why was this commented out? sdists are not supposed to run on every PR, as per #43745.

Contributor Author

Thanks, that's a debugging leftover.

using: composite
steps:
  # TODO: GH#44980 https://github.com/pypa/setuptools/issues/2941
  - name: Create temporary requirements.txt
Member

Any reason for this? I think we can pin setuptools in the main file.

Contributor Author

This is required to use setup-python's caching.
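
A minimal sketch of the mechanism (the file name, pinned package, and Python version are assumptions, not the actual action): setup-python's built-in pip cache needs an on-disk requirements file to hash, which is why the step writes one first.

  - name: Create temporary requirements.txt
    shell: bash
    # Hypothetical content: write whichever pins the job needs to a file
    # so that setup-python has something to hash for its cache key.
    run: echo "setuptools<60.0.0" > requirements-ci.txt

  - uses: actions/setup-python@v3
    with:
      python-version: '3.10'
      cache: pip
      # setup-python hashes this file to build the pip cache key.
      cache-dependency-path: requirements-ci.txt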

@lithomas1 (Member) left a comment

Thanks for the PR! I generally like the changes in this PR; however, the diff still feels a bit too large and clunky to review. I left a few initial comments, but will try to get a deeper review in later.

I would recommend splitting the Azure changes out pending discussion, and migrating only a couple of the workflows at a time to keep the diffs smaller.

@@ -0,0 +1,62 @@
name: macOS and Windows
Member

It might be better to pair macOS and Ubuntu together in the future when we do drop Azure.

@mroeschke (Member)

Agreed that this PR still seems to tackle too much for a thorough review, especially for CI changes, where we need to be careful about things passing. Generally it's easier if one PR tackles one general issue/theme.

@jonashaag (Contributor, Author)

> Right now, jobs queue for very long times, >2 hours (sometimes up to 4 hours), when lots of PRs are merged

Is there any way to change to a large GHA plan? Maybe through some GH open source sponsoring program or some Pandas sponsor?

Re even more splitting: would it be a viable option to ship most of the shared code in .github/actions but change only a single workflow (or so)? That would reduce the size of the changes substantially, although the PR might then include code that is only relevant to workflows changed in a subsequent PR.
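
As a rough illustration of that split, a shared composite action could land first and be referenced from a single migrated workflow; the action path, name, and contents below are hypothetical:

# .github/actions/setup-env/action.yml (hypothetical path)
name: Set up pandas environment
description: Shared environment setup reused across workflows
runs:
  using: composite
  steps:
    - uses: mamba-org/provision-with-micromamba@main
      with:
        environment-file: environment.yml

# In the one workflow migrated by this PR:
jobs:
  tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: ./.github/actions/setup-env  # later PRs switch other workflows over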

@jonashaag (Contributor, Author)

> Right now, jobs queue for very long times, >2 hours (sometimes up to 4 hours), when lots of PRs are merged

Is there any way to change to a large GHA plan? Maybe through some GH open source sponsoring program or some Pandas sponsor?

Also, maybe we can bail out earlier if there are test failures. I thought that maybe we could come up with a very small set of "pre tests" that we can run in e.g. 60 s and that would abort the test run if it fails.
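
A rough sketch of that gating idea using plain GHA job dependencies; the job names, the hypothetical shared setup action, and the choice of smoke tests are assumptions:

on: [push, pull_request]

jobs:
  pre-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: ./.github/actions/setup-env   # hypothetical shared setup action
      # Hypothetical ~60 s smoke subset; a failure here fails the job quickly.
      - run: pytest pandas/tests/frame/test_constructors.py -q
        shell: bash -el {0}

  full-tests:
    # The expensive matrix only starts once the smoke job is green,
    # so obvious breakage is reported within a couple of minutes.
    needs: pre-tests
    runs-on: ubuntu-latest
    steps:
      - run: echo "full test matrix goes here"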

@mroeschke (Member)

> Is there any way to change to a large GHA plan?

We could probably reach out to GitHub to see what's possible. Probably good to bring up at the next developer meeting. FWIW I don't think the long queue times are a huge deal. Aside from the redundancy of testing, which wastes compute resources, development is fairly asynchronous over longer time spans (especially given reviewer availability) and has no major deadlines, so the waiting is not a major issue IMO.

> Re even more splitting

Some of the changes I see that could each be a single PR:

  • pytest-asyncio>=0.17
  • Bumping actions/checkout
  • bash -el (see the note after this list)
  • Reordering the yaml of the include jobs
  • Refactoring sdist workflow
  • setup actions
  • build action
  • test action
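
On the bash -el item: -e makes a step fail on the first error, and -l runs a login shell so the profile written by the Conda/micromamba setup is sourced and the environment is activated in every run step. A sketch of where that usually sits (the placement in a defaults block is an assumption):

defaults:
  run:
    # -e: exit on error; -l: login shell so the activated Conda env
    # from the setup step is available in each run step.
    shell: bash -el {0}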

@jonashaag (Contributor, Author)

> Probably good to bring up at the next developer meeting.

Are you suggesting that I should join the meeting (where can I find info?) or are you going to bring it up?

> FWIW I don't think the long queue times are a huge deal.

I think they are, because they create a very long feedback loop for contributors. It's fine if tests take a long time, but I think it's worth investing some effort into reducing the latency between a push and the first test being executed.

> Some of the changes I see that could each be a single PR

Working on a new PR!

@jreback (Contributor) commented Mar 28, 2022

@jonashaag the CI is very large on pandas and we are already using a large plan. It is unlikely we are going to be able to change anything here. The cycle times are what they are.

@lithomas1 (Member)

> Right now, jobs queue for very long times, >2 hours (sometimes up to 4 hours), when lots of PRs are merged
>
> Is there any way to change to a large GHA plan? Maybe through some GH open source sponsoring program or some Pandas sponsor?
>
> Also, maybe we can bail out earlier if there are test failures. I thought that maybe we could come up with a very small set of "pre tests" that we can run in e.g. 60 s and that would abort the test run if it fails.

Along a similar vein, maybe we can try making the slow CI depend on the non-slow CI and see if that helps (we can cache the built pandas from the non-slow CI and install from that for the slow CI).
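
A sketch of that dependency plus reusing the build via artifacts; the job names, hypothetical shared setup action, build command, and test markers are assumptions rather than the actual pandas workflows:

on: [push, pull_request]

jobs:
  not-slow:
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash -el {0}
    steps:
      - uses: actions/checkout@v3
      - uses: ./.github/actions/setup-env     # hypothetical shared setup
      - run: pip wheel . --no-deps -w dist    # assumed build step
      - run: |
          pip install dist/*.whl
          pytest --pyargs pandas -m "not slow"
      # Hand the built wheel to the dependent job instead of rebuilding it.
      - uses: actions/upload-artifact@v3
        with:
          name: pandas-wheel
          path: dist/*.whl

  slow:
    needs: not-slow    # only starts once the non-slow suite is green
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash -el {0}
    steps:
      - uses: actions/checkout@v3
      - uses: ./.github/actions/setup-env
      - uses: actions/download-artifact@v3
        with:
          name: pandas-wheel
          path: dist
      - run: |
          pip install dist/*.whl
          pytest --pyargs pandas -m "slow"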

@jonashaag (Contributor, Author)

What's really curious is that the GHA machines are so incredibly slow. On my M1 machine, with no pytest-xdist, the full test suite takes 8 minutes, while on GHA with 2 cores it takes ~45 minutes.

Labels: CI (Continuous Integration)
5 participants