Skip to content

CI: Refactor sdist workflow and add setup-python action #46538

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 5 commits into from

Conversation

jonashaag
Copy link
Contributor

Smaller version of #46493 according to suggestion in #46493 (comment)

  • closes #xxxx (Replace xxxx with the Github issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

@mroeschke
Copy link
Member

Sorry my comment wasn't clear - I was hoping that each bullet point would be a separate PR.

@jonashaag jonashaag changed the title Refactor CI CI: Refactor sdist workflow and add setup-python action Mar 29, 2022
@jonashaag
Copy link
Contributor Author

@mroeschke recycled this PR for sdist refactoring. The action code also covers some more cases not relevant to sdist but relevant to future PRs.

build:
if: ${{ github.event.label.name == 'Build' || contains(github.event.pull_request.labels.*.name, 'Build') || github.event_name == 'push'}}
sdist:
#if: ${{ github.event.label.name == 'Build' || contains(github.event.pull_request.labels.*.name, 'Build') || github.event_name == 'push'}}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to undo this once GHA is green.

@mroeschke
Copy link
Member

Is the main sdist refactor to add the setup python action or are there independent cleanups that can be separate from the setup action?

@jonashaag
Copy link
Contributor Author

Good point, I've pushed a reduced version.


# Python deps
pip
setuptools<60.0.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm don't love that these deps (pandas + test) are also specified in yaml files as well (aware over multiple yaml files as well). Any idea on how to combine them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before the refactoring they were part of an inline pip install ....

We might be able to auto-generate them from a minimal environment.yml file but not sure if that's better than having a little duplication.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Could this be maintained in a separate file and modified here with the date to invalidate the cache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to do it, but we'd still need to add the current timestamp (or something derived from it, like current month) to the file because otherwise the cache will only be invalidated when we change the file, which we don't do on a regular basis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: For setup-python a stale cache just means slower installs, not older versions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah makes sense; this block can create a copy of the environment file and add a date. I think it's just easier to manage/know that our dependencies are in files and not as ad hoc like this.

@jonashaag
Copy link
Contributor Author

@mroeschke let me know if I can make this PR smaller to ease reviewing.

@jonashaag jonashaag mentioned this pull request Apr 2, 2022
4 tasks
@jreback jreback added Build Library building on various platforms CI Continuous Integration labels Apr 4, 2022
# TODO: GH#44980 https://github.com/pypa/setuptools/issues/2941
- name: Create temporary requirements.txt
run: |
# Drop cache at least once per month
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would we cache this at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it saves a couple of minutes but we can also add it later.

run: |
cat requirements.txt
# TODO https://github.com/numpy/numpy/issues/21196
bits32=$(python -c 'import sys; print(int(sys.maxsize > 2**32))')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

umm what is this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is required for 32 bit builds but we can add that back later

@datapythonista
Copy link
Member

Thanks for all the nice work with the CI @jonashaag. I see you've got several PRs open, and looks like most of them too big to be merged. Before things start to become too messy, I would suggest if you can create an issue for the refactoring of the CI, with the tasks that you've got in mind, or the final status you're aiming to achieve. And then you can open atomic PRs (without mixing things that can be addressed independently). Also, would be nice to keep closing PRs that aren't relevant anymore (so reviewers don't need to navigate and find the ones that are ready mixed with some that don't require attention anymore).

I understand it may be a bit frustrating to work this way. But in a project as big as pandas, we need to make sure that:

  • Everybody, not only you, can properly understand what you're trying to achieve (in a single PR, and in the whole refactoring)
  • PRs are easy to review (and it doesn't take a huge amount of time to review, since every feedback iteration would be very time consuming)
  • When things go wrong in the future because of changes, it's easy to understand what happened, there are issues explaining what was the general idea, and PRs are atomic, so they can be understood better, and reverted if needed, without having to revert unrelated changes.

Thanks!

@jonashaag jonashaag mentioned this pull request Apr 5, 2022
6 tasks
@github-actions
Copy link
Contributor

github-actions bot commented May 6, 2022

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label May 6, 2022
@mroeschke
Copy link
Member

Is this still relevant @jonashaag or has this been superseded by your other PRs?

@jonashaag
Copy link
Contributor Author

Superseded.

@jonashaag jonashaag closed this Jun 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Library building on various platforms CI Continuous Integration Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants