Change Concurrency Group Name #403

Merged
merged 3 commits into from
Oct 1, 2022

FoamyGuy
Contributor

I suspect the concurrency group name is the problem: this repo's images.yml and the one over in the Learn repo (https://github.com/adafruit/Adafruit_Learning_System_Guides/blob/4aab0129c3fce57580839647b1bd948c185dd13b/.github/workflows/images.yml#L11) both use the group name folder-images, and that seems to be causing these Actions runs to get (sort of) cancelled. This page shows the list of supposedly cancelled runs for building the images:
https://github.com/adafruit/Adafruit_CircuitPython_Bundle/actions?query=workflow%3A%22Generate+folder+images%22++

They all report a similar error:

update-images
Canceling since a higher priority waiting request for 'folder-images' exists

Strangely, though, they also seem to complete all of the actual tasks in the workflow successfully. As far as I can tell, the images are getting generated and committed even though the runs are marked as cancelled.

My current working theory is that this repo and the Learn guide repo both have a cron / schedule trigger set to run at the same time (0 10 * * *) and both use the same group name, folder-images. I think the two workflows trigger at the same moment and "cancel each other out", or at least some parts of GitHub mark them as cancelled.

I tested this in my own forks of these two repos and found that I was able to get both workflows to run and show successfully when scheduled at the same time by using a different group name for one of them.

The successful scheduled runs are here:

In order to test them without waiting until 10:00 am UTC I temporarily changed the cron times to something different that was a few minutes ahead of the time I made the commits so I could watch them run.
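For reference, a minimal sketch of the kind of change tested in the forks, not the exact diff from this PR. The workflow name, the original group name folder-images, and the real schedule come from this thread; the replacement group name `bundle-folder-images` and the temporary cron time are hypothetical placeholders.

```yaml
name: Generate folder images

on:
  schedule:
    # Temporary test value (hypothetical); the real schedule in both repos is '0 10 * * *'.
    - cron: '25 14 * * *'

concurrency:
  # Hypothetical new name; the only requirement for the test is that it
  # differs from the 'folder-images' group used in the Learn repo.
  group: bundle-folder-images
  cancel-in-progress: true

# jobs: unchanged from the existing images.yml
```

Only one of the two workflows needs the renamed group for the names to stop matching.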

These cancelled runs were discussed on Discord by @jepler, @kattni, and myself on 9/29 https://discord.com/channels/327254708534116352/327298996332658690/1025138001627578469

I can't really rule out some other difference between the adafruit repos and my forks that could be a factor as well; all I can say is that I was able to get successful runs in my forks. The true test would be merging this (or a change to the group name in the Learn repo) and then waiting until the next day at 10:00 UTC to see whether the runs show full success.

@FoamyGuy FoamyGuy requested a review from a team September 30, 2022 23:21
Contributor Author

FoamyGuy commented Oct 1, 2022

Maybe this theory is not the real reason. The Actions concurrency docs state:

When a concurrent job or workflow is queued, if another job or workflow using the same concurrency group in the repository is in progress, the queued job or workflow will be pending. Any previously pending job or workflow in the concurrency group will be canceled. To also cancel any currently running job or workflow in the same concurrency group, specify cancel-in-progress: true.

So concurrency groups are intended to be scoped per repository, which means these two repos shouldn't be interfering with each other. But both do show this odd "completed successfully, but also cancelled" status.
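If the scoping is in doubt, one way to take it out of the equation is to build the repository name into the group itself. This is only a sketch of an alternative, not what this PR does; it assumes the `github.repository` context value (owner/repo) is acceptable as part of a group name.

```yaml
concurrency:
  # Expands to e.g. 'folder-images-adafruit/Adafruit_CircuitPython_Bundle',
  # so two repos can never end up with the same group string.
  group: folder-images-${{ github.repository }}
  cancel-in-progress: true
```

With this form the same workflow file could be copied between repos without the group names colliding.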

Contributor

jepler commented Oct 1, 2022

The fact that it disagrees with the documentation doesn't mean we shouldn't try it...

@jepler jepler merged commit a7c61b4 into adafruit:main Oct 1, 2022
Contributor

jepler commented Oct 14, 2022

I had a communication from github about this:

Hi Jeff,

Thank you for your patience while the team investigated this.

The job was canceled because in the workflow file, cancel-in-progress is configured to true, so if a workflow is triggered (in this case it was on schedule) while the existing run is still running, it will attempt to cancel this.

However, what happened was the job was not "cancelable" since it is already completed so the job appears completed; however, the run is marked as canceled accordingly.

We can see that subsequent runs do not have this issue, could you let us know if you changed anything?

Regards,
Maureen.

I responded:

We made a speculative change in Adafruit_CircuitPython_Bundle the same day that the behavior changed: #403 -- merged at October 1 5:09 PM CDT, next workflow at October 2 5:03 AM CDT showed as successful rather than canceled.

My colleague noted that the same concurrency group name was being used in two of our repos and additionally the dispatch happened with exactly the same schedule (0 10 * * *)

Is it possible that, even though it's not how we read the documentation, the workflows in different repositories of an organization with equal concurrency group names can cancel one another?

It is an interesting coincidence.
