Skip to content

AMI housekeeper deletes all AMIs #4571

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
nickatct opened this issue Apr 29, 2025 · 9 comments
Open

AMI housekeeper deletes all AMIs #4571

nickatct opened this issue Apr 29, 2025 · 9 comments
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@nickatct
Copy link

Since the migration of the launch template using SSM for AMI lookup, the AMI housekeeper function will delete all existing AMIs.

The filter here uses ami-id instead of ami_id, the latter of which is in the parameter name. As a result, the AMI in-use will be mistakenly considered not-in-use, thus deleted.

In addition, the ssmParameterNames in options parameter are not used. Perhaps the specified names should be used in the filter.

@npalm npalm added bug Something isn't working help wanted Extra attention is needed labels Apr 30, 2025
@npalm
Copy link
Member

npalm commented Apr 30, 2025

Thx for tyour repot, the ami housekeper was not taken into account by rewriting the launch template logic in #4517

With the move to SSM for the template, the ami cleaner should check which AMI is via SSM referred in the template.

@npalm
Copy link
Member

npalm commented Apr 30, 2025

For now there will be no other solution than just disable the AMI cleaner.

@npalm npalm pinned this issue Apr 30, 2025
@npalm
Copy link
Member

npalm commented Apr 30, 2025

@nickatct right now I have no time to make a proper implementation and test it. But in case you have time it would be very helpfull you can check this update cleanup functions:

If you set the logger for the lambda to debug it should also log relevant inforamtion about which images it found in usage, and not deleting.

@npalm
Copy link
Member

npalm commented Apr 30, 2025

@nickatct could be the case you have to grant permission to the housekeeper to get the ssm paramater.

@nickatct
Copy link
Author

@nickatct could be the case you have to grant permission to the housekeeper to get the ssm paramater.

@npalm The housekeeper lambda has permission to get the ssm parameter. The problem is ParameterFilters uses ami-id instead of ami_id to filter. (hyphen vs underscore). The parameter name is /github-action-runners/github-actions/runners/config/ami_id. If you put ami_id on line 198, I think the housekeeper will work properly.

@npalm
Copy link
Member

npalm commented May 1, 2025

I think the bug is that it is not respecting ami refered via the template. So therefor I do first a lookup of the active template, next loop-up the refered ami via ssm.

@npalm
Copy link
Member

npalm commented May 1, 2025

I had a more detail look at the code. I think there are some bugs in the code

The paramaters Nbame inject is totally ignored, so it is only working for parameters with wmi-id in the name. Which means I think the problem is not introduced in the PR / release refered but there from day 1.

@nickatct
Copy link
Author

nickatct commented May 1, 2025

It was working for me because the ami id was in the launch template directly. It might have been an issue not surfaced before. The parameter name uses underscore _ at creation.

@mattgodbolt
Copy link

I think it looks like we just got hit by this too. We upgraded after a decent while and our (in-use) images all got deleted :\ I've disabled it for now, but ... this seems like kinda a high priority issue to look at?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

3 participants