fix: only tag spot requests if no on-demand fallback #4585

6 changes: 5 additions & 1 deletion modules/runners/main.tf
@@ -206,8 +206,12 @@ resource "aws_launch_template" "runner" {
)
}

+ # We avoid including the "spot-instances-request" tag_specifications block when on_demand_failover_for_errors is defined,
+ # because when using on-demand fallback, the spot instance request resource is not created and thus the tags would not apply.
+ # Additionally, tagging spot requests via the CreateFleetCommand in the Lambda function does not work as expected,
+ # so we rely on Terraform to manage these tags only when spot is exclusively used without on-demand failover.
  dynamic "tag_specifications" {
- for_each = var.instance_target_capacity_type == "spot" ? [1] : [] # Include the block only if the value is "spot"
+ for_each = var.instance_target_capacity_type == "spot" && var.enable_on_demand_failover_for_errors == null ? [1] : [] # Include the block only if the value is "spot" and on_demand_failover_for_errors is not enabled
Member

This will solve the problem, but it also means the spot request is no longer tagged when on-demand failover is active. In my view, a better place for the fix would be the Lambda.

const instancesOnDemand = await createRunner({

Would you have time to provide a fix in the lambda?

Author

That was my initial approach, but I couldn't find a way to overwrite the tags directly in the Lambda.

I'll take a second look to see if I can find a solution.

Author

I don’t think we can apply a simple fix in the Lambda.

Since the tags are defined in the launch template, it’s not possible to override them using the CreateFleetCommand.

We could remove the spot-instances-request tags from the launch template and set them only in the Lambda’s CreateFleetCommand, but in that case we would lose access to the tags defined in the Terraform configuration.

I don’t see a simple solution for now. Do you have any thoughts on this?

Thanks


I arrived at the same conclusion when I investigated a workaround.

Member

@npalm, May 21, 2025

I think we should consider moving the spot request tagging to runner.ts instead of setting it in the template. That way we can apply it properly depending on whether a spot instance is requested. Some tags are already set there. Would you like to give it a shot?

see

    TagSpecifications: [
      {
        ResourceType: 'instance',
        Tags: tags,
      },
      {
        ResourceType: 'volume',
        Tags: tags,
      },
    ],
    Type: 'instant',
  });
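
As a rough illustration of the suggestion above, the following sketch builds the `TagSpecifications` list in the Lambda and adds a `spot-instances-request` entry only when spot capacity is requested. The helper name `buildTagSpecifications`, the `capacityType` parameter, and the local `Ec2Tag`/`TagSpec` types are hypothetical and do not come from runner.ts. Whether EC2 accepts this resource type in a `CreateFleet` call of type `instant` is not established in this thread, so treat it as an untested assumption.

```typescript
// Local stand-ins for the AWS SDK shapes, so the sketch runs without dependencies.
interface Ec2Tag {
  Key: string;
  Value: string;
}

interface TagSpec {
  ResourceType: string;
  Tags: Ec2Tag[];
}

type CapacityType = 'spot' | 'on-demand';

// Hypothetical helper: always tags instances and volumes (as the snippet above does),
// and adds a spot request entry only when spot capacity is requested.
function buildTagSpecifications(tags: Ec2Tag[], capacityType: CapacityType): TagSpec[] {
  const specs: TagSpec[] = [
    { ResourceType: 'instance', Tags: tags },
    { ResourceType: 'volume', Tags: tags },
  ];
  if (capacityType === 'spot') {
    // Assumption: not verified against the EC2 API.
    specs.push({ ResourceType: 'spot-instances-request', Tags: tags });
  }
  return specs;
}
```

With this shape, the on-demand fallback call would pass `'on-demand'` and simply omit the spot request entry.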

Author

@npalm I tried this approach, but we would lose all the tags defined in the Terraform module. Only the following tags would be applied:

  const tags = [
    { Key: 'ghr:Application', Value: 'github-action-runner' },
    { Key: 'ghr:created_by', Value: runnerParameters.numberOfRunners === 1 ? 'scale-up-lambda' : 'pool-lambda' },
    { Key: 'ghr:Type', Value: runnerParameters.runnerType },
    { Key: 'ghr:Owner', Value: runnerParameters.runnerOwner },
  ];

However, I can try fetching the tags from the launch template first and then applying them to the spot-instances-request via the TagSpecifications.
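
A minimal sketch of that idea, assuming the launch template's tag specifications have already been fetched (for example through the EC2 `DescribeLaunchTemplateVersions` API). It only shows the merge step: the helper name `spotRequestTags` and the local types are hypothetical, and letting the Lambda tags win on key conflicts is a choice made for this sketch, not something decided in the PR.

```typescript
// Local stand-ins for the AWS SDK shapes, so the sketch runs without dependencies.
interface Ec2Tag {
  Key: string;
  Value: string;
}

interface TemplateTagSpec {
  ResourceType?: string;
  Tags?: Ec2Tag[];
}

// Hypothetical helper: collects the tags the launch template defines for
// 'spot-instances-request' and merges the Lambda's own tags on top of them.
function spotRequestTags(templateSpecs: TemplateTagSpec[], lambdaTags: Ec2Tag[]): Ec2Tag[] {
  const merged = new Map<string, string>();
  for (const spec of templateSpecs) {
    if (spec.ResourceType !== 'spot-instances-request') continue;
    for (const tag of spec.Tags ?? []) {
      merged.set(tag.Key, tag.Value);
    }
  }
  // Lambda tags override template tags that share the same key.
  for (const tag of lambdaTags) {
    merged.set(tag.Key, tag.Value);
  }
  const result: Ec2Tag[] = [];
  merged.forEach((Value, Key) => result.push({ Key, Value }));
  return result;
}
```

The result could then be passed as the `Tags` of a `spot-instances-request` entry, so the tags defined in the Terraform module are preserved alongside the `ghr:*` tags.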

Author

I pushed the changes, but I’m not really sure how to test it in real conditions

Member

I will run some tests to check that spot requests are correctly tagged. As far as I know, the case where spot capacity is unavailable is not testable.

content {
resource_type = "spot-instances-request"
tags = merge(