Skip to content
This repository was archived by the owner on Jan 16, 2025. It is now read-only.

Commit 27e974d

Browse files
npalmScottGuymer
andauthored
feat: Replace run instance API by create fleet API (#1556)
* Replace creating instance by the fleet api * Enable create fleet API * Enable create fleet API * refactor * refactor * fix tests * cleanup * refactor * refactor, prepare for creating muliple runners * revert and typo * revert and typo * Adjust for ARM * refactor * Update README.md Co-authored-by: Scott Guymer <[email protected]> * Update modules/runners/scale-up.tf Co-authored-by: Scott Guymer <[email protected]> * Update modules/runners/variables.tf Co-authored-by: Scott Guymer <[email protected]> * Update modules/runners/variables.tf Co-authored-by: Scott Guymer <[email protected]> * Update variables.tf Co-authored-by: Scott Guymer <[email protected]> * Update README.md Co-authored-by: Scott Guymer <[email protected]> * review * typo Co-authored-by: Scott Guymer <[email protected]>
1 parent 7907984 commit 27e974d

31 files changed

+1005
-628
lines changed

Diff for: README.md

+9-12
Original file line numberDiff line numberDiff line change
@@ -87,6 +87,7 @@ To be able to support a number of use-cases the module has quite a lot configura
8787
- Linux vs Windows. you can configure the os types linux and win. Linux will be used by default.
8888
- Re-use vs Ephemeral. By default runners are re-used for till detected idle, once idle they will be removed from the pool. To improve security we are introducing ephemeral runners. Those runners are only used for one job. Ephemeral runners are only working in combination with the workflow job event. We also suggest to use a pre-build AMI to improve the start time of jobs.
8989
- GitHub cloud vs GitHub enterprise server (GHES). The runner support GitHub cloud as well GitHub enterprise service. For GHES we rely on our community to test and support. We have no possibility to test ourselves on GHES.
90+
- Spot vs on-demand. The runners using either the EC2 spot or on-demand life cycle. Runners will be created via the AWS [CreateFleet API](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_CreateFleet.html). The module (scale up lambda) will request an instance via the create fleet API in one of the subnets and matching one of the specified instance types.
9091

9192

9293
#### ARM64 support via Graviton/Graviton2 instance-types
@@ -325,15 +326,7 @@ The following sub modules are optional and are provided as example or utility:
325326

326327
### ARM64 configuration for submodules
327328

328-
When not using the top-level module and specifying an `a1`, `t4g` or `*6g*` (6th-gen Graviton2) `instance_type`, the `runner-binaries-syncer` and `runners` submodules need to be configured appropriately for pulling the ARM64 GitHub action runner binary and leveraging the arm64 AMI for the runners.
329-
330-
When configuring `runner-binaries-syncer`
331-
332-
- _runner_architecture_ - set to `arm64`, defaults to `x64`
333-
334-
When configuring `runners`
335-
336-
- _ami_filter_ - set to `["amzn2-ami-hvm-2*-arm64-gp2"]`, defaults to `["amzn2-ami-hvm-2.*-x86_64-ebs"]`
329+
When using the top-level module configure `runner_architecture = arm64` and ensure the list of `instance_types` matches. When not using the top-level ensure the bot properties are set on the submodules.
337330

338331
## Debugging
339332

@@ -401,9 +394,12 @@ In case the setup does not work as intended follow the trace of events:
401394
| <a name="input_ghes_url"></a> [ghes\_url](#input\_ghes\_url) | GitHub Enterprise Server URL. Example: https://github.internal.co - DO NOT SET IF USING PUBLIC GITHUB | `string` | `null` | no |
402395
| <a name="input_github_app"></a> [github\_app](#input\_github\_app) | GitHub app parameters, see your github app. Ensure the key is the base64-encoded `.pem` file (the output of `base64 app.private-key.pem`, not the content of `private-key.pem`). | <pre>object({<br> key_base64 = string<br> id = string<br> webhook_secret = string<br> })</pre> | n/a | yes |
403396
| <a name="input_idle_config"></a> [idle\_config](#input\_idle\_config) | List of time period that can be defined as cron expression to keep a minimum amount of runners active instead of scaling down to 0. By defining this list you can ensure that in time periods that match the cron expression within 5 seconds a runner is kept idle. | <pre>list(object({<br> cron = string<br> timeZone = string<br> idleCount = number<br> }))</pre> | `[]` | no |
397+
| <a name="input_instance_allocation_strategy"></a> [instance\_allocation\_strategy](#input\_instance\_allocation\_strategy) | The allocation strategy for spot instances. AWS recommends to use `capacity-optimized` however the AWS default is `lowest-price`. | `string` | `"lowest-price"` | no |
398+
| <a name="input_instance_max_spot_price"></a> [instance\_max\_spot\_price](#input\_instance\_max\_spot\_price) | Max price price for spot intances per hour. This variable will be passed to the create fleet as max spot price for the fleet. | `string` | `null` | no |
404399
| <a name="input_instance_profile_path"></a> [instance\_profile\_path](#input\_instance\_profile\_path) | The path that will be added to the instance\_profile, if not set the environment name will be used. | `string` | `null` | no |
405-
| <a name="input_instance_type"></a> [instance\_type](#input\_instance\_type) | [DEPRECATED] See instance\_types. | `string` | `"m5.large"` | no |
406-
| <a name="input_instance_types"></a> [instance\_types](#input\_instance\_types) | List of instance types for the action runner. Defaults are based on runner\_os (amzn2 for linux and Windows Server Core for win). | `list(string)` | `null` | no |
400+
| <a name="input_instance_target_capacity_type"></a> [instance\_target\_capacity\_type](#input\_instance\_target\_capacity\_type) | Default lifecycle used for runner instances, can be either `spot` or `on-demand`. | `string` | `"spot"` | no |
401+
| <a name="input_instance_type"></a> [instance\_type](#input\_instance\_type) | [DEPRECATED] See instance\_types. | `string` | `null` | no |
402+
| <a name="input_instance_types"></a> [instance\_types](#input\_instance\_types) | List of instance types for the action runner. Defaults are based on runner\_os (amzn2 for linux and Windows Server Core for win). | `list(string)` | <pre>[<br> "m5.large",<br> "c5.large"<br>]</pre> | no |
407403
| <a name="input_job_queue_retention_in_seconds"></a> [job\_queue\_retention\_in\_seconds](#input\_job\_queue\_retention\_in\_seconds) | The number of seconds the job is held in the queue before it is purged | `number` | `86400` | no |
408404
| <a name="input_key_name"></a> [key\_name](#input\_key\_name) | Key pair name | `string` | `null` | no |
409405
| <a name="input_kms_key_arn"></a> [kms\_key\_arn](#input\_kms\_key\_arn) | Optional CMK Key ARN to be used for Parameter Store. This key must be in the current account. | `string` | `null` | no |
@@ -414,14 +410,15 @@ In case the setup does not work as intended follow the trace of events:
414410
| <a name="input_log_level"></a> [log\_level](#input\_log\_level) | Logging level for lambda logging. Valid values are 'silly', 'trace', 'debug', 'info', 'warn', 'error', 'fatal'. | `string` | `"info"` | no |
415411
| <a name="input_log_type"></a> [log\_type](#input\_log\_type) | Logging format for lambda logging. Valid values are 'json', 'pretty', 'hidden'. | `string` | `"pretty"` | no |
416412
| <a name="input_logging_retention_in_days"></a> [logging\_retention\_in\_days](#input\_logging\_retention\_in\_days) | Specifies the number of days you want to retain log events for the lambda log group. Possible values are: 0, 1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1827, and 3653. | `number` | `180` | no |
417-
| <a name="input_market_options"></a> [market\_options](#input\_market\_options) | Market options for the action runner instances. Setting the value to `null` let the scaler create on-demand instances instead of spot instances. | `string` | `"spot"` | no |
413+
| <a name="input_market_options"></a> [market\_options](#input\_market\_options) | DEPCRECATED: Replaced by `instance_target_capacity_type`. | `string` | `null` | no |
418414
| <a name="input_minimum_running_time_in_minutes"></a> [minimum\_running\_time\_in\_minutes](#input\_minimum\_running\_time\_in\_minutes) | The time an ec2 action runner should be running at minimum before terminated if not busy. | `number` | `null` | no |
419415
| <a name="input_redrive_build_queue"></a> [redrive\_build\_queue](#input\_redrive\_build\_queue) | Set options to attach (optional) a dead letter queue to the build queue, the queue between the webhook and the scale up lambda. You have the following options. 1. Disable by setting, `enalbed' to false. 2. Enable by setting `enabled` to `true`, `maxReceiveCount` to a number of max retries.` | <pre>object({<br> enabled = bool<br> maxReceiveCount = number<br> })</pre> | <pre>{<br> "enabled": false,<br> "maxReceiveCount": null<br>}</pre> | no |
420416
| <a name="input_repository_white_list"></a> [repository\_white\_list](#input\_repository\_white\_list) | List of repositories allowed to use the github app | `list(string)` | `[]` | no |
421417
| <a name="input_role_path"></a> [role\_path](#input\_role\_path) | The path that will be added to role path for created roles, if not set the environment name will be used. | `string` | `null` | no |
422418
| <a name="input_role_permissions_boundary"></a> [role\_permissions\_boundary](#input\_role\_permissions\_boundary) | Permissions boundary that will be added to the created roles. | `string` | `null` | no |
423419
| <a name="input_runner_additional_security_group_ids"></a> [runner\_additional\_security\_group\_ids](#input\_runner\_additional\_security\_group\_ids) | (optional) List of additional security groups IDs to apply to the runner | `list(string)` | `[]` | no |
424420
| <a name="input_runner_allow_prerelease_binaries"></a> [runner\_allow\_prerelease\_binaries](#input\_runner\_allow\_prerelease\_binaries) | Allow the runners to update to prerelease binaries. | `bool` | `false` | no |
421+
| <a name="input_runner_architecture"></a> [runner\_architecture](#input\_runner\_architecture) | The platform architecture of the runner instance\_type. | `string` | `"x64"` | no |
425422
| <a name="input_runner_as_root"></a> [runner\_as\_root](#input\_runner\_as\_root) | Run the action runner under the root user. Variable `runner_run_as` will be ingored. | `bool` | `false` | no |
426423
| <a name="input_runner_binaries_s3_sse_configuration"></a> [runner\_binaries\_s3\_sse\_configuration](#input\_runner\_binaries\_s3\_sse\_configuration) | Map containing server-side encryption configuration for runner-binaries S3 bucket. | `any` | `{}` | no |
427424
| <a name="input_runner_binaries_syncer_lambda_timeout"></a> [runner\_binaries\_syncer\_lambda\_timeout](#input\_runner\_binaries\_syncer\_lambda\_timeout) | Time out of the binaries sync lambda in seconds. | `number` | `300` | no |

Diff for: examples/ephemeral/main.tf

+2-2
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
locals {
2-
environment = "ephemeraal"
2+
environment = "ephemeral"
33
aws_region = "eu-west-1"
44
}
55

@@ -60,7 +60,7 @@ module "runners" {
6060
# ami_owners = [data.aws_caller_identity.current.account_id]
6161

6262
# Enable logging
63-
# log_level = "debug"
63+
# log_level = "debug"
6464

6565
# Setup a dead letter queue, by default scale up lambda will kepp retrying to process event in case of scaling error.
6666
# redrive_policy_build_queue = {

Diff for: main.tf

+8-8
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,6 @@ locals {
55
})
66

77
s3_action_runner_url = "s3://${module.runner_binaries.bucket.id}/${module.runner_binaries.runner_distribution_object_key}"
8-
runner_architecture = substr(var.instance_type, 0, 2) == "a1" || substr(var.instance_type, 0, 3) == "t4g" || substr(var.instance_type, 1, 2) == "6g" ? "arm64" : "x64"
98
github_app_parameters = {
109
id = module.ssm.parameters.github_app_id
1110
key_base64 = module.ssm.parameters.github_app_key_base64
@@ -91,13 +90,14 @@ module "runners" {
9190
s3_bucket_runner_binaries = module.runner_binaries.bucket
9291
s3_location_runner_binaries = local.s3_action_runner_url
9392

94-
runner_os = var.runner_os
95-
instance_type = var.instance_type
96-
instance_types = var.instance_types
97-
market_options = var.market_options
98-
block_device_mappings = var.block_device_mappings
93+
runner_os = var.runner_os
94+
instance_types = var.instance_types
95+
instance_target_capacity_type = var.instance_target_capacity_type
96+
instance_allocation_strategy = var.instance_allocation_strategy
97+
instance_max_spot_price = var.instance_max_spot_price
98+
block_device_mappings = var.block_device_mappings
9999

100-
runner_architecture = local.runner_architecture
100+
runner_architecture = var.runner_architecture
101101
ami_filter = var.ami_filter
102102
ami_owners = var.ami_owners
103103

@@ -169,7 +169,7 @@ module "runner_binaries" {
169169
distribution_bucket_name = "${var.environment}-dist-${random_string.random.result}"
170170

171171
runner_os = var.runner_os
172-
runner_architecture = local.runner_architecture
172+
runner_architecture = var.runner_architecture
173173
runner_allow_prerelease_binaries = var.runner_allow_prerelease_binaries
174174

175175
lambda_s3_bucket = var.lambda_s3_bucket

Diff for: modules/runner-binaries-syncer/README.md

+1-1
Original file line numberDiff line numberDiff line change
@@ -91,7 +91,7 @@ No modules.
9191
| <a name="input_role_path"></a> [role\_path](#input\_role\_path) | The path that will be added to the role, if not set the environment name will be used. | `string` | `null` | no |
9292
| <a name="input_role_permissions_boundary"></a> [role\_permissions\_boundary](#input\_role\_permissions\_boundary) | Permissions boundary that will be added to the created role for the lambda. | `string` | `null` | no |
9393
| <a name="input_runner_allow_prerelease_binaries"></a> [runner\_allow\_prerelease\_binaries](#input\_runner\_allow\_prerelease\_binaries) | Allow the runners to update to prerelease binaries. | `bool` | `false` | no |
94-
| <a name="input_runner_architecture"></a> [runner\_architecture](#input\_runner\_architecture) | The platform architecture for the runner instance (x64, arm64), defaults to 'x64' | `string` | `"x64"` | no |
94+
| <a name="input_runner_architecture"></a> [runner\_architecture](#input\_runner\_architecture) | The platform architecture of the runner instance\_type. | `string` | `"x64"` | no |
9595
| <a name="input_runner_os"></a> [runner\_os](#input\_runner\_os) | The operating system for the runner instance (linux, win), defaults to 'linux' | `string` | `"linux"` | no |
9696
| <a name="input_server_side_encryption_configuration"></a> [server\_side\_encryption\_configuration](#input\_server\_side\_encryption\_configuration) | Map containing server-side encryption configuration. | `any` | `{}` | no |
9797
| <a name="input_syncer_lambda_s3_key"></a> [syncer\_lambda\_s3\_key](#input\_syncer\_lambda\_s3\_key) | S3 key for syncer lambda function. Required if using S3 bucket to specify lambdas. | `any` | `null` | no |

Diff for: modules/runner-binaries-syncer/variables.tf

+8-1
Original file line numberDiff line numberDiff line change
@@ -61,9 +61,16 @@ variable "runner_os" {
6161
}
6262

6363
variable "runner_architecture" {
64-
description = "The platform architecture for the runner instance (x64, arm64), defaults to 'x64'"
64+
description = "The platform architecture of the runner instance_type."
6565
type = string
6666
default = "x64"
67+
validation {
68+
condition = anytrue([
69+
var.runner_architecture == "x64",
70+
var.runner_architecture == "arm64",
71+
])
72+
error_message = "`runner_architecture` value not valid, valid values are: `x64` and `arm64`."
73+
}
6774
}
6875

6976
variable "logging_retention_in_days" {

Diff for: modules/runners/README.md

+4-1
Original file line numberDiff line numberDiff line change
@@ -128,7 +128,10 @@ No modules.
128128
| <a name="input_ghes_url"></a> [ghes\_url](#input\_ghes\_url) | GitHub Enterprise Server URL. DO NOT SET IF USING PUBLIC GITHUB | `string` | `null` | no |
129129
| <a name="input_github_app_parameters"></a> [github\_app\_parameters](#input\_github\_app\_parameters) | Parameter Store for GitHub App Parameters. | <pre>object({<br> key_base64 = map(string)<br> id = map(string)<br> })</pre> | n/a | yes |
130130
| <a name="input_idle_config"></a> [idle\_config](#input\_idle\_config) | List of time period that can be defined as cron expression to keep a minimum amount of runners active instead of scaling down to 0. By defining this list you can ensure that in time periods that match the cron expression within 5 seconds a runner is kept idle. | <pre>list(object({<br> cron = string<br> timeZone = string<br> idleCount = number<br> }))</pre> | `[]` | no |
131+
| <a name="input_instance_allocation_strategy"></a> [instance\_allocation\_strategy](#input\_instance\_allocation\_strategy) | The allocation strategy for spot instances. AWS recommends to use `capacity-optimized` however the AWS default is `lowest-price`. | `string` | `"lowest-price"` | no |
132+
| <a name="input_instance_max_spot_price"></a> [instance\_max\_spot\_price](#input\_instance\_max\_spot\_price) | Max price price for spot intances per hour. This variable will be passed to the create fleet as max spot price for the fleet. | `string` | `null` | no |
131133
| <a name="input_instance_profile_path"></a> [instance\_profile\_path](#input\_instance\_profile\_path) | The path that will be added to the instance\_profile, if not set the environment name will be used. | `string` | `null` | no |
134+
| <a name="input_instance_target_capacity_type"></a> [instance\_target\_capacity\_type](#input\_instance\_target\_capacity\_type) | Default lifecyle used runner instances, can be either `spot` or `on-demand`. | `string` | `"spot"` | no |
132135
| <a name="input_instance_type"></a> [instance\_type](#input\_instance\_type) | [DEPRECATED] See instance\_types. | `string` | `"m5.large"` | no |
133136
| <a name="input_instance_types"></a> [instance\_types](#input\_instance\_types) | List of instance types for the action runner. Defaults are based on runner\_os (amzn2 for linux and Windows Server Core for win). | `list(string)` | `null` | no |
134137
| <a name="input_key_name"></a> [key\_name](#input\_key\_name) | Key pair name | `string` | `null` | no |
@@ -142,7 +145,7 @@ No modules.
142145
| <a name="input_log_level"></a> [log\_level](#input\_log\_level) | Logging level for lambda logging. Valid values are 'silly', 'trace', 'debug', 'info', 'warn', 'error', 'fatal'. | `string` | `"info"` | no |
143146
| <a name="input_log_type"></a> [log\_type](#input\_log\_type) | Logging format for lambda logging. Valid values are 'json', 'pretty', 'hidden'. | `string` | `"pretty"` | no |
144147
| <a name="input_logging_retention_in_days"></a> [logging\_retention\_in\_days](#input\_logging\_retention\_in\_days) | Specifies the number of days you want to retain log events for the lambda log group. Possible values are: 0, 1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1827, and 3653. | `number` | `180` | no |
145-
| <a name="input_market_options"></a> [market\_options](#input\_market\_options) | Market options for the action runner instances. | `string` | `"spot"` | no |
148+
| <a name="input_market_options"></a> [market\_options](#input\_market\_options) | DEPCRECATED: Replaced by `instance_target_capacity_type`. | `string` | `null` | no |
146149
| <a name="input_metadata_options"></a> [metadata\_options](#input\_metadata\_options) | Metadata options for the ec2 runner instances. | `map(any)` | <pre>{<br> "http_endpoint": "enabled",<br> "http_put_response_hop_limit": 1,<br> "http_tokens": "optional"<br>}</pre> | no |
147150
| <a name="input_minimum_running_time_in_minutes"></a> [minimum\_running\_time\_in\_minutes](#input\_minimum\_running\_time\_in\_minutes) | The time an ec2 action runner should be running at minimum before terminated if non busy. If not set the default is calculated based on the OS. | `number` | `null` | no |
148151
| <a name="input_overrides"></a> [overrides](#input\_overrides) | This map provides the possibility to override some defaults. The following attributes are supported: `name_sg` overrides the `Name` tag for all security groups created by this module. `name_runner_agent_instance` overrides the `Name` tag for the ec2 instance defined in the auto launch configuration. `name_docker_machine_runners` overrides the `Name` tag spot instances created by the runner agent. | `map(string)` | <pre>{<br> "name_runner": "",<br> "name_sg": ""<br>}</pre> | no |

Diff for: modules/runners/lambdas/runners/jest.config.js

+7-7
Original file line numberDiff line numberDiff line change
@@ -2,13 +2,13 @@ module.exports = {
22
preset: 'ts-jest',
33
testEnvironment: 'node',
44
collectCoverage: true,
5-
collectCoverageFrom: ['src/**/*.{ts,js,jsx}','!src/**/*local*.ts'],
5+
collectCoverageFrom: ['src/**/*.{ts,js,jsx}', '!src/**/*local*.ts', 'src/**/*.d.ts'],
66
coverageThreshold: {
77
global: {
8-
branches: 80,
9-
functions: 80,
10-
lines: 80,
11-
statements: 80
12-
}
13-
}
8+
branches: 92,
9+
functions: 92,
10+
lines: 92,
11+
statements: 92,
12+
},
13+
},
1414
};

0 commit comments

Comments
 (0)