Commit e0ca252

feat(batch): set default spot allocation strategy to SPOT_PRICE_CAPACITY_OPTIMIZED (#26731)
https://aws.amazon.com/about-aws/whats-new/2023/08/aws-batch-price-capacity-optimized-allocation-strategy-spot-instances/ and https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-fleet-allocation-strategy.html

`SPOT_PRICE_CAPACITY_OPTIMIZED` is now recommended over `SPOT_CAPACITY_OPTIMIZED`; make it the new default, while the construct is still in alpha.

BREAKING CHANGE: if using spot instances on your Compute Environments, they will default to `SPOT_PRICE_CAPACITY_OPTIMIZED` instead of `SPOT_CAPACITY_OPTIMIZED` now.

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
1 parent ce2f844 commit e0ca252

11 files changed (+487 -145 lines changed)

packages/@aws-cdk/aws-batch-alpha/README.md

Lines changed: 13 additions & 8 deletions
@@ -128,19 +128,23 @@ computeEnv.addInstanceClass(ec2.InstanceClass.R4);
 
 #### Allocation Strategies
 
-| Allocation Strategy | Optimized for | Downsides |
-| ----------------------- | ------------- | ----------------------------- |
-| BEST_FIT | Cost | May limit throughput |
-| BEST_FIT_PROGRESSIVE | Throughput | May increase cost |
-| SPOT_CAPACITY_OPTIMIZED | Least interruption | Only useful on Spot instances |
+| Allocation Strategy | Optimized for | Downsides |
+| ----------------------- | ------------- | ----------------------------- |
+| BEST_FIT | Cost | May limit throughput |
+| BEST_FIT_PROGRESSIVE | Throughput | May increase cost |
+| SPOT_CAPACITY_OPTIMIZED | Least interruption | Only useful on Spot instances |
+| SPOT_PRICE_CAPACITY_OPTIMIZED | Least interruption + Price | Only useful on Spot instances |
 
 Batch provides different Allocation Strategies to help it choose which instances to provision.
 If your workflow tolerates interruptions, you should enable `spot` on your `ComputeEnvironment`
-and use `SPOT_CAPACITY_OPTIMIZED` (this is the default if `spot` is enabled).
+and use `SPOT_PRICE_CAPACITY_OPTIMIZED` (this is the default if `spot` is enabled).
 This will tell Batch to choose the instance types from the ones you’ve specified that have
-the most spot capacity available to minimize the chance of interruption.
+the most spot capacity available to minimize the chance of interruption and have the lowest price.
 To get the most benefit from your spot instances,
 you should allow Batch to choose from as many different instance types as possible.
+If you only care about minimal interruptions and not want Batch to optimize for cost, use
+`SPOT_CAPACITY_OPTIMIZED`. `SPOT_PRICE_CAPACITY_OPTIMIZED` is recommended over `SPOT_CAPACITY_OPTIMIZED`
+for most use cases.
 
 If your workflow does not tolerate interruptions and you want to minimize your costs at the expense
 of potentially longer waiting times, use `AllocationStrategy.BEST_FIT`.
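
The guidance in this README hunk can be read as a small decision rule. The sketch below is illustrative only: `WorkloadTraits` and `suggestStrategy` are hypothetical names and are not part of `aws-batch-alpha`. The fallback to `BEST_FIT_PROGRESSIVE` for workloads that neither tolerate interruptions nor prioritize cost is an assumption, based on the table's "Throughput" row and the construct's non-spot default.

```typescript
// Strategy names as they appear in the README table.
type Strategy =
  | 'BEST_FIT'
  | 'BEST_FIT_PROGRESSIVE'
  | 'SPOT_CAPACITY_OPTIMIZED'
  | 'SPOT_PRICE_CAPACITY_OPTIMIZED';

// Hypothetical description of a workload; not a CDK type.
interface WorkloadTraits {
  // True if jobs can be interrupted and retried (a fit for spot instances).
  toleratesInterruptions: boolean;
  // Spot only: set to false to optimize for minimal interruption alone.
  optimizeSpotPrice?: boolean;
  // Non-spot only: accept potentially longer waits in exchange for lower cost.
  minimizeCost?: boolean;
}

// Hypothetical helper that mirrors the README's prose recommendations.
function suggestStrategy(traits: WorkloadTraits): Strategy {
  if (traits.toleratesInterruptions) {
    // README: SPOT_PRICE_CAPACITY_OPTIMIZED is recommended for most use cases.
    return traits.optimizeSpotPrice === false
      ? 'SPOT_CAPACITY_OPTIMIZED'
      : 'SPOT_PRICE_CAPACITY_OPTIMIZED';
  }
  // Assumption: without a cost priority, fall back to the throughput-oriented strategy.
  return traits.minimizeCost ? 'BEST_FIT' : 'BEST_FIT_PROGRESSIVE';
}

console.log(suggestStrategy({ toleratesInterruptions: true }));
console.log(suggestStrategy({ toleratesInterruptions: false, minimizeCost: true }));
```

In a real stack the choice would be passed as `allocationStrategy` on the compute environment; leaving it unset lets the construct pick its default.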
@@ -189,7 +193,8 @@ const computeEnv = new batch.ManagedEc2EcsComputeEnvironment(this, 'myEc2Compute
 You can specify the maximum and minimum vCPUs a managed `ComputeEnvironment` can have at any given time.
 Batch will *always* maintain `minvCpus` worth of instances in your ComputeEnvironment, even if it is not executing any jobs,
 and even if it is disabled. Batch will scale the instances up to `maxvCpus` worth of instances as
-jobs exit the JobQueue and enter the ComputeEnvironment. If you use `AllocationStrategy.BEST_FIT_PROGRESSIVE` or `AllocationStrategy.SPOT_CAPACITY_OPTIMIZED`,
+jobs exit the JobQueue and enter the ComputeEnvironment. If you use `AllocationStrategy.BEST_FIT_PROGRESSIVE`,
+`AllocationStrategy.SPOT_PRICE_CAPACITY_OPTIMIZED`, or `AllocationStrategy.SPOT_CAPACITY_OPTIMIZED`,
 batch may exceed `maxvCpus`; it will never exceed `maxvCpus` by more than a single instance type. This example configures a
 `minvCpus` of 10 and a `maxvCpus` of 100:

packages/@aws-cdk/aws-batch-alpha/lib/managed-compute-environment.ts

Lines changed: 12 additions & 1 deletion
@@ -452,6 +452,15 @@ export enum AllocationStrategy {
    * you should allow Batch to choose from as many different instance types as possible.
    */
   SPOT_CAPACITY_OPTIMIZED = 'SPOT_CAPACITY_OPTIMIZED',
+
+  /**
+   * The price and capacity optimized allocation strategy looks at both price and capacity
+   * to select the Spot Instance pools that are the least likely to be interrupted
+   * and have the lowest possible price.
+   *
+   * The Batch team recommends this over `SPOT_CAPACITY_OPTIMIZED` in most instances.
+   */
+  SPOT_PRICE_CAPACITY_OPTIMIZED = 'SPOT_PRICE_CAPACITY_OPTIMIZED',
 }
 
 /**
@@ -1145,7 +1154,9 @@ function createSpotFleetRole(scope: Construct): IRole {
 function determineAllocationStrategy(id: string, allocationStrategy?: AllocationStrategy, spot?: boolean): AllocationStrategy | undefined {
   let result = allocationStrategy;
   if (!allocationStrategy) {
-    result = spot ? AllocationStrategy.SPOT_CAPACITY_OPTIMIZED : AllocationStrategy.BEST_FIT_PROGRESSIVE;
+    result = spot ? AllocationStrategy.SPOT_PRICE_CAPACITY_OPTIMIZED : AllocationStrategy.BEST_FIT_PROGRESSIVE;
+  } else if (allocationStrategy === AllocationStrategy.SPOT_PRICE_CAPACITY_OPTIMIZED && !spot) {
+    throw new Error(`Managed ComputeEnvironment '${id}' specifies 'AllocationStrategy.SPOT_PRICE_CAPACITY_OPTIMIZED' without using spot instances`);
   } else if (allocationStrategy === AllocationStrategy.SPOT_CAPACITY_OPTIMIZED && !spot) {
     throw new Error(`Managed ComputeEnvironment '${id}' specifies 'AllocationStrategy.SPOT_CAPACITY_OPTIMIZED' without using spot instances`);
   }
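
To make the behavior change easier to check in isolation, here is a minimal standalone sketch of the defaulting and validation rules visible in this hunk. It is not the library's code: the enum is a local stand-in so the snippet runs without the CDK, `resolveAllocationStrategy` is a hypothetical name, and the part of the real function that lies outside the hunk is not reproduced.

```typescript
// Local stand-in for the enum in managed-compute-environment.ts (assumption: same member names and values).
enum AllocationStrategy {
  BEST_FIT = 'BEST_FIT',
  BEST_FIT_PROGRESSIVE = 'BEST_FIT_PROGRESSIVE',
  SPOT_CAPACITY_OPTIMIZED = 'SPOT_CAPACITY_OPTIMIZED',
  SPOT_PRICE_CAPACITY_OPTIMIZED = 'SPOT_PRICE_CAPACITY_OPTIMIZED',
}

// Hypothetical helper mirroring the rules shown in the diff above.
function resolveAllocationStrategy(
  id: string,
  allocationStrategy?: AllocationStrategy,
  spot?: boolean,
): AllocationStrategy {
  if (!allocationStrategy) {
    // New default after this commit: spot environments get SPOT_PRICE_CAPACITY_OPTIMIZED.
    return spot
      ? AllocationStrategy.SPOT_PRICE_CAPACITY_OPTIMIZED
      : AllocationStrategy.BEST_FIT_PROGRESSIVE;
  }
  // Both spot strategies are rejected when spot instances are not enabled.
  const spotOnly =
    allocationStrategy === AllocationStrategy.SPOT_PRICE_CAPACITY_OPTIMIZED ||
    allocationStrategy === AllocationStrategy.SPOT_CAPACITY_OPTIMIZED;
  if (spotOnly && !spot) {
    throw new Error(
      `Managed ComputeEnvironment '${id}' specifies 'AllocationStrategy.${allocationStrategy}' without using spot instances`,
    );
  }
  return allocationStrategy;
}

console.log(resolveAllocationStrategy('env', undefined, true));  // SPOT_PRICE_CAPACITY_OPTIMIZED
console.log(resolveAllocationStrategy('env', undefined, false)); // BEST_FIT_PROGRESSIVE
```

Users who relied on the previous default can keep the old behavior by passing `AllocationStrategy.SPOT_CAPACITY_OPTIMIZED` explicitly together with `spot: true`.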

packages/@aws-cdk/aws-batch-alpha/test/integ.managed-compute-environment.js.snapshot/BatchManagedComputeEnvironmentTestDefaultTestDeployAssertD4528F80.assets.json

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 {
-  "version": "32.0.0",
+  "version": "33.0.0",
   "files": {
     "21fbb51d7b23f6a6c262b46a9caee79d744a3ac019fd45422d988b96d44b2a22": {
       "source": {

packages/@aws-cdk/aws-batch-alpha/test/integ.managed-compute-environment.js.snapshot/batch-stack.assets.json

Lines changed: 3 additions & 3 deletions
@@ -1,15 +1,15 @@
 {
-  "version": "32.0.0",
+  "version": "33.0.0",
   "files": {
-    "81f3134124cef368d56ccabda586dbcbef39a78089edd14c9d641cbcb4e0bad2": {
+    "c107f22b1a273d6b3e98ae47d04dfc2c17295a01e96b0b2a69ceaaad3ec33905": {
       "source": {
         "path": "batch-stack.template.json",
         "packaging": "file"
       },
       "destinations": {
         "current_account-current_region": {
           "bucketName": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}",
-          "objectKey": "81f3134124cef368d56ccabda586dbcbef39a78089edd14c9d641cbcb4e0bad2.json",
+          "objectKey": "c107f22b1a273d6b3e98ae47d04dfc2c17295a01e96b0b2a69ceaaad3ec33905.json",
           "assumeRoleArn": "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/cdk-hnb659fds-file-publishing-role-${AWS::AccountId}-${AWS::Region}"
         }
       }
