Commit 9e3e4a0

AWS committed
Amazon SageMaker Service Update: Heterogeneous clusters: the ability to launch training jobs with multiple instance types. This enables running component of the training job on the instance type that is most suitable for it. e.g. doing data processing and augmentation on CPU instances and neural network training on GPU instances
1 parent d246a23 commit 9e3e4a0

2 files changed

+55
-6
lines changed

@@ -0,0 +1,6 @@
+{
+    "type": "feature",
+    "category": "Amazon SageMaker Service",
+    "contributor": "",
+    "description": "Heterogeneous clusters: the ability to launch training jobs with multiple instance types. This enables running component of the training job on the instance type that is most suitable for it. e.g. doing data processing and augmentation on CPU instances and neural network training on GPU instances"
+}

services/sagemaker/src/main/resources/codegen-resources/service-2.json

+49-6
@@ -15131,6 +15131,45 @@
       "member":{"shape":"TrainingInputMode"},
       "min":1
     },
+    "InstanceGroup":{
+      "type":"structure",
+      "required":[
+        "InstanceType",
+        "InstanceCount",
+        "InstanceGroupName"
+      ],
+      "members":{
+        "InstanceType":{
+          "shape":"TrainingInstanceType",
+          "documentation":"<p>Specifies the instance type of the instance group.</p>"
+        },
+        "InstanceCount":{
+          "shape":"TrainingInstanceCount",
+          "documentation":"<p>Specifies the number of instances of the instance group.</p>"
+        },
+        "InstanceGroupName":{
+          "shape":"InstanceGroupName",
+          "documentation":"<p>Specifies the name of the instance group.</p>"
+        }
+      },
+      "documentation":"<p>Defines an instance group for heterogeneous cluster training. When requesting a training job using the <a href=\"https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrainingJob.html\">CreateTrainingJob</a> API, you can configure up to 5 different ML training instance groups.</p>"
+    },
+    "InstanceGroupName":{
+      "type":"string",
+      "max":64,
+      "min":1,
+      "pattern":".+"
+    },
+    "InstanceGroupNames":{
+      "type":"list",
+      "member":{"shape":"InstanceGroupName"},
+      "max":5
+    },
+    "InstanceGroups":{
+      "type":"list",
+      "member":{"shape":"InstanceGroup"},
+      "max":5
+    },
     "InstanceMetadataServiceConfiguration":{
       "type":"structure",
       "required":["MinimumInstanceMetadataServiceVersion"],
@@ -23439,11 +23478,7 @@
     },
     "ResourceConfig":{
       "type":"structure",
-      "required":[
-        "InstanceType",
-        "InstanceCount",
-        "VolumeSizeInGB"
-      ],
+      "required":["VolumeSizeInGB"],
       "members":{
         "InstanceType":{
           "shape":"TrainingInstanceType",
@@ -23460,6 +23495,10 @@
         "VolumeKmsKeyId":{
           "shape":"KmsKeyId",
           "documentation":"<p>The Amazon Web Services KMS key that SageMaker uses to encrypt data on the storage volume attached to the ML compute instance(s) that run the training job.</p> <note> <p>Certain Nitro-based instances include local storage, dependent on the instance type. Local storage volumes are encrypted using a hardware module on the instance. You can't request a <code>VolumeKmsKeyId</code> when using an instance type with local storage.</p> <p>For a list of instance types that support local instance storage, see <a href=\"https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html#instance-store-volumes\">Instance Store Volumes</a>.</p> <p>For more information about local instance storage encryption, see <a href=\"https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ssd-instance-store.html\">SSD Instance Store Volumes</a>.</p> </note> <p>The <code>VolumeKmsKeyId</code> can be in any of the following formats:</p> <ul> <li> <p>// KMS Key ID</p> <p> <code>\"1234abcd-12ab-34cd-56ef-1234567890ab\"</code> </p> </li> <li> <p>// Amazon Resource Name (ARN) of a KMS Key</p> <p> <code>\"arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab\"</code> </p> </li> </ul>"
+        },
+        "InstanceGroups":{
+          "shape":"InstanceGroups",
+          "documentation":"<p>The configuration of a heterogeneous cluster in JSON format.</p>"
         }
       },
       "documentation":"<p>Describes the resources, including ML compute instances and ML storage volumes, to use for model training. </p>"
@@ -23694,6 +23733,10 @@
         "AttributeNames":{
           "shape":"AttributeNames",
           "documentation":"<p>A list of one or more attribute names to use that are found in a specified augmented manifest file.</p>"
+        },
+        "InstanceGroupNames":{
+          "shape":"InstanceGroupNames",
+          "documentation":"<p>A list of names of instance groups that get data from the S3 data source.</p>"
         }
       },
       "documentation":"<p>Describes the S3 data source.</p>"
@@ -24996,7 +25039,7 @@
     },
     "TrainingInstanceCount":{
       "type":"integer",
-      "min":1
+      "min":0
     },
     "TrainingInstanceType":{
       "type":"string",

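For readers who want to see how the new shapes fit together, here is a minimal Python sketch, not part of the commit, that builds request fragments using `InstanceGroups` and `InstanceGroupNames` and checks them against the constraints in the diff above (at most 5 groups, names of 1 to 64 characters, `InstanceCount` of at least 0). The helper `validate_instance_groups`, the group names, the bucket URI, the volume size, and the chosen instance types are all hypothetical. The final cross-check that channel group names refer to defined groups is an assumption and is not stated in the schema.

```python
# Constraints copied from the shapes added in this commit.
MAX_GROUPS = 5        # "max":5 on InstanceGroups and InstanceGroupNames
MAX_NAME_LENGTH = 64  # "max":64 on InstanceGroupName (with "min":1)
REQUIRED_KEYS = ("InstanceType", "InstanceCount", "InstanceGroupName")


def validate_instance_groups(groups):
    """Hypothetical helper: return a list of problems, empty if the groups fit the shapes."""
    problems = []
    if len(groups) > MAX_GROUPS:
        problems.append(f"{len(groups)} groups given, at most {MAX_GROUPS} allowed")
    for index, group in enumerate(groups):
        for key in REQUIRED_KEYS:
            if key not in group:
                problems.append(f"group {index}: missing required member {key}")
        name = group.get("InstanceGroupName")
        if name is not None and not 1 <= len(name) <= MAX_NAME_LENGTH:
            problems.append(f"group {index}: name length must be 1..{MAX_NAME_LENGTH}")
        count = group.get("InstanceCount")
        if count is not None and count < 0:  # TrainingInstanceCount "min" is now 0
            problems.append(f"group {index}: InstanceCount must be >= 0")
    return problems


# Hypothetical ResourceConfig: only VolumeSizeInGB is still required at the top
# level, so InstanceType and InstanceCount are given per instance group.
resource_config = {
    "VolumeSizeInGB": 50,
    "InstanceGroups": [
        {"InstanceGroupName": "data_group", "InstanceType": "ml.c5.xlarge", "InstanceCount": 2},
        {"InstanceGroupName": "dnn_group", "InstanceType": "ml.p3.2xlarge", "InstanceCount": 1},
    ],
}

# Hypothetical S3DataSource that routes this channel to the CPU group only.
s3_data_source = {
    "S3DataType": "S3Prefix",
    "S3Uri": "s3://example-bucket/train/",  # placeholder bucket
    "InstanceGroupNames": ["data_group"],
}

problems = validate_instance_groups(resource_config["InstanceGroups"])

# Assumption (not in the schema): names listed on a channel should match defined groups.
defined = {g["InstanceGroupName"] for g in resource_config["InstanceGroups"]}
unknown = [n for n in s3_data_source["InstanceGroupNames"] if n not in defined]

print("problems:", problems)
print("unknown group names:", unknown)
```

The checks are client-side only and mirror the model file. The service may enforce further rules that this diff does not show.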