feature: Adds support for Serverless inference #2831
Conversation
Codecov Report
@@ Coverage Diff @@
## dev #2831 +/- ##
==========================================
+ Coverage 88.85% 88.87% +0.01%
==========================================
Files 175 176 +1
Lines 15517 15540 +23
==========================================
+ Hits 13788 13811 +23
Misses 1729 1729
Continue to review full report at Codecov.
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
Can I know why the slow tests failed? I cannot get much detail from the log it posted. Thanks!
src/sagemaker/estimator.py (Outdated)
Raises:
    ValueError: If serverless inference config is not specified and instance type
        and instance count are also not specified
why do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the way we're doing this is to let customers pass an empty config object to tell us a serverless endpoint should be deployed. We only apply default values if they pass an empty object. If they leave that arg as None, we deploy a normal endpoint rather than a serverless one. So that means we need a check to ensure instance_type/instance_count and serverless_inference_config are not all None at the same time.
So we will throw this error when the serverless config, instance type, and instance count are all None?
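The validation discussed in this thread can be sketched roughly as follows (the helper name is hypothetical, not the exact SDK code; in the SDK this check lives inside deploy()):

```python
# Sketch of the validation discussed above (hypothetical helper, not the
# exact SDK code): serverless_inference_config, instance_type, and
# initial_instance_count must not all be None at the same time.
def validate_deploy_args(instance_type=None, initial_instance_count=None,
                         serverless_inference_config=None):
    if serverless_inference_config is None:
        # No serverless config means a normal endpoint, which requires
        # both instance arguments.
        if instance_type is None or initial_instance_count is None:
            raise ValueError(
                "Must specify instance type and instance count unless a "
                "serverless inference config is provided"
            )
    # An empty (default-constructed) config object is enough to request a
    # serverless endpoint; the SDK fills in default values later.
```

An empty config object therefore acts as the opt-in signal for serverless, while None keeps the existing instance-based behavior.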
src/sagemaker/model.py (Outdated)
if self._is_compiled_model and not is_serverless:
    compiled_model_suffix = "-".join(instance_type.split(".")[:-1])
This code is already used above; can the duplication be removed?
tests/unit/test_estimator.py (Outdated)
e.fit()
bad_args = ({"instance_type": INSTANCE_TYPE}, {"initial_instance_count": INSTANCE_COUNT})
Can we also add to this list the scenario where instance_type=None and initial_instance_count=None?
sure thing
@@ -209,7 +209,7 @@ def register(
             model_package_arn=model_package.get("ModelPackageArn"),
         )

-    def _init_sagemaker_session_if_does_not_exist(self, instance_type):
+    def _init_sagemaker_session_if_does_not_exist(self, instance_type=None):
what does this do?
Since serverless requires instance_type to be None, this change supports the serverless case.
-        if instance_type.startswith("ml.inf") and not self._is_compiled_model:
+        if instance_type and instance_type.startswith("ml.inf") and not self._is_compiled_model:
what is this for?
Same as above
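The None-safe guard in the diff above can be sketched as a small predicate (the helper name is illustrative, not the SDK's; ml.inf* instance types are Inferentia instances):

```python
# Sketch of the guard discussed above: instance_type may now be None for
# serverless endpoints, so check truthiness before calling startswith().
# The helper name is illustrative, not the SDK's actual code.
def uses_inferentia_serving_image(instance_type, is_compiled_model):
    return bool(
        instance_type
        and instance_type.startswith("ml.inf")
        and not is_compiled_model
    )
```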
self.name,
instance_type,
initial_instance_count,
accelerator_type=accelerator_type,
We are not supporting accelerators for serverless; will the code below be impacted in any way?
This will be validated by the low-level botocore/boto3 libraries, so it should be fine.
memory_size_in_mb (int): Optional. The memory size of your serverless endpoint.
    Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB,
    5120 MB, or 6144 MB. If no value is provided, Amazon SageMaker will choose
    the default value for you. (Default: 2048)
The default is 1 GB; please double-check with Michael.
There is no default for memory or max concurrency shown in the Boto3 API; I can double-check with Michael. The reason we set a default here in the SageMaker SDK is to simplify the customer use case: they don't need to specify this config, and we'll pick a proper value for them. On the other hand, if they want to set those values, they can choose the config they prefer.
Yep, the default is controlled by the SageMaker SDK here - there's no inherent default in the CreateEndpointConfig API itself.
    Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB,
    5120 MB, or 6144 MB. If no value is provided, Amazon SageMaker will choose
    the default value for you. (Default: 2048)
max_concurrency (int): Optional. The maximum number of concurrent invocations
Is there a default max concurrency in Boto3 today? Good to double-check with Michael.
Not with boto3.
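To summarize the thread: the defaults live in the SDK, not in the CreateEndpointConfig API. A minimal sketch of that behavior, using an illustrative stand-in for the SDK's config class (the memory default follows the PR docstring; the max_concurrency default of 5 is an assumption, not confirmed in this thread):

```python
# Illustrative stand-in for the SDK's ServerlessInferenceConfig (not the
# actual class). The memory default (2048 MB) follows the PR docstring;
# the max_concurrency default of 5 is an assumption.
class ServerlessInferenceConfigSketch:
    VALID_MEMORY_SIZES = (1024, 2048, 3072, 4096, 5120, 6144)

    def __init__(self, memory_size_in_mb=2048, max_concurrency=5):
        # Valid memory sizes are 1 GB increments between 1024 and 6144 MB.
        if memory_size_in_mb not in self.VALID_MEMORY_SIZES:
            raise ValueError(
                "memory_size_in_mb must be one of %s" % (self.VALID_MEMORY_SIZES,)
            )
        self.memory_size_in_mb = memory_size_in_mb
        self.max_concurrency = max_concurrency

    def to_request_dict(self):
        # Shape of the ServerlessConfig block in CreateEndpointConfig.
        return {
            "MemorySizeInMB": self.memory_size_in_mb,
            "MaxConcurrency": self.max_concurrency,
        }
```

A customer passing an empty config object gets these SDK-side defaults; the service API itself has no inherent defaults.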
src/sagemaker/estimator.py (Outdated)
Specifies configuration related to serverless endpoint. Use this configuration
when trying to create serverless endpoint and make serverless inference. If
empty config object passed through, we will use default config to deploy
serverless endpoint (default: None)
The language here is unclear: "we will use default config to deploy serverless endpoint (default: None)". If we will use a default config, the default should not be None, right?
Changed this one and the one in model.py to a clearer expression.
src/sagemaker/model.py (Outdated)
Specifies configuration related to serverless endpoint. Use this configuration
when trying to create serverless endpoint and make serverless inference. If
empty config object passed through, we will use default config to deploy
serverless endpoint (default: None)
Same as before, the language is not clear here
src/sagemaker/model.py (Outdated)
serverless endpoint (default: None)
Raises:
    ValueError: If no role is specified or if serverless inference config is not
        specified and instance type and instance count are also not specified
What is the error message that customers will see?
* feature: allow conditional parallel builds (#2727)
* fix: endpoint bug (#2772)
* fix: local mode - support relative file structure (#2768)
* prepare release v2.72.0; update development version to v2.72.1.dev0
* fix: Set ProcessingStep upload locations deterministically to avoid c… (#2790)
* fix: Prevent repack_model script from referencing nonexistent directories (#2755)
* fix: S3Input - add support for instance attributes (#2754)
* fix: typos and broken link (#2765)
* prepare release v2.72.1; update development version to v2.72.2.dev0
* fix: Model Registration with BYO scripts (#2797)
* fix: Add ContentType in test_auto_ml_describe
* fix: Re-deploy static integ test endpoint if it is not found
* documentation: SageMaker model parallel library 1.6.0 API doc (#2814) - update smdmp change log, archive api doc for 1.4.0 and 1.5.0, add no-index flags, finish api doc archive, add all api docs, add appendix, structural changes, fix links, incorporate feedback
* fix: fix kmeans test deletion sequence, increment lineage statics (#2815)
* fix: Increment static lineage pipeline (#2817)
* fix: Update CHANGELOG.md (#2832)
* prepare release v2.72.2; update development version to v2.72.3.dev0
* change: update master from dev (#2836)
* prepare release v2.72.3; update development version to v2.72.4.dev0
* fix: fixes unnecessary session call while generating pipeline definition for lambda step (#2824)
* feature: Add models_v2 under lineage context (#2800)
* feature: enable python 3.9 (#2802)
* change: Update CHANGELOG.md (#2842)
* fix: update pricing link (#2805)
* doc: Document the available ExecutionVariables (#2807)
* fix: Remove duplicate vertex/edge in query lineage (#2784)
* feature: Support model pipelines in CreateModelStep (#2845)
* feature: support JsonGet/Join parameterization in tuning step Hyperparameters (#2833)
* doc: Enhance smddp 1.2.2 doc (#2852)
* feature: support checkpoint to be passed from estimator (#2849)
* fix: allow kms_key to be passed for processing step (#2779)
* feature: Adds support for Serverless inference (#2831)
* feature: Add support for SageMaker lineage queries in action (#2853)
* feature: Adds Lineage queries in artifact, context and trial components (#2838)
* feature: Add EMRStep support in Sagemaker pipeline (#2848)
* prepare release v2.73.0; update development version to v2.73.1.dev0
* feature: Add support for SageMaker lineage queries context (#2830)
* fix: support specifying a facet by its column index. Currently the Clarify BiasConfig only accepts facet name, while the Clarify analysis configuration supports both name and index; this commit adds the same support to BiasConfig.
* doc: more documentation for serverless inference (#2859)
* prepare release v2.74.0; update development version to v2.74.1.dev0
* Add deprecation warning in Clarify DataConfig (#2847)
* feature: Update instance types for integ test (#2881)
* feature: Adds support for async inference (#2846)
* fix: update to incorporate black v22, pin tox versions (#2889)
* make black happy
Issue #, if available:
Description of changes:
Add support for the Liberty feature, aka serverless inference:
1. Add sagemaker.serverless.serverless_inference_config as the configuration class for serverless inference.
2. Update sagemaker.estimator, sagemaker.model, and sagemaker.tensorflow.model to let users deploy a serverless endpoint by passing the configuration object.
3. Add support in sagemaker.session to generate production variants with a serverless config and to create endpoints using serverless configurations.
Detailed Design is in this doc
Testing done:
test/integ/test_serverless_inference
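The sagemaker.session change described above (generating production variants with a serverless config) can be sketched as a small helper; the helper name is hypothetical, and the dict keys follow the CreateEndpointConfig API's ProductionVariants shape:

```python
# Sketch of generating a production variant with a serverless config.
# Helper name and defaults are illustrative, not the SDK's actual code;
# the dict keys match the CreateEndpointConfig API.
def serverless_production_variant(model_name, memory_size_in_mb=2048,
                                  max_concurrency=5, variant_name="AllTraffic"):
    return {
        "ModelName": model_name,
        "VariantName": variant_name,
        # A serverless variant carries ServerlessConfig instead of
        # InstanceType / InitialInstanceCount.
        "ServerlessConfig": {
            "MemorySizeInMB": memory_size_in_mb,
            "MaxConcurrency": max_concurrency,
        },
    }
```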
Merge Checklist
Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.
General
Tests
* Used unique_name_from_base to create resource names in integ tests (if appropriate)
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.