
Commit c3069b3
feat: ModelBuilder for simplified model testing and deployment (#1266)
Co-authored-by: Gary Wang <[email protected]>
Co-authored-by: Gary Wang <[email protected]>
Co-authored-by: Raymond Liu <[email protected]>
Co-authored-by: John Barboza <[email protected]>
Co-authored-by: Malav Shastri <[email protected]>
Co-authored-by: Mufaddal Rohawala <[email protected]>
Co-authored-by: Mike Schneider <[email protected]>
Co-authored-by: Bhupendra Singh <[email protected]>
Co-authored-by: ci <ci>
Co-authored-by: Malav Shastri <[email protected]>
Co-authored-by: Keshav Chandak <[email protected]>
Co-authored-by: Zuoyuan Huang <[email protected]>
Co-authored-by: evakravi <[email protected]>
Co-authored-by: Keshav Chandak <[email protected]>
Co-authored-by: Alexander Pivovarov <[email protected]>
Co-authored-by: SSRraymond <[email protected]>
Co-authored-by: Ruilian Gao <[email protected]>
Co-authored-by: Ao Guo <[email protected]>
Co-authored-by: qidewenwhen <[email protected]>
Co-authored-by: mariumof <[email protected]>
Co-authored-by: matherit <[email protected]>
Co-authored-by: amzn-choeric <[email protected]>
Co-authored-by: Ao Guo <[email protected]>
Co-authored-by: Sally Seok <[email protected]>
Co-authored-by: Erick Benitez-Ramos <[email protected]>
Co-authored-by: Qingzi-Lan <[email protected]>
Co-authored-by: Sally Seok <[email protected]>
Co-authored-by: Manu Seth <[email protected]>
Co-authored-by: Miyoung <[email protected]>
Co-authored-by: Sarah Castillo <[email protected]>
Co-authored-by: EC2 Default User <[email protected]>
Co-authored-by: EC2 Default User <[email protected]>
Co-authored-by: EC2 Default User <[email protected]>
Co-authored-by: Xin Wang <[email protected]>
Co-authored-by: stacicho <[email protected]>
Co-authored-by: martinRenou <[email protected]>
Co-authored-by: jiapinw <[email protected]>
Co-authored-by: Akash Goel <[email protected]>
Co-authored-by: Joseph Zhang <[email protected]>
Co-authored-by: Harsha Reddy <[email protected]>
Co-authored-by: Haixin Wang <[email protected]>
Co-authored-by: Kalyani Nikure <[email protected]>
Co-authored-by: Xin Wang <[email protected]>
Co-authored-by: Gili Nachum <[email protected]>
Co-authored-by: Jose Pena <[email protected]>
Co-authored-by: cansun <[email protected]>
Co-authored-by: AWS-pratab <[email protected]>
Co-authored-by: shenlongtang <[email protected]>
Co-authored-by: Zach Kimberg <[email protected]>
Co-authored-by: chrivtho-github <[email protected]>
Co-authored-by: Justin <[email protected]>
Co-authored-by: Duc Trung Le <[email protected]>
Co-authored-by: HappyAmazonian <[email protected]>
Co-authored-by: cj-zhang <[email protected]>
Co-authored-by: Matthew <[email protected]>
Co-authored-by: Zach Kimberg <[email protected]>
Co-authored-by: Rohith Nadimpally <[email protected]>
Co-authored-by: rohithn1 <[email protected]>
Co-authored-by: Victor Zhu <[email protected]>
Co-authored-by: jbarz1 <[email protected]>
Co-authored-by: Mohan Gandhi <[email protected]>
Co-authored-by: Mohan Gandhi <[email protected]>
Co-authored-by: Barboza <[email protected]>
Co-authored-by: ruiliann666 <[email protected]>

fixes (#963)
fix: skip tensorflow local mode notebook test (#4060)
Fix TorchTensorSer/Deser (#969)
fix (#971)
fix local container mode (#972)
Fix auto detect (#979)
Fix routing fn (#981)
fix: tags for jumpstart model package models (#4061)
fix: pipeline variable kms key (#4065)
fix: jumpstart cache using sagemaker session s3 client (#4051)
fix: gated models unsupported region (#4069)
fix local container serialization (#989)
fix custom serialiazation with local container. Also remove a lot of unused code (#994)
Fix custom serialization for local container mode (#1000)
fix pytorch version (#1001)
Fix unit test (#990)
Fix unit tests (#1018)
Fix happy hf test (#1026)
fix logic setup (#1034)
fixes (#1045)
Fix flake error in init (#1050)
fix (#1053)
fix: pipeline upsert failed to pass parallelism_config to update (#4066)
fix: temporarily skip kmeans notebook (#4092)
fixes (#1051)
Fix missing absolute import error (#1057)
Fix flake8 error in unit test (#1058)
fixes (#1056)
Fix flake8 error in integ test (#1060)
Fix black format error in test_pickle_dependencies (#1062)
Fix docstyle error under serve (#1065)
Fix docstyle error in builder failure (#1066)
fix black and flake8 formatting (#1069)
Fix format error (#1070)
Fix integ test (#1074)
fix: HuggingFaceProcessor parameterized instance_type when image_uri is absent (#4072)
fix: log message when sdk defaults not applied (#4104)
fix: handle bad jumpstart default session (#4109)
Fix the version information, whl and flake8 (#1085)
Fix JSON serializer error (#1088)
Fix unit test (#1091)
fix format (#1103)
Fix local mode predictor (#1107)
Fix DJLPredictor (#1108)
Fix modelbuilder unit tests (#1118)
fixes (#1136)
fixes (#1165)
fixes (#1166)
fix: auto ml integ tests and add flaky test markers (#4136)
fix model data for JumpStartModel (#4135)
fix: transform step unit test (#4151)
fix: Update pipeline.py and selective_execution_config.py with small fixes (#1099)
fix: Fixed bug in _create_training_details (#4141)
fix: use correct line endings and s3 uris on windows (#4118)
fix: js tagging s3 prefix (#4167)
fix: Update Ec2 instance type to g5.4xlarge in test_huggingface_torch_distributed.py (#4181)
fix: import error in unsupported js regions (#4188)
fix: update local mode schema (#4185)
fix: fix flaky Inference Recommender integration tests (#4156)
fix: clone distribution in validate_distribution (#4205)
Fix hyperlinks in feature_processor.scheduler parameter descriptions (#4208)
Fix master merge formatting (#1186)
Fix master unit tests (#1203)
Fix djl unit tests (#1204)
Fix merge conflicts (#1217)
fix: fix URL links (#4217)
fix: bump urllib3 version (#4223)
fix: relax upper bound on urllib in local mode requirements (#4219)
fixes (#1224)
fix formatting (#1233)
fix byoc unit tests (#1235)
fix byoc unit tests (#1236)
1 parent ef8dd31 commit c3069b3

File tree: 139 files changed (+12212 −25 lines)


.gitignore (+1)

@@ -34,3 +34,4 @@ env/
 **/_repack_script_launcher.sh
 tests/data/**/_repack_model.py
 tests/data/experiment/sagemaker-dev-1.0.tar.gz
+src/sagemaker/serve/tmp_workspace

CHANGELOG.md (+1 −1)

@@ -5955,4 +5955,4 @@

 ## 1.0.0

-* Initial commit
+* Initial commit

MANIFEST.in (+1)

@@ -1,6 +1,7 @@
 recursive-include src/sagemaker *.py

 include src/sagemaker/image_uri_config/*.json
+include src/sagemaker/serve/requirements.txt
 recursive-include requirements *

 include VERSION

doc/api/inference/model_builder.rst (+16, new file)

Model Builder
-------------

This module contains classes related to the Amazon SageMaker Model Builder.

.. autoclass:: sagemaker.serve.builder.model_builder.ModelBuilder

.. automethod:: sagemaker.serve.builder.model_builder.ModelBuilder.build

.. automethod:: sagemaker.serve.builder.model_builder.ModelBuilder.save

.. autoclass:: sagemaker.serve.spec.inference_spec.InferenceSpec

.. autoclass:: sagemaker.serve.builder.schema_builder.SchemaBuilder

.. autoclass:: sagemaker.serve.marshalling.custom_payload_translator.CustomPayloadTranslator

doc/overview.rst (+114)

@@ -820,6 +820,120 @@

    predictor.predict("this is the best day of my life", {"ContentType": "application/x-text"})
Deploy a pre-trained model using the SageMaker ModelBuilder class
-----------------------------------------------------------------

If you prefer a streamlined solution to build and deploy your model, the SageMaker Python SDK offers additional APIs that apply intelligent defaults so you can create a SageMaker-deployable model in fewer steps. ``ModelBuilder`` simplifies model creation by performing the following tasks for you:

- Converts machine learning models trained with frameworks such as XGBoost or PyTorch into SageMaker-deployable models with a single line of code.
- Selects a container automatically based on the model framework, so you don't have to specify one manually. You can still bring your own container by passing your own URI to ``ModelBuilder``.
- Serializes data on the client side before sending it to the server for inference, and deserializes the results returned by the server. Data is correctly formatted without manual processing.
- Captures the dependencies, libraries, and packages needed by your model automatically. You don't have to package the dependencies and upload them to S3. The deployment environment matches the development environment to ensure a smooth transition from development to deployment.

Build your model with ModelBuilder
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``ModelBuilder`` class can take your framework (XGBoost or PyTorch) model and convert it into a SageMaker-deployable model, generating the artifacts according to the model server. If you don't want to supply a model directly, you can provide inference code to specify a model source; this method is discussed in the following sections. ``ModelBuilder`` offers many options, but if your model doesn't require extensive customization and you want to deploy immediately, you can supply at minimum a framework model, input, and output. If you did not set up a default role ARN, you need to provide that as well. To view all options offered in ``ModelBuilder``, see the `ModelBuilder documentation <https://sagemaker.readthedocs.io/en/stable/api/inference/model_builder.html>`_.

In the following code example, ``ModelBuilder`` is called with a framework model and an instance of ``SchemaBuilder`` with minimal arguments (to infer the corresponding functions for serializing and deserializing the endpoint input and output). No container is specified and no packaged dependencies are passed; SageMaker saves you preparation time and effort by automatically inferring these resources when you build your model.

.. code:: python

    model_builder = ModelBuilder(
        model=model,
        schema_builder=SchemaBuilder(X_test, y_pred),
    )

    model_builder.build()

If you want to bring your own container, you can also specify the image URI and set the mode argument to ``Mode.LOCAL_CONTAINER``, as shown in the following example; a sketch of this mode switch follows the example. When you want to deploy to SageMaker, change the argument to ``Mode.SAGEMAKER_ENDPOINT``.

.. code:: python

    model_builder = ModelBuilder(
        model=model,
        model_server=ModelServer.TORCHSERVE,
        schema_builder=SchemaBuilder(X_test, y_pred),
        image_uri="12345678910.dkr.ecr.ap-southeast-2.amazonaws.com/byoc-image:xgb-1.7-1",
    )
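
The following is a minimal sketch of that mode switch. The ``Mode`` import path is an assumption based on this commit's layout (the coverage command below omits ``*function_pointers*``), so check the ModelBuilder documentation for the exact location.

.. code:: python

    from sagemaker.serve.mode.function_pointers import Mode  # assumed import path

    # Build and test against a local container first ...
    model_builder = ModelBuilder(
        model=model,
        schema_builder=SchemaBuilder(X_test, y_pred),
        mode=Mode.LOCAL_CONTAINER,
    )
    model = model_builder.build()

    # ... then rebuild with mode=Mode.SAGEMAKER_ENDPOINT when you are ready to deploy.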
Define serialization and deserialization methods with SchemaBuilder
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``SchemaBuilder`` accepts sample input and output and can infer the corresponding functions for serializing and deserializing the endpoint input and output. For example, the following works:

.. code:: python

    input = "How is the demo going?"
    output = "Comment la démo va-t-elle?"
    schema = SchemaBuilder(input, output)

However, you might want to further customize your serialization and deserialization functions. For example, you might want to pass an image and have your serializer convert the image to a tensor before prediction. You can define this translation in a ``CustomPayloadTranslator`` for input, output, or both, and pass the translators to ``SchemaBuilder``, as in the sketch after the signature below. For an example that creates custom input and output translators with ``CustomPayloadTranslator``, see the `ModelBuilder examples <https://github.com/aws-samples/sagemaker-hosting/SageMaker-Model-Builder>`_.

.. code:: python

    class SchemaBuilder(
        sample_input: Any,
        sample_output: Any,
        input_translator: CustomPayloadTranslator = None,
        output_translator: CustomPayloadTranslator = None
    )

You can use ``SchemaBuilder`` to standardize the serialization and deserialization functions for endpoint input and output for a model server. Insert those functions into one definition of ``SchemaBuilder`` and pass this definition to all instances of ``ModelBuilder`` for the same model server. As a result, you no longer have to maintain common implementation details in individualized scripts for each model.
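
As an illustration of the image-to-tensor case above, here is a minimal sketch of a custom input translator. The ``serialize_payload_to_bytes``/``deserialize_payload_from_stream`` hook names follow the ModelBuilder examples linked above, and the NumPy round-trip is an illustrative assumption rather than a required format.

.. code:: python

    import io

    import numpy as np
    from PIL import Image

    from sagemaker.serve.marshalling.custom_payload_translator import CustomPayloadTranslator

    class ImageToTensorTranslator(CustomPayloadTranslator):
        # Runs on the client side: turn a PIL image into bytes for the request body.
        def serialize_payload_to_bytes(self, payload: Image.Image) -> bytes:
            array = np.asarray(payload, dtype="float32") / 255.0  # illustrative scaling
            buffer = io.BytesIO()
            np.save(buffer, array)
            return buffer.getvalue()

        # Runs on the server side: rebuild the array from the streamed bytes.
        def deserialize_payload_from_stream(self, stream) -> np.ndarray:
            return np.load(io.BytesIO(stream.read()))

    schema = SchemaBuilder(
        sample_input=Image.open("sample.png"),  # hypothetical sample image
        sample_output=y_pred,
        input_translator=ImageToTensorTranslator(),
    )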
Load the model with a custom function using InferenceSpec
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As previously mentioned, you don't have to supply a framework model directly to ``ModelBuilder``. You can instead pass an instance of ``InferenceSpec`` with the ``load`` and ``invoke`` functions defined. The ``load`` function contains the custom logic that creates the model, and ``invoke`` instructs SageMaker how to pass the input payload to the model.

The following example uses ``InferenceSpec`` to generate a model with the Hugging Face ``pipeline``.

.. code:: python

    from sagemaker.serve.spec.inference_spec import InferenceSpec
    from transformers import pipeline

    class MyInferenceSpec(InferenceSpec):
        def load(self, model_dir: str):
            return pipeline("translation_en_to_fr", model="t5-small")

        def invoke(self, input, model):
            return model(input)

    inf_spec = MyInferenceSpec()

    model_builder = ModelBuilder(
        inference_spec=inf_spec,
        schema_builder=SchemaBuilder(X_test, y_pred)
    )

For sample notebooks that demonstrate the use of ``InferenceSpec``, see the `ModelBuilder examples <https://github.com/aws-samples/sagemaker-hosting/SageMaker-Model-Builder>`_.
Build your model and deploy
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Call the ``build()`` function to create your deployable model. This step creates an ``inference.py`` file in your working directory with the code necessary to create your schema, run serialization and deserialization of inputs and outputs, and perform other user-specified custom logic.

.. code:: python

    # Build the model according to the model server specification and save it as files in the working directory
    model = model_builder.build()

Deploy your model with the model's existing ``deploy()`` method. A model constructed from ``ModelBuilder`` additionally enables live logging during deployment.

.. code:: python

    predictor = model.deploy(
        initial_instance_count=1,
        instance_type="ml.c6i.xlarge"
    )

For more examples of using ``ModelBuilder`` to build your models, see the `ModelBuilder sample notebooks <https://github.com/aws-samples/sagemaker-hosting/SageMaker-Model-Builder>`_.

 Fine-tune a Model and Deploy to a SageMaker Endpoint
 ====================================================

inference-experience-dev-tester.sh (+53, new file)

#!/usr/bin/env zsh
# Run the serve unit tests (and optionally integration tests) in two conda environments,
# then print a coverage report for the serve code paths.
# Usage: inference-experience-dev-tester.sh <py38-env> <py310-env> <run-integ: true|false> <clear-tox-cache: true|false>
trap "exit" INT

eval "$(conda shell.zsh hook)"

conda activate $1

echo Installing tox if necessary
pip install --upgrade tox

echo Running unit testing in Python3.8 Conda Environment

if "$4";
then
    echo Clearing .tox cache for Python3.8
    rm -r .tox/py38
fi

tox -e py38 -- tests/unit/sagemaker/serve/.

if "$3";
then
    echo Running Python3.8 Integration Tests
    tox -e py38 -- tests/integ/sagemaker/serve/.
fi

conda deactivate

conda activate $2

echo Installing tox if necessary
pip install --upgrade tox

echo Running unit testing in Python3.10 Conda Environment

if "$4";
then
    echo Clearing .tox cache for Python3.10
    rm -r .tox/py310
fi

tox -e py310 -- tests/unit/sagemaker/serve/.

if "$3";
then
    echo Running Python3.10 Integration Tests
    tox -e py310 -- tests/integ/sagemaker/serve/.
fi

conda deactivate

echo Coverage report after testing:

coverage report -i --fail-under=75 --include "*/serve/*" --omit '*in_process*,*interceptors*,*__init__*,*build_model*,*function_pointers*'

requirements/extras/test_requirements.txt (+9)

@@ -29,3 +29,12 @@ docker>=5.0.2,<7.0.0
 PyYAML==6.0
 pyspark==3.3.1
 sagemaker-feature-store-pyspark-3.3
+# TODO find workaround
+xgboost>=1.6.2,<=1.7.6
+pillow>=9.5.0,<=10.0.0
+torch@https://download.pytorch.org/whl/cpu/torch-2.0.0%2Bcpu-cp310-cp310-linux_x86_64.whl
+torchvision@https://download.pytorch.org/whl/cpu/torchvision-0.15.1%2Bcpu-cp310-cp310-linux_x86_64.whl
+transformers==4.32.0
+sentencepiece==0.1.99
+# https://github.com/triton-inference-server/server/issues/6246
+tritonclient[http]<2.37.0

setup.py (+25 −1)

@@ -15,6 +15,7 @@

 import os
 from glob import glob
+import sys

 from setuptools import find_packages, setup

@@ -63,6 +64,12 @@ def read_requirements(filename):
     "jsonschema",
     "platformdirs",
     "tblib==1.7.0",
+    "urllib3<1.27",
+    "uvicorn==0.22.0",
+    "fastapi==0.95.2",
+    "requests",
+    "docker",
+    "tqdm",
 ]

 # Specific use case dependencies

@@ -77,7 +84,21 @@ def read_requirements(filename):
 # Meta dependency groups
 extras["all"] = [item for group in extras.values() for item in group]
 # Tests specific dependencies (do not need to be included in 'all')
-extras["test"] = (read_requirements("requirements/extras/test_requirements.txt"),)
+test_dependencies = read_requirements("requirements/extras/test_requirements.txt")
+# remove torch and torchvision if python version is not 3.10
+if sys.version_info.minor != 10:
+    test_dependencies = [
+        module
+        for module in test_dependencies
+        if not (
+            module.startswith("transformers")
+            or module.startswith("sentencepiece")
+            or module.startswith("torch")
+            or module.startswith("torchvision")
+        )
+    ]
+
+extras["test"] = (test_dependencies,)

@@ -110,4 +131,7 @@ def read_requirements(filename):
             "sagemaker-upgrade-v2=sagemaker.cli.compatibility.v2.sagemaker_upgrade_v2:main",
         ]
     },
+    scripts=[
+        "src/sagemaker/serve/model_server/triton/pack_conda_env.sh",
+    ],
 )

src/sagemaker/base_deserializers.py (+79)

@@ -22,6 +22,7 @@

 import numpy as np
 from six import with_metaclass
+import cloudpickle

 from sagemaker.utils import DeferredError

@@ -332,3 +333,81 @@ def deserialize(self, stream, content_type):
             return [json.loads(line) for line in lines]
         finally:
             stream.close()
+
+
+class TorchTensorDeserializer(SimpleBaseDeserializer):
+    """Deserialize a stream to a torch.Tensor.
+
+    Args:
+        stream (botocore.response.StreamingBody): Data to be deserialized.
+        content_type (str): The MIME type of the data.
+
+    Returns:
+        torch.Tensor: The data deserialized into a torch Tensor.
+    """
+
+    def __init__(self, accept="tensor/pt"):
+        super(TorchTensorDeserializer, self).__init__(accept=accept)
+        self.numpy_deserializer = NumpyDeserializer()
+        try:
+            from torch import from_numpy
+
+            self.convert_npy_to_tensor = from_numpy
+        except ImportError:
+            raise Exception("Unable to import pytorch.")
+
+    def deserialize(self, stream, content_type="tensor/pt"):
+        """Deserialize streamed data to a torch.Tensor.
+
+        See https://pytorch.org/docs/stable/generated/torch.from_numpy.html
+
+        Args:
+            stream (botocore.response.StreamingBody): Data to be deserialized.
+            content_type (str): The MIME type of the data.
+
+        Returns:
+            torch.Tensor: The deserialized torch.Tensor.
+        """
+        try:
+            numpy_array = self.numpy_deserializer.deserialize(
+                stream=stream, content_type="application/x-npy"
+            )
+            return self.convert_npy_to_tensor(numpy_array)
+        except Exception:
+            raise ValueError(
+                "Unable to deserialize your data to torch.Tensor.\
+                Please provide custom deserializer in InferenceSpec."
+            )
+
+
+class PickleDeserializer(SimpleBaseDeserializer):
+    """Deserialize a stream to an object using the cloudpickle module.
+
+    Args:
+        stream (botocore.response.StreamingBody): Data to be deserialized.
+        content_type (str): The MIME type of the data.
+
+    Returns:
+        object: The unpickled object.
+    """
+
+    def __init__(self, accept="application/x-pkl"):
+        super(PickleDeserializer, self).__init__(accept)
+
+    def deserialize(self, stream, content_type="application/x-pkl"):
+        """Deserialize pickled data from an inference endpoint.
+
+        Args:
+            stream (botocore.response.StreamingBody): Data to be deserialized.
+            content_type (str): The MIME type of the data.
+
+        Returns:
+            object: The object deserialized with cloudpickle.
+        """
+        try:
+            return cloudpickle.loads(stream.read())
+        except Exception:
+            raise ValueError(
+                "Cannot deserialize bytes to object with cloudpickle.\
+                Please provide custom deserializer."
+            )
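
A quick local sanity check for the two new deserializers, assuming torch and cloudpickle are installed; io.BytesIO stands in for botocore's StreamingBody, since both expose read():

    import io

    import cloudpickle
    import torch

    from sagemaker.base_deserializers import PickleDeserializer, TorchTensorDeserializer
    from sagemaker.base_serializers import NumpySerializer

    # PickleDeserializer: round-trip an arbitrary object through cloudpickle bytes.
    payload = cloudpickle.dumps({"scores": [0.1, 0.9]})
    obj = PickleDeserializer().deserialize(io.BytesIO(payload), "application/x-pkl")

    # TorchTensorDeserializer consumes an application/x-npy stream and returns a torch.Tensor.
    npy_bytes = NumpySerializer().serialize(torch.ones(2, 2).numpy())
    tensor = TorchTensorDeserializer().deserialize(io.BytesIO(npy_bytes), "tensor/pt")
    assert torch.equal(tensor, torch.ones(2, 2))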
