
Commit c3069b3
feat: ModelBuilder for simplified model testing and deployment (#1266)
Co-authored-by: Gary Wang <[email protected]>
Co-authored-by: Gary Wang <[email protected]>
Co-authored-by: Raymond Liu <[email protected]>
Co-authored-by: John Barboza <[email protected]>
Co-authored-by: Malav Shastri <[email protected]>
Co-authored-by: Mufaddal Rohawala <[email protected]>
Co-authored-by: Mike Schneider <[email protected]>
Co-authored-by: Bhupendra Singh <[email protected]>
Co-authored-by: ci <ci>
Co-authored-by: Malav Shastri <[email protected]>
Co-authored-by: Keshav Chandak <[email protected]>
Co-authored-by: Zuoyuan Huang <[email protected]>
Co-authored-by: evakravi <[email protected]>
Co-authored-by: Keshav Chandak <[email protected]>
Co-authored-by: Alexander Pivovarov <[email protected]>
Co-authored-by: SSRraymond <[email protected]>
Co-authored-by: Ruilian Gao <[email protected]>
Co-authored-by: Ao Guo <[email protected]>
Co-authored-by: qidewenwhen <[email protected]>
Co-authored-by: mariumof <[email protected]>
Co-authored-by: matherit <[email protected]>
Co-authored-by: amzn-choeric <[email protected]>
Co-authored-by: Ao Guo <[email protected]>
Co-authored-by: Sally Seok <[email protected]>
Co-authored-by: Erick Benitez-Ramos <[email protected]>
Co-authored-by: Qingzi-Lan <[email protected]>
Co-authored-by: Sally Seok <[email protected]>
Co-authored-by: Manu Seth <[email protected]>
Co-authored-by: Miyoung <[email protected]>
Co-authored-by: Sarah Castillo <[email protected]>
Co-authored-by: EC2 Default User <[email protected]>
Co-authored-by: EC2 Default User <[email protected]>
Co-authored-by: EC2 Default User <[email protected]>
Co-authored-by: Xin Wang <[email protected]>
Co-authored-by: stacicho <[email protected]>
Co-authored-by: martinRenou <[email protected]>
Co-authored-by: jiapinw <[email protected]>
Co-authored-by: Akash Goel <[email protected]>
Co-authored-by: Joseph Zhang <[email protected]>
Co-authored-by: Harsha Reddy <[email protected]>
Co-authored-by: Haixin Wang <[email protected]>
Co-authored-by: Kalyani Nikure <[email protected]>
Co-authored-by: Xin Wang <[email protected]>
Co-authored-by: Gili Nachum <[email protected]>
Co-authored-by: Jose Pena <[email protected]>
Co-authored-by: cansun <[email protected]>
Co-authored-by: AWS-pratab <[email protected]>
Co-authored-by: shenlongtang <[email protected]>
Co-authored-by: Zach Kimberg <[email protected]>
Co-authored-by: chrivtho-github <[email protected]>
Co-authored-by: Justin <[email protected]>
Co-authored-by: Duc Trung Le <[email protected]>
Co-authored-by: HappyAmazonian <[email protected]>
Co-authored-by: cj-zhang <[email protected]>
Co-authored-by: Matthew <[email protected]>
Co-authored-by: Zach Kimberg <[email protected]>
Co-authored-by: Rohith Nadimpally <[email protected]>
Co-authored-by: rohithn1 <[email protected]>
Co-authored-by: Victor Zhu <[email protected]>
Co-authored-by: jbarz1 <[email protected]>
Co-authored-by: Mohan Gandhi <[email protected]>
Co-authored-by: Mohan Gandhi <[email protected]>
Co-authored-by: Barboza <[email protected]>
Co-authored-by: ruiliann666 <[email protected]>

fixes (#963)
fix: skip tensorflow local mode notebook test (#4060)
Fix TorchTensorSer/Deser (#969)
fix (#971)
fix local container mode (#972)
Fix auto detect (#979)
Fix routing fn (#981)
fix: tags for jumpstart model package models (#4061)
fix: pipeline variable kms key (#4065)
fix: jumpstart cache using sagemaker session s3 client (#4051)
fix: gated models unsupported region (#4069)
fix local container serialization (#989)
fix custom serialiazation with local container. Also remove a lot of unused code (#994)
Fix custom serialization for local container mode (#1000)
fix pytorch version (#1001)
Fix unit test (#990)
Fix unit tests (#1018)
Fix happy hf test (#1026)
fix logic setup (#1034)
fixes (#1045)
Fix flake error in init (#1050)
fix (#1053)
fix: pipeline upsert failed to pass parallelism_config to update (#4066)
fix: temporarily skip kmeans notebook (#4092)
fixes (#1051)
Fix missing absolute import error (#1057)
Fix flake8 error in unit test (#1058)
fixes (#1056)
Fix flake8 error in integ test (#1060)
Fix black format error in test_pickle_dependencies (#1062)
Fix docstyle error under serve (#1065)
Fix docstyle error in builder failure (#1066)
fix black and flake8 formatting (#1069)
Fix format error (#1070)
Fix integ test (#1074)
fix: HuggingFaceProcessor parameterized instance_type when image_uri is absent (#4072)
fix: log message when sdk defaults not applied (#4104)
fix: handle bad jumpstart default session (#4109)
Fix the version information, whl and flake8 (#1085)
Fix JSON serializer error (#1088)
Fix unit test (#1091)
fix format (#1103)
Fix local mode predictor (#1107)
Fix DJLPredictor (#1108)
Fix modelbuilder unit tests (#1118)
fixes (#1136)
fixes (#1165)
fixes (#1166)
fix: auto ml integ tests and add flaky test markers (#4136)
fix model data for JumpStartModel (#4135)
fix: transform step unit test (#4151)
fix: Update pipeline.py and selective_execution_config.py with small fixes (#1099)
fix: Fixed bug in _create_training_details (#4141)
fix: use correct line endings and s3 uris on windows (#4118)
fix: js tagging s3 prefix (#4167)
fix: Update Ec2 instance type to g5.4xlarge in test_huggingface_torch_distributed.py (#4181)
fix: import error in unsupported js regions (#4188)
fix: update local mode schema (#4185)
fix: fix flaky Inference Recommender integration tests (#4156)
fix: clone distribution in validate_distribution (#4205)
Fix hyperlinks in feature_processor.scheduler parameter descriptions (#4208)
Fix master merge formatting (#1186)
Fix master unit tests (#1203)
Fix djl unit tests (#1204)
Fix merge conflicts (#1217)
fix: fix URL links (#4217)
fix: bump urllib3 version (#4223)
fix: relax upper bound on urllib in local mode requirements (#4219)
fixes (#1224)
fix formatting (#1233)
fix byoc unit tests (#1235)
fix byoc unit tests (#1236)
1 parent ef8dd31 commit c3069b3

File tree: 139 files changed (+12212 −25 lines)


.gitignore (+1)

@@ -34,3 +34,4 @@ env/
 **/_repack_script_launcher.sh
 tests/data/**/_repack_model.py
 tests/data/experiment/sagemaker-dev-1.0.tar.gz
+src/sagemaker/serve/tmp_workspace

CHANGELOG.md (+1 −1)

@@ -5955,4 +5955,4 @@

 ## 1.0.0

-* Initial commit
+* Initial commit

MANIFEST.in (+1)

@@ -1,6 +1,7 @@
 recursive-include src/sagemaker *.py

 include src/sagemaker/image_uri_config/*.json
+include src/sagemaker/serve/requirements.txt
 recursive-include requirements *

 include VERSION

doc/api/inference/model_builder.rst (+16, new file)

Model Builder
-------------

This module contains classes related to the Amazon SageMaker Model Builder.

.. autoclass:: sagemaker.serve.builder.model_builder.ModelBuilder

.. automethod:: sagemaker.serve.builder.model_builder.ModelBuilder.build

.. automethod:: sagemaker.serve.builder.model_builder.ModelBuilder.save

.. autoclass:: sagemaker.serve.spec.inference_spec.InferenceSpec

.. autoclass:: sagemaker.serve.builder.schema_builder.SchemaBuilder

.. autoclass:: sagemaker.serve.marshalling.custom_payload_translator.CustomPayloadTranslator

doc/overview.rst (+114)

@@ -820,6 +820,120 @@

    predictor.predict("this is the best day of my life", {"ContentType": "application/x-text"})
Deploy a pre-trained model using the SageMaker ModelBuilder class
-----------------------------------------------------------------

If you prefer a streamlined solution to build and deploy your model, the SageMaker Python SDK offers additional APIs that apply intelligent defaults so you can create a SageMaker-deployable model in fewer steps. ``ModelBuilder`` simplifies model creation by performing the following tasks for you:

- Converts machine learning models trained with frameworks such as XGBoost or PyTorch into SageMaker-deployable models with a single line of code.
- Selects a container automatically based on the model framework, so you don't have to specify one manually. You can still bring your own container by passing your own URI to ``ModelBuilder``.
- Serializes data on the client side before sending it to the server for inference, and deserializes the results returned by the server. Data is correctly formatted without manual processing.
- Captures the dependencies, libraries, and packages needed by your model automatically. You don't have to package the dependencies and upload them to S3. The deployment environment matches the development environment to ensure a smooth transition from development to deployment.

Build your model with ModelBuilder
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``ModelBuilder`` class can take your framework (XGBoost or PyTorch) model and convert it into a SageMaker-deployable model, generating the artifacts according to the model server. If you don't want to supply a model directly, you can provide inference code to specify a model source; this method is discussed in the following sections. ``ModelBuilder`` offers many options, but if your model doesn't require extensive customization and you want to deploy immediately, you can supply at minimum a framework model, input, and output. If you did not set up a default role ARN, you need to provide that as well. To view all options offered in ``ModelBuilder``, see the `ModelBuilder documentation <https://sagemaker.readthedocs.io/en/stable/api/inference/model_builder.html>`_.

In the following code example, ``ModelBuilder`` is called with a framework model and an instance of ``SchemaBuilder`` with minimal arguments (to infer the corresponding functions for serializing and deserializing the endpoint input and output). No container is specified and no packaged dependencies are passed; SageMaker saves you preparation time and effort by automatically inferring these resources when you build your model.

.. code:: python

    model_builder = ModelBuilder(
        model=model,
        schema_builder=SchemaBuilder(X_test, y_pred),
    )

    model_builder.build()

If you want to bring your own container, you can also specify the image URI and set the mode argument to ``Mode.LOCAL_CONTAINER``, as shown in the following example; a sketch of this mode switch follows the example. When you want to deploy to SageMaker, change the argument to ``Mode.SAGEMAKER_ENDPOINT``.

.. code:: python

    model_builder = ModelBuilder(
        model=model,
        model_server=ModelServer.TORCHSERVE,
        schema_builder=SchemaBuilder(X_test, y_pred),
        image_uri="12345678910.dkr.ecr.ap-southeast-2.amazonaws.com/byoc-image:xgb-1.7-1",
    )
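
The following is a minimal sketch of that mode switch. The ``Mode`` import path is an assumption based on this commit's layout (the coverage command below omits ``*function_pointers*``), so check the ModelBuilder documentation for the exact location.

.. code:: python

    from sagemaker.serve.mode.function_pointers import Mode  # assumed import path

    # Build and test against a local container first ...
    model_builder = ModelBuilder(
        model=model,
        schema_builder=SchemaBuilder(X_test, y_pred),
        mode=Mode.LOCAL_CONTAINER,
    )
    model = model_builder.build()

    # ... then rebuild with mode=Mode.SAGEMAKER_ENDPOINT when you are ready to deploy.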
Define serialization and deserialization methods with SchemaBuilder
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``SchemaBuilder`` accepts sample input and output and can infer the corresponding functions for serializing and deserializing the endpoint input and output. For example, the following works:

.. code:: python

    input = "How is the demo going?"
    output = "Comment la démo va-t-elle?"
    schema = SchemaBuilder(input, output)

However, you might want to further customize your serialization and deserialization functions. For example, you might want to pass an image and have your serializer convert the image to a tensor before prediction. You can define this translation in a ``CustomPayloadTranslator`` for input, output, or both, and pass the translators to ``SchemaBuilder``, as in the sketch after the signature below. For an example that creates custom input and output translators with ``CustomPayloadTranslator``, see the `ModelBuilder examples <https://github.com/aws-samples/sagemaker-hosting/SageMaker-Model-Builder>`_.

.. code:: python

    class SchemaBuilder(
        sample_input: Any,
        sample_output: Any,
        input_translator: CustomPayloadTranslator = None,
        output_translator: CustomPayloadTranslator = None
    )

You can use ``SchemaBuilder`` to standardize the serialization and deserialization functions for endpoint input and output for a model server. Insert those functions into one definition of ``SchemaBuilder`` and pass this definition to all instances of ``ModelBuilder`` for the same model server. As a result, you no longer have to maintain common implementation details in individualized scripts for each model.
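
As an illustration of the image-to-tensor case above, here is a minimal sketch of a custom input translator. The ``serialize_payload_to_bytes``/``deserialize_payload_from_stream`` hook names follow the ModelBuilder examples linked above, and the NumPy round-trip is an illustrative assumption rather than a required format.

.. code:: python

    import io

    import numpy as np
    from PIL import Image

    from sagemaker.serve.marshalling.custom_payload_translator import CustomPayloadTranslator

    class ImageToTensorTranslator(CustomPayloadTranslator):
        # Runs on the client side: turn a PIL image into bytes for the request body.
        def serialize_payload_to_bytes(self, payload: Image.Image) -> bytes:
            array = np.asarray(payload, dtype="float32") / 255.0  # illustrative scaling
            buffer = io.BytesIO()
            np.save(buffer, array)
            return buffer.getvalue()

        # Runs on the server side: rebuild the array from the streamed bytes.
        def deserialize_payload_from_stream(self, stream) -> np.ndarray:
            return np.load(io.BytesIO(stream.read()))

    schema = SchemaBuilder(
        sample_input=Image.open("sample.png"),  # hypothetical sample image
        sample_output=y_pred,
        input_translator=ImageToTensorTranslator(),
    )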
Load the model with a custom function using InferenceSpec
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As previously mentioned, you don't have to supply a framework model directly to ``ModelBuilder``. You can instead pass an instance of ``InferenceSpec`` with the ``load`` and ``invoke`` functions defined. The ``load`` function contains the custom logic that creates the model, and ``invoke`` instructs SageMaker how to pass the input payload to the model.

The following example uses ``InferenceSpec`` to generate a model with the Hugging Face ``pipeline``.

.. code:: python

    from sagemaker.serve.spec.inference_spec import InferenceSpec
    from transformers import pipeline

    class MyInferenceSpec(InferenceSpec):
        def load(self, model_dir: str):
            return pipeline("translation_en_to_fr", model="t5-small")

        def invoke(self, input, model):
            return model(input)

    inf_spec = MyInferenceSpec()

    model_builder = ModelBuilder(
        inference_spec=inf_spec,
        schema_builder=SchemaBuilder(X_test, y_pred)
    )

For sample notebooks that demonstrate the use of ``InferenceSpec``, see the `ModelBuilder examples <https://github.com/aws-samples/sagemaker-hosting/SageMaker-Model-Builder>`_.
Build your model and deploy
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Call the ``build()`` function to create your deployable model. This step creates an ``inference.py`` file in your working directory with the code necessary to create your schema, run serialization and deserialization of inputs and outputs, and perform other user-specified custom logic.

.. code:: python

    # Build the model according to the model server specification and save it as files in the working directory
    model = model_builder.build()

Deploy your model with the model's existing ``deploy()`` method. A model constructed from ``ModelBuilder`` additionally enables live logging during deployment.

.. code:: python

    predictor = model.deploy(
        initial_instance_count=1,
        instance_type="ml.c6i.xlarge"
    )

For more examples of using ``ModelBuilder`` to build your models, see the `ModelBuilder sample notebooks <https://github.com/aws-samples/sagemaker-hosting/SageMaker-Model-Builder>`_.

 Fine-tune a Model and Deploy to a SageMaker Endpoint
 ====================================================

inference-experience-dev-tester.sh (+53, new file)

#!/usr/bin/env zsh
# Run the serve unit tests (and optionally integration tests) in two conda environments,
# then print a coverage report for the serve code paths.
# Usage: inference-experience-dev-tester.sh <py38-env> <py310-env> <run-integ: true|false> <clear-tox-cache: true|false>
trap "exit" INT

eval "$(conda shell.zsh hook)"

conda activate $1

echo Installing tox if necessary
pip install --upgrade tox

echo Running unit testing in Python3.8 Conda Environment

if "$4";
then
    echo Clearing .tox cache for Python3.8
    rm -r .tox/py38
fi

tox -e py38 -- tests/unit/sagemaker/serve/.

if "$3";
then
    echo Running Python3.8 Integration Tests
    tox -e py38 -- tests/integ/sagemaker/serve/.
fi

conda deactivate

conda activate $2

echo Installing tox if necessary
pip install --upgrade tox

echo Running unit testing in Python3.10 Conda Environment

if "$4";
then
    echo Clearing .tox cache for Python3.10
    rm -r .tox/py310
fi

tox -e py310 -- tests/unit/sagemaker/serve/.

if "$3";
then
    echo Running Python3.10 Integration Tests
    tox -e py310 -- tests/integ/sagemaker/serve/.
fi

conda deactivate

echo Coverage report after testing:

coverage report -i --fail-under=75 --include "*/serve/*" --omit '*in_process*,*interceptors*,*__init__*,*build_model*,*function_pointers*'

requirements/extras/test_requirements.txt (+9)

@@ -29,3 +29,12 @@ docker>=5.0.2,<7.0.0
 PyYAML==6.0
 pyspark==3.3.1
 sagemaker-feature-store-pyspark-3.3
+# TODO find workaround
+xgboost>=1.6.2,<=1.7.6
+pillow>=9.5.0,<=10.0.0
+torch@https://download.pytorch.org/whl/cpu/torch-2.0.0%2Bcpu-cp310-cp310-linux_x86_64.whl
+torchvision@https://download.pytorch.org/whl/cpu/torchvision-0.15.1%2Bcpu-cp310-cp310-linux_x86_64.whl
+transformers==4.32.0
+sentencepiece==0.1.99
+# https://github.com/triton-inference-server/server/issues/6246
+tritonclient[http]<2.37.0

setup.py (+25 −1)

@@ -15,6 +15,7 @@

 import os
 from glob import glob
+import sys

 from setuptools import find_packages, setup

@@ -63,6 +64,12 @@ def read_requirements(filename):
     "jsonschema",
     "platformdirs",
     "tblib==1.7.0",
+    "urllib3<1.27",
+    "uvicorn==0.22.0",
+    "fastapi==0.95.2",
+    "requests",
+    "docker",
+    "tqdm",
 ]

 # Specific use case dependencies

@@ -77,7 +84,21 @@ def read_requirements(filename):
 # Meta dependency groups
 extras["all"] = [item for group in extras.values() for item in group]
 # Tests specific dependencies (do not need to be included in 'all')
-extras["test"] = (read_requirements("requirements/extras/test_requirements.txt"),)
+test_dependencies = read_requirements("requirements/extras/test_requirements.txt")
+# remove torch and torchvision if python version is not 3.10
+if sys.version_info.minor != 10:
+    test_dependencies = [
+        module
+        for module in test_dependencies
+        if not (
+            module.startswith("transformers")
+            or module.startswith("sentencepiece")
+            or module.startswith("torch")
+            or module.startswith("torchvision")
+        )
+    ]
+
+extras["test"] = (test_dependencies,)

@@ -110,4 +131,7 @@ def read_requirements(filename):
             "sagemaker-upgrade-v2=sagemaker.cli.compatibility.v2.sagemaker_upgrade_v2:main",
         ]
     },
+    scripts=[
+        "src/sagemaker/serve/model_server/triton/pack_conda_env.sh",
+    ],
 )

src/sagemaker/base_deserializers.py (+79)

@@ -22,6 +22,7 @@

 import numpy as np
 from six import with_metaclass
+import cloudpickle

 from sagemaker.utils import DeferredError

@@ -332,3 +333,81 @@ def deserialize(self, stream, content_type):
             return [json.loads(line) for line in lines]
         finally:
             stream.close()
+
+
+class TorchTensorDeserializer(SimpleBaseDeserializer):
+    """Deserialize a stream to a torch.Tensor.
+
+    Args:
+        stream (botocore.response.StreamingBody): Data to be deserialized.
+        content_type (str): The MIME type of the data.
+
+    Returns:
+        torch.Tensor: The data deserialized into a torch Tensor.
+    """
+
+    def __init__(self, accept="tensor/pt"):
+        super(TorchTensorDeserializer, self).__init__(accept=accept)
+        self.numpy_deserializer = NumpyDeserializer()
+        try:
+            from torch import from_numpy
+
+            self.convert_npy_to_tensor = from_numpy
+        except ImportError:
+            raise Exception("Unable to import pytorch.")
+
+    def deserialize(self, stream, content_type="tensor/pt"):
+        """Deserialize streamed data to a torch.Tensor.
+
+        See https://pytorch.org/docs/stable/generated/torch.from_numpy.html
+
+        Args:
+            stream (botocore.response.StreamingBody): Data to be deserialized.
+            content_type (str): The MIME type of the data.
+
+        Returns:
+            torch.Tensor: The deserialized torch.Tensor.
+        """
+        try:
+            numpy_array = self.numpy_deserializer.deserialize(
+                stream=stream, content_type="application/x-npy"
+            )
+            return self.convert_npy_to_tensor(numpy_array)
+        except Exception:
+            raise ValueError(
+                "Unable to deserialize your data to torch.Tensor.\
+                Please provide custom deserializer in InferenceSpec."
+            )
+
+
+class PickleDeserializer(SimpleBaseDeserializer):
+    """Deserialize a stream to an object using the cloudpickle module.
+
+    Args:
+        stream (botocore.response.StreamingBody): Data to be deserialized.
+        content_type (str): The MIME type of the data.
+
+    Returns:
+        object: The unpickled object.
+    """
+
+    def __init__(self, accept="application/x-pkl"):
+        super(PickleDeserializer, self).__init__(accept)
+
+    def deserialize(self, stream, content_type="application/x-pkl"):
+        """Deserialize pickled data from an inference endpoint.
+
+        Args:
+            stream (botocore.response.StreamingBody): Data to be deserialized.
+            content_type (str): The MIME type of the data.
+
+        Returns:
+            object: The object deserialized with cloudpickle.
+        """
+        try:
+            return cloudpickle.loads(stream.read())
+        except Exception:
+            raise ValueError(
+                "Cannot deserialize bytes to object with cloudpickle.\
+                Please provide custom deserializer."
+            )
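
A quick local sanity check for the two new deserializers, assuming torch and cloudpickle are installed; io.BytesIO stands in for botocore's StreamingBody, since both expose read():

    import io

    import cloudpickle
    import torch

    from sagemaker.base_deserializers import PickleDeserializer, TorchTensorDeserializer
    from sagemaker.base_serializers import NumpySerializer

    # PickleDeserializer: round-trip an arbitrary object through cloudpickle bytes.
    payload = cloudpickle.dumps({"scores": [0.1, 0.9]})
    obj = PickleDeserializer().deserialize(io.BytesIO(payload), "application/x-pkl")

    # TorchTensorDeserializer consumes an application/x-npy stream and returns a torch.Tensor.
    npy_bytes = NumpySerializer().serialize(torch.ones(2, 2).numpy())
    tensor = TorchTensorDeserializer().deserialize(io.BytesIO(npy_bytes), "tensor/pt")
    assert torch.equal(tensor, torch.ones(2, 2))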
