Add EI Dockerfile for 1.11 (#163)

ChoiByungWook · web-flow · commit 79d7a579adb0 · 2019-02-11T17:08:53.000-08:00
diff --git a/README.rst b/README.rst
@@ -134,16 +134,15 @@ Then run:
     # All build instructions assumes you're building from the same directory as the Dockerfile.
 
     # CPU
-    docker build -t <image_name>:<tag> --build-arg py_version=<py_version> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.cpu .
+    docker build -t <image_name>:<tag> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.cpu .
 
     # GPU
-    docker build -t <image_name>:<tag> --build-arg py_version=<py_version> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.gpu .
+    docker build -t <image_name>:<tag> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.gpu .
 
 ::
 
     # Example
-    docker build -t preprod-tensorflow:1.6.0-cpu-py2 --build-arg py_version=2 \
-    --build-arg framework_installable=tensorflow-1.6.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .
+    docker build -t preprod-tensorflow:1.6.0-cpu-py2 --build-arg framework_installable=tensorflow-1.6.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .
 
 The dockerfiles for 1.4 and 1.5 build from source instead, so when building those, you don't need to download the wheel beforehand:
 
@@ -188,7 +187,7 @@ versions of the frameworks are automatically built into containers when you use
 download them as binary files and import them into your own Docker containers. The enhanced TensorFlow serving binaries are available on Amazon S3 at https://s3.console.aws.amazon.com/s3/buckets/amazonei-tensorflow.
 
 The SageMaker TensorFlow containers with Amazon Elastic Inference support were built from the
-`EI Dockerfile <https://github.com/aws/sagemaker-tensorflow-container/blob/master/docker/1.12.0/final/py2/Dockerfile.ei>`__ starting at TensorFlow 1.12.0 and above.
+`EI Dockerfile <https://github.com/aws/sagemaker-tensorflow-container/blob/master/docker/1.11.0/final/py2/Dockerfile.ei>`__ starting at TensorFlow 1.11.0 and above.
 
 The instructions for building the SageMaker TensorFlow containers with Amazon Elastic Inference support are similar to the steps `above <https://github.com/aws/sagemaker-tensorflow-container#final-images>`__.
 
@@ -197,9 +196,9 @@ The only difference is the addition of the ``tensorflow_model_server`` build-arg
 ::
 
     # Example
-    docker build -t preprod-tensorflow-ei:1.12.0-cpu-py2 --build-arg py_version=2 \
-    --build-arg tensorflow_model_server AmazonEI_TensorFlow_Serving_v1.12_v1 \
-    --build-arg framework_installable=tensorflow-1.12.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .
+    docker build -t preprod-tensorflow-ei:1.11.0-cpu-py2 \
+    --build-arg tensorflow_model_server=AmazonEI_TensorFlow_Serving_v1.11_v1 \
+    --build-arg framework_installable=tensorflow-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .
 
 
 * For information about downloading the enhanced versions of TensorFlow serving, see `Using TensorFlow Models with Amazon EI <https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ei-tensorflow.html>`__.
@@ -273,10 +272,10 @@ Functional Tests
 Functional tests require your Docker image to be within an `Amazon ECR repository <https://docs
 .aws.amazon.com/AmazonECS/latest/developerguide/ECS_Console_Repositories.html>`__.
 
-The Docker-base-name is your `ECR repository namespace <https://docs.aws.amazon
+The `docker-base-name` is your `ECR repository namespace <https://docs.aws.amazon
 .com/AmazonECR/latest/userguide/Repositories.html>`__.
 
-The instance-type is your specified `Amazon SageMaker Instance Type
+The `instance-type` is your specified `Amazon SageMaker Instance Type
 <https://aws.amazon.com/sagemaker/pricing/instance-types/>`__ that the functional test will run on.
 
 
@@ -292,7 +291,6 @@ SageMaker <https://aws.amazon.com/sagemaker/>`__, then use:
 ::
 
     # Required arguments for integration tests are found in test/functional/conftest.py
-
     pytest test/functional --aws-id <your_aws_id> \
                            --docker-base-name <your_docker_image> \
                            --instance-type <amazon_sagemaker_instance_type> \
@@ -306,6 +304,19 @@ SageMaker <https://aws.amazon.com/sagemaker/>`__, then use:
                            --instance-type ml.m4.xlarge \
                            --tag 1.0
 
+If you want to run a functional end-to-end test for your Elastic Inference container, you will need to provide an `accelerator-type` as an additional pytest argument.
+
+The `accelerator-type` is your specified `Amazon Elastic Inference Accelerator <https://aws.amazon.com/sagemaker/pricing/instance-types/>`__ type that will be attached to your instance type.
+
+::
+
+    # Example for running Elastic Inference functional test
+    pytest test/functional/test_elastic_inference.py --aws-id 12345678910 \
+                                                     --docker-base-name preprod-tensorflow \
+                                                     --instance-type ml.m4.xlarge \
+                                                     --accelerator-type ml.eia1.medium \
+                                                     --tag 1.0
+
 Contributing
 ------------
 
diff --git a/docker/1.11.0/final/py2/Dockerfile.ei b/docker/1.11.0/final/py2/Dockerfile.ei
@@ -0,0 +1,80 @@
+FROM ubuntu:16.04
+
+MAINTAINER Amazon AI
+
+ARG framework_installable
+ARG framework_support_installable=sagemaker_tensorflow_container-1.0.0.tar.gz
+ARG tensorflow_model_server
+
+WORKDIR /root
+
+COPY $framework_installable .
+COPY $framework_support_installable .
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+        build-essential \
+        curl \
+        git \
+        libcurl3-dev \
+        libfreetype6-dev \
+        libpng12-dev \
+        libzmq3-dev \
+        pkg-config \
+        python-dev \
+        rsync \
+        software-properties-common \
+        unzip \
+        zip \
+        zlib1g-dev \
+        openjdk-8-jdk \
+        openjdk-8-jre-headless \
+        wget \
+        vim \
+        iputils-ping \
+        nginx \
+        && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*
+
+RUN curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && \
+    python get-pip.py && \
+    rm get-pip.py
+
+RUN pip --no-cache-dir install \
+        numpy \
+        scipy \
+        sklearn \
+        pandas \
+        Pillow \
+        h5py
+
+# TODO: upgrade to tf serving 1.8, which requires more work with updating
+# dependencies. See current work in progress in tfserving-1.8 branch.
+ENV TF_SERVING_VERSION=1.7.0
+
+RUN pip install numpy boto3 six awscli flask==0.11 Jinja2==2.9 tensorflow-serving-api==$TF_SERVING_VERSION gevent gunicorn
+
+# Install TF Serving pkg
+COPY $tensorflow_model_server /usr/bin/tensorflow_model_server
+
+# Update libstdc++6, as required by tensorflow-serving >= 1.6: https://github.com/tensorflow/serving/issues/819
+RUN add-apt-repository ppa:ubuntu-toolchain-r/test -y && \
+    apt-get update && \
+    apt-get install -y libstdc++6
+
+RUN framework_installable_local=$(basename $framework_installable) && \
+    framework_support_installable_local=$(basename $framework_support_installable) && \
+    \
+    pip install --no-cache --upgrade $framework_installable_local && \
+    pip install $framework_support_installable_local && \
+    pip install "sagemaker-tensorflow>=1.10,<1.11" &&\
+    \
+    rm $framework_installable_local && \
+    rm $framework_support_installable_local
+
+# Set environment variables for MKL
+# TODO: investigate the right value for OMP_NUM_THREADS
+ENV KMP_AFFINITY=granularity=fine,compact,1,0 KMP_BLOCKTIME=1 KMP_SETTINGS=0
+
+# entry.py comes from sagemaker-container-support
+ENTRYPOINT ["entry.py"]
diff --git a/src/tf_container/serve.py b/src/tf_container/serve.py
@@ -259,8 +259,7 @@ def _default_input_fn(self, serialized_data, content_type):
 
     @classmethod
     def from_module(cls, m, grpc_proxy_client):
-        """Initialize a Transformer using functions supplied by the given module. The module
-        must supply a ``model_fn()`` that returns an MXNet Module.
+        """Initialize a Transformer using functions supplied by the given module.
 
         If the module contains a ``transform_fn``, it will be used to handle incoming request
         data, execute the model prediction, and generation of response content.
diff --git a/test/functional/__init__.py b/test/functional/__init__.py
@@ -1,13 +1,16 @@
 #  Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
-#  
+#
 #  Licensed under the Apache License, Version 2.0 (the "License").
 #  You may not use this file except in compliance with the License.
 #  A copy of the License is located at
-#  
+#
 #      http://www.apache.org/licenses/LICENSE-2.0
-#  
-#  or in the "license" file accompanying this file. This file is distributed 
-#  on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
-#  express or implied. See the License for the specific language governing 
+#
+#  or in the "license" file accompanying this file. This file is distributed
+#  on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
+#  express or implied. See the License for the specific language governing
 #  permissions and limitations under the License.
 
+# EI is currently only supported in the following regions
+# regions were derived from https://aws.amazon.com/machine-learning/elastic-inference/pricing/
+EI_SUPPORTED_REGIONS = ['us-east-1', 'us-east-2', 'us-west-2', 'eu-west-1', 'ap-northeast-1', 'ap-northeast-2']
diff --git a/test/functional/conftest.py b/test/functional/conftest.py
@@ -1,14 +1,14 @@
 #  Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
-#  
+#
 #  Licensed under the Apache License, Version 2.0 (the "License").
 #  You may not use this file except in compliance with the License.
 #  A copy of the License is located at
-#  
+#
 #      http://www.apache.org/licenses/LICENSE-2.0
-#  
-#  or in the "license" file accompanying this file. This file is distributed 
-#  on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
-#  express or implied. See the License for the specific language governing 
+#
+#  or in the "license" file accompanying this file. This file is distributed
+#  on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
+#  express or implied. See the License for the specific language governing
 #  permissions and limitations under the License.
 
 import logging
@@ -29,6 +29,7 @@ def pytest_addoption(parser):
     parser.addoption('--aws-id')
     parser.addoption('--docker-base-name', default='preprod-tensorflow')
     parser.addoption('--instance-type')
+    parser.addoption('--accelerator-type', default=None)
     parser.addoption('--region', default='us-west-2')
     parser.addoption('--tag')
 
@@ -48,6 +49,11 @@ def instance_type(request):
     return request.config.getoption('--instance-type')
 
 
+@pytest.fixture(scope='session')
+def accelerator_type(request):
+    return request.config.getoption('--accelerator-type')
+
+
 @pytest.fixture(scope='session')
 def region(request):
     return request.config.getoption('--region')
diff --git a/test/functional/test_elastic_inference.py b/test/functional/test_elastic_inference.py
@@ -0,0 +1,78 @@
+#  Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+#
+#  Licensed under the Apache License, Version 2.0 (the "License").
+#  You may not use this file except in compliance with the License.
+#  A copy of the License is located at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  or in the "license" file accompanying this file. This file is distributed
+#  on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
+#  express or implied. See the License for the specific language governing
+#  permissions and limitations under the License.
+
+import logging
+import os
+
+import numpy as np
+import pytest
+from sagemaker.tensorflow import TensorFlowModel
+from sagemaker.utils import sagemaker_timestamp
+
+from test.functional import EI_SUPPORTED_REGIONS
+from test.integ.conftest import SCRIPT_PATH
+from test.resources.python_sdk.timeout import timeout_and_delete_endpoint_by_name
+
+logger = logging.getLogger(__name__)
+logging.getLogger('boto3').setLevel(logging.INFO)
+logging.getLogger('botocore').setLevel(logging.INFO)
+logging.getLogger('factory.py').setLevel(logging.INFO)
+logging.getLogger('auth.py').setLevel(logging.INFO)
+logging.getLogger('connectionpool.py').setLevel(logging.INFO)
+logging.getLogger('session.py').setLevel(logging.DEBUG)
+logging.getLogger('sagemaker').setLevel(logging.DEBUG)
+
+
+@pytest.fixture(autouse=True)
+def skip_if_no_accelerator(accelerator_type):
+    if accelerator_type is None:
+        pytest.skip('Skipping because accelerator type was not provided')
+
+
+@pytest.fixture(autouse=True)
+def skip_if_non_supported_ei_region(region):
+    if region not in EI_SUPPORTED_REGIONS:
+        pytest.skip('EI is not supported in {}'.format(region))
+
+
+@pytest.fixture
+def pretrained_model_data(region):
+    return 's3://sagemaker-sample-data-{}/tensorflow/model/resnet/resnet_50_v2_fp32_NCHW.tar.gz'.format(region)
+
+
+# based on https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_using_elastic_inference_with_your_own_model/tensorflow_pretrained_model_elastic_inference.ipynb
+@pytest.mark.skip_if_non_supported_ei_region
+@pytest.mark.skip_if_no_accelerator
+def test_deploy_elastic_inference_with_pretrained_model(pretrained_model_data, docker_image_uri, sagemaker_session, instance_type, accelerator_type):
+    resource_path = os.path.join(SCRIPT_PATH, '../resources')
+    endpoint_name = 'test-tf-ei-deploy-model-{}'.format(sagemaker_timestamp())
+
+    with timeout_and_delete_endpoint_by_name(endpoint_name=endpoint_name, sagemaker_session=sagemaker_session,
+                                             minutes=20):
+        tensorflow_model = TensorFlowModel(model_data=pretrained_model_data,
+                                           entry_point='default_entry_point.py',
+                                           source_dir=resource_path,
+                                           role='SageMakerRole',
+                                           image=docker_image_uri,
+                                           sagemaker_session=sagemaker_session)
+
+        logger.info('deploying model to endpoint: {}'.format(endpoint_name))
+        predictor = tensorflow_model.deploy(initial_instance_count=1,
+                                            instance_type=instance_type,
+                                            accelerator_type=accelerator_type,
+                                            endpoint_name=endpoint_name)
+
+        random_input = np.random.rand(1, 1, 3, 3)
+
+        predict_response = predictor.predict({'input': random_input.tolist()})
+        assert predict_response['outputs']['probabilities']
diff --git a/test/integ/conftest.py b/test/integ/conftest.py
@@ -1,14 +1,14 @@
 #  Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
-#  
+#
 #  Licensed under the Apache License, Version 2.0 (the "License").
 #  You may not use this file except in compliance with the License.
 #  A copy of the License is located at
-#  
+#
 #      http://www.apache.org/licenses/LICENSE-2.0
-#  
-#  or in the "license" file accompanying this file. This file is distributed 
-#  on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either 
-#  express or implied. See the License for the specific language governing 
+#
+#  or in the "license" file accompanying this file. This file is distributed
+#  on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
+#  express or implied. See the License for the specific language governing
 #  permissions and limitations under the License.
 
 import logging
@@ -37,7 +37,7 @@ def pytest_addoption(parser):
     parser.addoption('--tag', required=True)
     parser.addoption('--region', default='us-west-2')
     parser.addoption('--framework-version', required=True)
-    parser.addoption('--processor', required=True, choices=['gpu','cpu'])
+    parser.addoption('--processor', required=True, choices=['gpu', 'cpu'])
 
 
 @pytest.fixture(scope='session')
diff --git a/test/resources/default_entry_point.py b/test/resources/default_entry_point.py
@@ -0,0 +1 @@
+# use default SageMaker defined ``input_fn``, ``predict_fn``, and ``output_fn``
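
Note on the skip behavior in test/functional/test_elastic_inference.py: the two autouse fixtures skip the test when no `--accelerator-type` is passed or when `--region` is not in `EI_SUPPORTED_REGIONS`. The sketch below restates that decision as a plain function so it can be checked without pytest or AWS access. The helper name `ei_skip_reason` is hypothetical and not part of this commit, and checking the accelerator before the region is an assumption, since the commit uses two independent fixtures.

```python
# Regions copied from EI_SUPPORTED_REGIONS in test/functional/__init__.py.
EI_SUPPORTED_REGIONS = ['us-east-1', 'us-east-2', 'us-west-2',
                        'eu-west-1', 'ap-northeast-1', 'ap-northeast-2']


def ei_skip_reason(accelerator_type, region, supported_regions=EI_SUPPORTED_REGIONS):
    """Hypothetical helper: return the skip message, or None if the test should run."""
    # Mirrors skip_if_no_accelerator: --accelerator-type defaults to None.
    if accelerator_type is None:
        return 'Skipping because accelerator type was not provided'
    # Mirrors skip_if_non_supported_ei_region.
    if region not in supported_regions:
        return 'EI is not supported in {}'.format(region)
    return None


if __name__ == '__main__':
    print(ei_skip_reason(None, 'us-west-2'))
    print(ei_skip_reason('ml.eia1.medium', 'eu-central-1'))
    print(ei_skip_reason('ml.eia1.medium', 'us-west-2'))
```

With the default `--region` of `us-west-2` from test/functional/conftest.py, the test would run as soon as an accelerator type is supplied.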
