Commit 79d7a57

Add EI Dockerfile for 1.11 (#163)
1 parent a3230ca commit 79d7a57

8 files changed: +210 -32 lines changed
README.rst

Lines changed: 22 additions & 11 deletions
@@ -134,16 +134,15 @@ Then run:
     # All build instructions assumes you're building from the same directory as the Dockerfile.

     # CPU
-    docker build -t <image_name>:<tag> --build-arg py_version=<py_version> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.cpu .
+    docker build -t <image_name>:<tag> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.cpu .

     # GPU
-    docker build -t <image_name>:<tag> --build-arg py_version=<py_version> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.gpu .
+    docker build -t <image_name>:<tag> --build-arg framework_installable=<path to tensorflow binary> -f Dockerfile.gpu .

 ::

     # Example
-    docker build -t preprod-tensorflow:1.6.0-cpu-py2 --build-arg py_version=2 \
-    --build-arg framework_installable=tensorflow-1.6.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .
+    docker build -t preprod-tensorflow:1.6.0-cpu-py2 --build-arg framework_installable=tensorflow-1.6.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .

 The dockerfiles for 1.4 and 1.5 build from source instead, so when building those, you don't need to download the wheel beforehand:

@@ -188,7 +187,7 @@ versions of the frameworks are automatically built into containers when you use
 download them as binary files and import them into your own Docker containers. The enhanced TensorFlow serving binaries are available on Amazon S3 at https://s3.console.aws.amazon.com/s3/buckets/amazonei-tensorflow.

 The SageMaker TensorFlow containers with Amazon Elastic Inference support were built from the
-`EI Dockerfile <https://github.com/aws/sagemaker-tensorflow-container/blob/master/docker/1.12.0/final/py2/Dockerfile.ei>`__ starting at TensorFlow 1.12.0 and above.
+`EI Dockerfile <https://github.com/aws/sagemaker-tensorflow-container/blob/master/docker/1.11.0/final/py2/Dockerfile.ei>`__ starting at TensorFlow 1.11.0 and above.

 The instructions for building the SageMaker TensorFlow containers with Amazon Elastic Inference support are similar to the steps `above <https://github.com/aws/sagemaker-tensorflow-container#final-images>`__.

@@ -197,9 +196,9 @@ The only difference is the addition of the ``tensorflow_model_server`` build-arg
 ::

     # Example
-    docker build -t preprod-tensorflow-ei:1.12.0-cpu-py2 --build-arg py_version=2 \
-    --build-arg tensorflow_model_server AmazonEI_TensorFlow_Serving_v1.12_v1 \
-    --build-arg framework_installable=tensorflow-1.12.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .
+    docker build -t preprod-tensorflow-ei:1.11.0-cpu-py2 \
+    --build-arg tensorflow_model_server AmazonEI_TensorFlow_Serving_v1.11_v1 \
+    --build-arg framework_installable=tensorflow-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl -f Dockerfile.cpu .


 * For information about downloading the enhanced versions of TensorFlow serving, see `Using TensorFlow Models with Amazon EI <https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ei-tensorflow.html>`__.
@@ -273,10 +272,10 @@ Functional Tests
 Functional tests require your Docker image to be within an `Amazon ECR repository <https://docs
 .aws.amazon.com/AmazonECS/latest/developerguide/ECS_Console_Repositories.html>`__.

-The Docker-base-name is your `ECR repository namespace <https://docs.aws.amazon
+The `docker-base-name` is your `ECR repository namespace <https://docs.aws.amazon
 .com/AmazonECR/latest/userguide/Repositories.html>`__.

-The instance-type is your specified `Amazon SageMaker Instance Type
+The `instance-type` is your specified `Amazon SageMaker Instance Type
 <https://aws.amazon.com/sagemaker/pricing/instance-types/>`__ that the functional test will run on.


@@ -292,7 +291,6 @@ SageMaker <https://aws.amazon.com/sagemaker/>`__, then use:
 ::

     # Required arguments for integration tests are found in test/functional/conftest.py
-
     pytest test/functional --aws-id <your_aws_id> \
     --docker-base-name <your_docker_image> \
     --instance-type <amazon_sagemaker_instance_type> \
@@ -306,6 +304,19 @@ SageMaker <https://aws.amazon.com/sagemaker/>`__, then use:
     --instance-type ml.m4.xlarge \
     --tag 1.0

+If you want to run a functional end to end test for your Elastic Inference container, you will need to provide an `accelerator_type` as an additional pytest argument.
+
+The `accelerator-type` is your specified `Amazon Elastic Inference Accelerator <https://aws.amazon.com/sagemaker/pricing/instance-types/>`__ type that will be attached to your instance type.
+
+::
+
+    # Example for running Elastic Inference functional test
+    pytest test/functional/test_elastic_inference.py --aws-id 12345678910 \
+    --docker-base-name preprod-tensorflow \
+    --instance-type ml.m4.xlarge \
+    --accelerator-type ml.eia1.medium \
+    --tag 1.0
+
 Contributing
 ------------

docker/1.11.0/final/py2/Dockerfile.ei

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
+FROM ubuntu:16.04
+
+MAINTAINER Amazon AI
+
+ARG framework_installable
+ARG framework_support_installable=sagemaker_tensorflow_container-1.0.0.tar.gz
+ARG tensorflow_model_server
+
+WORKDIR /root
+
+COPY $framework_installable .
+COPY $framework_support_installable .
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    build-essential \
+    curl \
+    git \
+    libcurl3-dev \
+    libfreetype6-dev \
+    libpng12-dev \
+    libzmq3-dev \
+    pkg-config \
+    python-dev \
+    rsync \
+    software-properties-common \
+    unzip \
+    zip \
+    zlib1g-dev \
+    openjdk-8-jdk \
+    openjdk-8-jre-headless \
+    wget \
+    vim \
+    iputils-ping \
+    nginx \
+    && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*
+
+RUN curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && \
+    python get-pip.py && \
+    rm get-pip.py
+
+RUN pip --no-cache-dir install \
+    numpy \
+    scipy \
+    sklearn \
+    pandas \
+    Pillow \
+    h5py
+
+# TODO: upgrade to tf serving 1.8, which requires more work with updating
+# dependencies. See current work in progress in tfserving-1.8 branch.
+ENV TF_SERVING_VERSION=1.7.0
+
+RUN pip install numpy boto3 six awscli flask==0.11 Jinja2==2.9 tensorflow-serving-api==$TF_SERVING_VERSION gevent gunicorn
+
+# Install TF Serving pkg
+COPY $tensorflow_model_server /usr/bin/tensorflow_model_server
+
+# Update libstdc++6, as required by tensorflow-serving >= 1.6: https://github.com/tensorflow/serving/issues/819
+RUN add-apt-repository ppa:ubuntu-toolchain-r/test -y && \
+    apt-get update && \
+    apt-get install -y libstdc++6
+
+RUN framework_installable_local=$(basename $framework_installable) && \
+    framework_support_installable_local=$(basename $framework_support_installable) && \
+    \
+    pip install --no-cache --upgrade $framework_installable_local && \
+    pip install $framework_support_installable_local && \
+    pip install "sagemaker-tensorflow>=1.10,<1.11" &&\
+    \
+    rm $framework_installable_local && \
+    rm $framework_support_installable_local
+
+# Set environment variables for MKL
+# TODO: investigate the right value for OMP_NUM_THREADS
+ENV KMP_AFFINITY=granularity=fine,compact,1,0 KMP_BLOCKTIME=1 KMP_SETTINGS=0
+
+# entry.py comes from sagemaker-container-support
+ENTRYPOINT ["entry.py"]
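
The Dockerfile declares two build arguments without defaults, `framework_installable` and `tensorflow_model_server`, and both are consumed by `COPY`, so the files they name have to be present in the build context. A minimal sketch of assembling the matching `docker build` invocation is shown here. The helper name `ei_build_command` and the `Dockerfile.ei` default are assumptions made for illustration and are not part of this commit. Docker's `--build-arg` flag takes a single `name=value` token, which is the form used in the sketch.

```python
import shlex


def ei_build_command(image, tag, framework_installable, tensorflow_model_server,
                     dockerfile="Dockerfile.ei"):
    """Return the argv list for a `docker build` of the EI image.

    Hypothetical helper: it only assembles the command, it does not run Docker.
    """
    return [
        "docker", "build",
        "-t", "{}:{}".format(image, tag),
        # Each build argument is passed as one name=value token.
        "--build-arg", "framework_installable={}".format(framework_installable),
        "--build-arg", "tensorflow_model_server={}".format(tensorflow_model_server),
        "-f", dockerfile,
        ".",  # build context: must contain both files named above
    ]


example = ei_build_command(
    image="preprod-tensorflow-ei",
    tag="1.11.0-cpu-py2",
    framework_installable="tensorflow-1.11.0-cp27-cp27mu-manylinux1_x86_64.whl",
    tensorflow_model_server="AmazonEI_TensorFlow_Serving_v1.11_v1",
)
# Print a shell-safe rendering of the command for copy and paste.
print(" ".join(shlex.quote(part) for part in example))
```

Returning a list rather than a string keeps the command usable with `subprocess.run` without shell quoting concerns.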

src/tf_container/serve.py

Lines changed: 1 addition & 2 deletions
@@ -259,8 +259,7 @@ def _default_input_fn(self, serialized_data, content_type):

     @classmethod
     def from_module(cls, m, grpc_proxy_client):
-        """Initialize a Transformer using functions supplied by the given module. The module
-        must supply a ``model_fn()`` that returns an MXNet Module.
+        """Initialize a Transformer using functions supplied by the given module.

         If the module contains a ``transform_fn``, it will be used to handle incoming request
         data, execute the model prediction, and generation of response content.

test/functional/__init__.py

Lines changed: 9 additions & 6 deletions
@@ -1,13 +1,16 @@
 # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
-#
+#
 # Licensed under the Apache License, Version 2.0 (the "License").
 # You may not use this file except in compliance with the License.
 # A copy of the License is located at
-#
+#
 # http://www.apache.org/licenses/LICENSE-2.0
-#
-# or in the "license" file accompanying this file. This file is distributed
-# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
-# express or implied. See the License for the specific language governing
+#
+# or in the "license" file accompanying this file. This file is distributed
+# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
+# express or implied. See the License for the specific language governing
 # permissions and limitations under the License.

+# EI is currently only supported in the following regions
+# regions were derived from https://aws.amazon.com/machine-learning/elastic-inference/pricing/
+EI_SUPPORTED_REGIONS = ['us-east-1', 'us-east-2', 'us-west-2', 'eu-west-1', 'ap-northeast-1', 'ap-northeast-2']

test/functional/conftest.py

Lines changed: 12 additions & 6 deletions
@@ -1,14 +1,14 @@
 # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
-#
+#
 # Licensed under the Apache License, Version 2.0 (the "License").
 # You may not use this file except in compliance with the License.
 # A copy of the License is located at
-#
+#
 # http://www.apache.org/licenses/LICENSE-2.0
-#
-# or in the "license" file accompanying this file. This file is distributed
-# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
-# express or implied. See the License for the specific language governing
+#
+# or in the "license" file accompanying this file. This file is distributed
+# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
+# express or implied. See the License for the specific language governing
 # permissions and limitations under the License.

 import logging
@@ -29,6 +29,7 @@ def pytest_addoption(parser):
     parser.addoption('--aws-id')
     parser.addoption('--docker-base-name', default='preprod-tensorflow')
     parser.addoption('--instance-type')
+    parser.addoption('--accelerator-type', default=None)
     parser.addoption('--region', default='us-west-2')
     parser.addoption('--tag')

@@ -48,6 +49,11 @@ def instance_type(request):
     return request.config.getoption('--instance-type')


+@pytest.fixture(scope='session')
+def accelerator_type(request):
+    return request.config.getoption('--accelerator-type')
+
+
 @pytest.fixture(scope='session')
 def region(request):
     return request.config.getoption('--region')
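
The new `--accelerator-type` option defaults to `None`, which is what lets the Elastic Inference test detect that no accelerator was requested. The sketch below illustrates that default behaviour using the standard library's `argparse` as a stand-in, on the assumption that pytest's `parser.addoption` accepts the same `default` keyword in the same way. The function name `build_parser` is hypothetical, and this is not how pytest wires options internally.

```python
import argparse


def build_parser():
    """Mirror the options registered in test/functional/conftest.py (stand-in only)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--aws-id')
    parser.add_argument('--docker-base-name', default='preprod-tensorflow')
    parser.add_argument('--instance-type')
    parser.add_argument('--accelerator-type', default=None)
    parser.add_argument('--region', default='us-west-2')
    parser.add_argument('--tag')
    return parser


# Without the flag, the value is None, so an EI test can choose to skip.
without_flag = build_parser().parse_args(['--instance-type', 'ml.m4.xlarge'])
print(without_flag.accelerator_type)

# With the flag, the value is passed through unchanged.
with_flag = build_parser().parse_args(['--accelerator-type', 'ml.eia1.medium'])
print(with_flag.accelerator_type)
```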
test/functional/test_elastic_inference.py

Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
+# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License").
+# You may not use this file except in compliance with the License.
+# A copy of the License is located at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# or in the "license" file accompanying this file. This file is distributed
+# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
+# express or implied. See the License for the specific language governing
+# permissions and limitations under the License.
+
+import logging
+import os
+
+import numpy as np
+import pytest
+from sagemaker.tensorflow import TensorFlowModel
+from sagemaker.utils import sagemaker_timestamp
+
+from test.functional import EI_SUPPORTED_REGIONS
+from test.integ.conftest import SCRIPT_PATH
+from test.resources.python_sdk.timeout import timeout_and_delete_endpoint_by_name
+
+logger = logging.getLogger(__name__)
+logging.getLogger('boto3').setLevel(logging.INFO)
+logging.getLogger('botocore').setLevel(logging.INFO)
+logging.getLogger('factory.py').setLevel(logging.INFO)
+logging.getLogger('auth.py').setLevel(logging.INFO)
+logging.getLogger('connectionpool.py').setLevel(logging.INFO)
+logging.getLogger('session.py').setLevel(logging.DEBUG)
+logging.getLogger('sagemaker').setLevel(logging.DEBUG)
+
+
+@pytest.fixture(autouse=True)
+def skip_if_no_accelerator(accelerator_type):
+    if accelerator_type is None:
+        pytest.skip('Skipping because accelerator type was not provided')
+
+
+@pytest.fixture(autouse=True)
+def skip_if_non_supported_ei_region(region):
+    if region not in EI_SUPPORTED_REGIONS:
+        pytest.skip('EI is not supported in {}'.format(region))
+
+
+@pytest.fixture
+def pretrained_model_data(region):
+    return 's3://sagemaker-sample-data-{}/tensorflow/model/resnet/resnet_50_v2_fp32_NCHW.tar.gz'.format(region)
+
+
+# based on https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_using_elastic_inference_with_your_own_model/tensorflow_pretrained_model_elastic_inference.ipynb
+@pytest.mark.skip_if_non_supported_ei_region
+@pytest.mark.skip_if_no_accelerator
+def test_deploy_elastic_inference_with_pretrained_model(pretrained_model_data, docker_image_uri, sagemaker_session, instance_type, accelerator_type):
+    resource_path = os.path.join(SCRIPT_PATH, '../resources')
+    endpoint_name = 'test-tf-ei-deploy-model-{}'.format(sagemaker_timestamp())
+
+    with timeout_and_delete_endpoint_by_name(endpoint_name=endpoint_name, sagemaker_session=sagemaker_session,
+                                             minutes=20):
+        tensorflow_model = TensorFlowModel(model_data=pretrained_model_data,
+                                           entry_point='default_entry_point.py',
+                                           source_dir=resource_path,
+                                           role='SageMakerRole',
+                                           image=docker_image_uri,
+                                           sagemaker_session=sagemaker_session)
+
+        logger.info('deploying model to endpoint: {}'.format(endpoint_name))
+        predictor = tensorflow_model.deploy(initial_instance_count=1,
+                                            instance_type=instance_type,
+                                            accelerator_type=accelerator_type,
+                                            endpoint_name=endpoint_name)
+
+        random_input = np.random.rand(1, 1, 3, 3)
+
+        predict_response = predictor.predict({'input': random_input.tolist()})
+        assert predict_response['outputs']['probabilities']
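
The gating and input preparation in this test can be exercised without AWS access. The following is a sketch under stated assumptions: `skip_reason` and `random_payload` are hypothetical names that restate the two autouse fixtures and the `np.random.rand(1, 1, 3, 3).tolist()` payload using only the standard library, and the region list is copied from `test/functional/__init__.py` in this commit.

```python
import random

# Copied from test/functional/__init__.py in this commit.
EI_SUPPORTED_REGIONS = ['us-east-1', 'us-east-2', 'us-west-2',
                        'eu-west-1', 'ap-northeast-1', 'ap-northeast-2']


def skip_reason(region, accelerator_type):
    """Return the skip message the fixtures would raise, or None to run the test."""
    if accelerator_type is None:
        return 'Skipping because accelerator type was not provided'
    if region not in EI_SUPPORTED_REGIONS:
        return 'EI is not supported in {}'.format(region)
    return None


def pretrained_model_data(region):
    """Build the region-specific S3 URI used by the test."""
    return ('s3://sagemaker-sample-data-{}/tensorflow/model/resnet/'
            'resnet_50_v2_fp32_NCHW.tar.gz').format(region)


def random_payload(shape=(1, 1, 3, 3)):
    """Build nested lists of floats in [0, 1) with the given shape."""
    def build(dims):
        if not dims:
            return random.random()
        return [build(dims[1:]) for _ in range(dims[0])]
    return {'input': build(list(shape))}


print(skip_reason('us-west-2', None))
print(skip_reason('eu-central-1', 'ml.eia1.medium'))
print(pretrained_model_data('us-west-2'))
```

Keeping the decision in a plain function makes it easy to check which combinations of region and accelerator type would reach the deploy step before spending time on an endpoint.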

test/integ/conftest.py

Lines changed: 7 additions & 7 deletions
@@ -1,14 +1,14 @@
 # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
-#
+#
 # Licensed under the Apache License, Version 2.0 (the "License").
 # You may not use this file except in compliance with the License.
 # A copy of the License is located at
-#
+#
 # http://www.apache.org/licenses/LICENSE-2.0
-#
-# or in the "license" file accompanying this file. This file is distributed
-# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
-# express or implied. See the License for the specific language governing
+#
+# or in the "license" file accompanying this file. This file is distributed
+# on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
+# express or implied. See the License for the specific language governing
 # permissions and limitations under the License.

 import logging
@@ -37,7 +37,7 @@ def pytest_addoption(parser):
     parser.addoption('--tag', required=True)
     parser.addoption('--region', default='us-west-2')
     parser.addoption('--framework-version', required=True)
-    parser.addoption('--processor', required=True, choices=['gpu','cpu'])
+    parser.addoption('--processor', required=True, choices=['gpu', 'cpu'])


 @pytest.fixture(scope='session')

test/resources/default_entry_point.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+# use default SageMaker defined ``input_fn``, ``predict_fn``, and ``output_fn``
