Commit ddf7475

mvsusp authored and nikhil-sk committed
Split serving container (#1)
* split serving container [~ 20 commits]
1 parent 562d21a commit ddf7475

44 files changed: 2371 additions and 7 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
.idea/

CHANGELOG.md

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
1+
# Changelog
2+
3+
## v1.0.5 (2019-08-05)
4+
5+
### Bug fixes and other changes
6+
7+
* upgrade sagemaker-container version
8+
* unmark 2 deploy tests
9+
* update p2 restricted regions
10+
11+
## v1.0.6 (2019-06-21)
12+
13+
### Bug fixes and other changes
14+
15+
* unmark 2 deploy tests
16+
17+
## v1.0.5 (2019-06-20)
18+
19+
### Bug fixes and other changes
20+
21+
* update p2 restricted regions
22+
23+
## v1.0.4 (2019-06-19)
24+
25+
### Bug fixes and other changes
26+
27+
* skip tests in gpu instance restricted regions
28+
29+
## v1.0.3 (2019-06-18)
30+
31+
### Bug fixes and other changes
32+
33+
* modify buildspecs and tox files
34+
35+
## v1.0.2 (2019-06-17)
36+
37+
### Bug fixes and other changes
38+
39+
* freeze dependency versions
40+
41+
## v1.0.1 (2019-06-13)
42+
43+
### Bug fixes and other changes
44+
45+
* add buildspec-release file and upgrade cuda version
46+
* upgrade PyTorch to 1.1
47+
* disable test_mnist_gpu for py2 for now
48+
* fix broken line of buildspec
49+
* prevent hidden errors in buildspec
50+
* Add AWS CodeBuild buildspec for pull request
51+
* Bump minimum SageMaker Containers version to 2.4.6 and pin SageMaker Python SDK to 1.18.16
52+
* fix broken link in README
53+
* Add timeout to test_mnist_gpu test
54+
* Use dummy role in tests and update local failure integ test
55+
* Use the SageMaker Python SDK for local serving integ tests
56+
* Use the SageMaker Python SDK for local integ distributed training tests
57+
* Use the SageMaker Python SDK for local integ single-machine training tests
58+
* Pin fastai version to 1.0.39 in CPU dockerfile
59+
* Use the SageMaker Python SDK for SageMaker integration tests
60+
* Add missing rendering dependencies for opencv and a simple test.
61+
* Add opencv support.
62+
* Freeze PyYAML version to avoid conflict with Docker Compose
63+
* Unfreeze numpy version.
64+
* Freeze TorchVision to 0.2.1
65+
* Specify region when creating S3 resource in integ tests
66+
* Read framework version from Python SDK for integ test default
67+
* Fix unicode display problem in py2 container
68+
* freeze pip <=18.1, fastai == 1.0.39, numpy <= 1.15.4
69+
* Add support for fastai (https://github.com/fastai/fastai) library.
70+
* Remove "requsests" from tests dependencies to avoid regular conflicts with "requests" package from "sagemaker" dependencies.
71+
* Add support for PyTorch-1.0.

README.md

Lines changed: 0 additions & 7 deletions
This file was deleted.

README.rst

Lines changed: 266 additions & 0 deletions
@@ -0,0 +1,266 @@
1+
2+
===================================
3+
SageMaker PyTorch Serving Container
4+
===================================
5+
6+
SageMaker PyTorch Serving Container is an open source library for making the
7+
PyTorch framework run on Amazon SageMaker.
8+
9+
This repository also contains Dockerfiles which install this library, PyTorch, and dependencies
10+
for building SageMaker PyTorch images.
11+
12+
The SageMaker team uses this repository to build its official PyTorch image. To use this image on SageMaker,
13+
see `Python SDK <https://github.com/aws/sagemaker-python-sdk>`__.
14+
For end users, this repository is typically of interest if you need implementation details for
15+
the official image, or if you want to use it to build your own customized PyTorch image.
16+
17+
For information on running PyTorch jobs on SageMaker: `SageMaker PyTorch Estimators and Models
18+
<https://github.com/aws/sagemaker-python-sdk/tree/master/src/sagemaker/pytorch>`__.
19+
20+
For notebook examples: `SageMaker Notebook
21+
Examples <https://github.com/awslabs/amazon-sagemaker-examples>`__.
22+
23+
Table of Contents
24+
-----------------
25+
26+
#. `Getting Started <#getting-started>`__
27+
#. `Building your Image <#building-your-image>`__
28+
#. `Running the tests <#running-the-tests>`__
29+
30+
Getting Started
31+
---------------
32+
33+
Prerequisites
34+
~~~~~~~~~~~~~
35+
36+
Make sure you have installed all of the following prerequisites on your
37+
development machine:
38+
39+
- `Docker <https://www.docker.com/>`__
40+
41+
For Testing on GPU
42+
^^^^^^^^^^^^^^^^^^
43+
44+
- `Nvidia-Docker <https://github.com/NVIDIA/nvidia-docker>`__
45+
46+
Recommended
47+
^^^^^^^^^^^
48+
49+
- A Python environment management tool (e.g.
50+
`PyEnv <https://github.com/pyenv/pyenv>`__,
51+
`VirtualEnv <https://virtualenv.pypa.io/en/stable/>`__)
52+
53+
Building your image
54+
-------------------
55+
56+
`Amazon SageMaker <https://aws.amazon.com/documentation/sagemaker/>`__
57+
utilizes Docker containers to run all training jobs & inference endpoints.
58+
59+
The Docker images are built from the Dockerfiles specified in
60+
`docker/ <https://github.com/aws/sagemaker-pytorch-container/tree/master/docker>`__.
61+
62+
The Docker files are grouped based on PyTorch version and separated
63+
based on Python version and processor type.
64+
65+
The Docker images, used to run training & inference jobs, are built from
66+
both corresponding "base" and "final" Dockerfiles.
67+
68+
Base Images
69+
~~~~~~~~~~~
70+
71+
The "base" Dockerfiles encompass the installation of the framework and all of the dependencies
72+
needed.
73+
74+
The tagging scheme is <PyTorch_version>-<processor>-py<python_version> (e.g. 1.0.0-cpu-py3).
75+
76+
All "final" Dockerfiles build images using base images that use the tagging scheme
77+
above.
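The tagging scheme can be sketched as a small shell helper. This is only an illustration: ``image_tag`` is a hypothetical function name and is not part of this repository. It joins the three fields in the order given above and rejects processor values other than ``cpu`` and ``gpu``.

```shell
# Hypothetical helper (not part of the repository): compose a tag of the form
# <PyTorch_version>-<processor>-py<python_version>.
image_tag() {
    pytorch_version="$1"
    processor="$2"
    py_version="$3"

    # Only the two processor types named in this README are accepted.
    case "$processor" in
        cpu|gpu) ;;
        *) echo "processor must be cpu or gpu, got: $processor" >&2; return 1 ;;
    esac

    printf '%s-%s-py%s\n' "$pytorch_version" "$processor" "$py_version"
}

# Print the tag for PyTorch 1.0.0 on CPU with Python 3.
image_tag 1.0.0 cpu 3
```

The printed value can be used after ``pytorch-base:`` in the ``docker build -t`` argument.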
78+
79+
If you want to build your base Docker image, then use:
80+
81+
::
82+
83+
# All build instructions assume you're building from the root directory of the sagemaker-pytorch-container.
84+
85+
# CPU
86+
docker build -t pytorch-base:<PyTorch_version>-cpu-py<python_version> -f docker/<PyTorch_version>/base/Dockerfile.cpu --build-arg py_version=<python_version> .
87+
88+
# GPU
89+
docker build -t pytorch-base:<PyTorch_version>-gpu-py<python_version> -f docker/<PyTorch_version>/base/Dockerfile.gpu --build-arg py_version=<python_version> .
90+
91+
::
92+
93+
# Example
94+
95+
# CPU
96+
docker build -t pytorch-base:1.0.0-cpu-py3 -f docker/1.0.0/base/Dockerfile.cpu --build-arg py_version=3 .
97+
98+
# GPU
99+
docker build -t pytorch-base:1.0.0-gpu-py3 -f docker/1.0.0/base/Dockerfile.gpu --build-arg py_version=3 .
100+
101+
Final Images
102+
~~~~~~~~~~~~
103+
104+
The "final" Dockerfiles encompass the installation of the SageMaker specific support code.
105+
106+
All "final" Dockerfiles use `base images for building <https://github.com/aws/sagemaker-pytorch-container/blob/master/docker/1.0.0/final/Dockerfile.cpu#L2>`__.
107+
108+
These "base" images are specified with the naming convention of
109+
pytorch-base:<PyTorch_version>-<processor>-py<python_version>.
110+
111+
Before building "final" images:
112+
113+
Build your "base" image. Make sure it is named and tagged in accordance with your "final"
114+
Dockerfile.
115+
116+
117+
::
118+
119+
# Create the SageMaker PyTorch Serving Container Python package.
120+
cd sagemaker-pytorch-container
121+
python setup.py bdist_wheel
122+
123+
If you want to build "final" Docker images, then use:
124+
125+
::
126+
127+
# All build instructions assume you're building from the root directory of the sagemaker-pytorch-container.
128+
129+
# CPU
130+
docker build -t <image_name>:<tag> -f docker/<PyTorch_version>/final/Dockerfile.cpu --build-arg py_version=<python_version> .
131+
132+
# GPU
133+
docker build -t <image_name>:<tag> -f docker/<PyTorch_version>/final/Dockerfile.gpu --build-arg py_version=<python_version> .
134+
135+
::
136+
137+
# Example
138+
139+
# CPU
140+
docker build -t preprod-pytorch:1.0.0-cpu-py3 -f docker/1.0.0/final/Dockerfile.cpu --build-arg py_version=3 .
141+
142+
# GPU
143+
docker build -t preprod-pytorch:1.0.0-gpu-py3 -f docker/1.0.0/final/Dockerfile.gpu --build-arg py_version=3 .
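The order of the steps above (base image, Python package, final image) can be summarized with a dry-run sketch. ``print_build_steps`` is a hypothetical helper, not something shipped in this repository. It only prints the commands described in this README and executes nothing, so it is safe to run without Docker installed.

```shell
# Hypothetical dry-run helper: prints, in order, the commands this README
# describes for producing a "final" image. Nothing is executed.
print_build_steps() {
    version="$1"
    processor="$2"
    py="$3"
    image="$4"
    tag="${version}-${processor}-py${py}"

    # 1. Base image, named and tagged the way the "final" Dockerfile expects.
    echo "docker build -t pytorch-base:${tag} -f docker/${version}/base/Dockerfile.${processor} --build-arg py_version=${py} ."
    # 2. Python package, built before the final image.
    echo "python setup.py bdist_wheel"
    # 3. Final image, built on top of the base image from step 1.
    echo "docker build -t ${image}:${tag} -f docker/${version}/final/Dockerfile.${processor} --build-arg py_version=${py} ."
}

print_build_steps 1.0.0 cpu 3 preprod-pytorch
```

Review the printed commands, then run them from the root directory of the repository.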
144+
145+
146+
Running the tests
147+
-----------------
148+
149+
Running the tests requires installation of the SageMaker PyTorch Serving Container code and its test
150+
dependencies.
151+
152+
::
153+
154+
git clone https://github.com/aws/sagemaker-pytorch-container.git
155+
cd sagemaker-pytorch-container
156+
pip install -e .[test]
157+
158+
Tests are defined in
159+
`test/ <https://github.com/aws/sagemaker-pytorch-container/tree/master/test>`__
160+
and include unit, local integration, and SageMaker integration tests.
161+
162+
Unit Tests
163+
~~~~~~~~~~
164+
165+
If you want to run unit tests, then use:
166+
167+
::
168+
169+
# All test instructions should be run from the top level directory
170+
171+
pytest test/unit
172+
173+
# or you can use tox to run unit tests as well as flake8 and code coverage
174+
175+
tox
176+
177+
178+
Local Integration Tests
179+
~~~~~~~~~~~~~~~~~~~~~~~
180+
181+
Running local integration tests requires `Docker <https://www.docker.com/>`__ and `AWS
182+
credentials <https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/setup-credentials.html>`__,
183+
as the local integration tests make calls to a couple of AWS services. The local integration tests and
184+
SageMaker integration tests require configurations specified within their respective
185+
`conftest.py <https://github.com/aws/sagemaker-pytorch-container/blob/master/test/conftest.py>`__.
186+
187+
Local integration tests on GPU require `Nvidia-Docker <https://github.com/NVIDIA/nvidia-docker>`__.
188+
189+
Before running local integration tests:
190+
191+
#. Build your Docker image.
192+
#. Pass in the correct pytest arguments to run tests against your Docker image.
193+
194+
If you want to run local integration tests, then use:
195+
196+
::
197+
198+
# Required arguments for integration tests are found in test/conftest.py
199+
200+
pytest test/integration/local --docker-base-name <your_docker_image> \
201+
--tag <your_docker_image_tag> \
202+
--py-version <2_or_3> \
203+
--framework-version <PyTorch_version> \
204+
--processor <cpu_or_gpu>
205+
206+
::
207+
208+
# Example
209+
pytest test/integration/local --docker-base-name preprod-pytorch \
210+
--tag 1.0 \
211+
--py-version 3 \
212+
--framework-version 1.0.0 \
213+
--processor cpu
214+
215+
SageMaker Integration Tests
216+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
217+
218+
SageMaker integration tests require your Docker image to be within an `Amazon ECR repository
219+
<https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ECS_Console_Repositories.html>`__.
220+
221+
The Docker base name is your `ECR repository namespace
222+
<https://docs.aws.amazon.com/AmazonECR/latest/userguide/Repositories.html>`__.
223+
224+
The instance type is your specified `Amazon SageMaker Instance Type
225+
<https://aws.amazon.com/sagemaker/pricing/instance-types/>`__ that the SageMaker integration test will run on.
226+
227+
Before running SageMaker integration tests:
228+
229+
#. Build your Docker image.
230+
#. Push the image to your ECR repository.
231+
#. Pass in the correct pytest arguments to run tests on SageMaker against the image within your ECR repository.
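For the push step, the image has to be tagged with its full ECR URI. The sketch below assumes the standard ECR form ``<aws_id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>``. ``ecr_image_uri`` is a hypothetical helper, and the account ID, region, and image names are placeholder values rather than anything defined by this repository. The ``docker`` commands are printed, not executed.

```shell
# Hypothetical helper: compose the ECR URI for an image.
# The region is an assumption; this README does not name one.
ecr_image_uri() {
    aws_id="$1"
    region="$2"
    repository="$3"
    tag="$4"
    printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$aws_id" "$region" "$repository" "$tag"
}

# Placeholder account ID, region, repository, and tag.
uri="$(ecr_image_uri 123456789012 us-west-2 preprod-pytorch 1.0)"

# Dry run: print the tag and push commands instead of running them.
echo "docker tag preprod-pytorch:1.0 ${uri}"
echo "docker push ${uri}"
```

Pushing normally requires authenticating Docker with the ECR registry first.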
232+
233+
If you want to run a SageMaker integration end-to-end test on `Amazon
234+
SageMaker <https://aws.amazon.com/sagemaker/>`__, then use:
235+
236+
::
237+
238+
# Required arguments for integration tests are found in test/conftest.py
239+
240+
pytest test/integration/sagemaker --aws-id <your_aws_id> \
241+
--docker-base-name <your_docker_image> \
242+
--instance-type <amazon_sagemaker_instance_type> \
243+
--tag <your_docker_image_tag>
244+
245+
::
246+
247+
# Example
248+
pytest test/integration/sagemaker --aws-id 123456789012 \
249+
--docker-base-name preprod-pytorch \
250+
--instance-type ml.m4.xlarge \
251+
--tag 1.0
252+
253+
Contributing
254+
------------
255+
256+
Please read
257+
`CONTRIBUTING.md <https://github.com/aws/sagemaker-pytorch-container/blob/master/CONTRIBUTING.md>`__
258+
for details on our code of conduct, and the process for submitting pull
259+
requests to us.
260+
261+
License
262+
-------
263+
264+
SageMaker PyTorch Serving Container is licensed under the Apache 2.0 License. It is copyright 2018
265+
Amazon.com, Inc. or its affiliates. All Rights Reserved. The license is available at:
266+
http://aws.amazon.com/apache2.0/

VERSION

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
1.0.6.dev0
