===================================
SageMaker PyTorch Serving Container
===================================

SageMaker PyTorch Serving Container is an open source library for making the
PyTorch framework run on Amazon SageMaker.

This repository also contains Dockerfiles which install this library, PyTorch, and dependencies
for building SageMaker PyTorch images.

The SageMaker team uses this repository to build its official PyTorch image. To use this image on SageMaker,
see the `SageMaker Python SDK <https://github.com/aws/sagemaker-python-sdk>`__.
This repository is typically of interest to end users who need implementation details for
the official image, or who want to use it to build their own customized PyTorch image.

For information on running PyTorch jobs on SageMaker, see `SageMaker PyTorch Estimators and Models
<https://github.com/aws/sagemaker-python-sdk/tree/master/src/sagemaker/pytorch>`__.

For notebook examples, see `SageMaker Notebook
Examples <https://github.com/awslabs/amazon-sagemaker-examples>`__.

Table of Contents
-----------------

#. `Getting Started <#getting-started>`__
#. `Building Your Image <#building-your-image>`__
#. `Running the Tests <#running-the-tests>`__

Getting Started
---------------

Prerequisites
~~~~~~~~~~~~~

Make sure you have installed all of the following prerequisites on your
development machine:

- `Docker <https://www.docker.com/>`__

For Testing on GPU
^^^^^^^^^^^^^^^^^^

- `Nvidia-Docker <https://github.com/NVIDIA/nvidia-docker>`__

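A quick way to confirm that the NVIDIA container runtime is wired up correctly is to run
``nvidia-smi`` inside a container. This is a minimal sketch; it assumes nvidia-docker 2
(the ``--runtime=nvidia`` flag) and that a CUDA base image such as ``nvidia/cuda`` can be pulled.

::

    # Should print the GPU table from the host driver if the NVIDIA runtime is set up.
    docker run --rm --runtime=nvidia nvidia/cuda nvidia-smi
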
Recommended
^^^^^^^^^^^

- A Python environment management tool (e.g.
  `PyEnv <https://github.com/pyenv/pyenv>`__,
  `VirtualEnv <https://virtualenv.pypa.io/en/stable/>`__)

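If you use VirtualEnv, an isolated environment for building and testing the container might look
like the following sketch (the environment name ``.venv`` is only an example):

::

    # Create and activate an isolated Python environment for this repository.
    pip install virtualenv
    virtualenv .venv
    . .venv/bin/activate
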
Building Your Image
-------------------

`Amazon SageMaker <https://aws.amazon.com/documentation/sagemaker/>`__
uses Docker containers to run all training jobs and inference endpoints.

The Docker images are built from the Dockerfiles specified in
`docker/ <https://github.com/aws/sagemaker-pytorch-container/tree/master/docker>`__.

The Dockerfiles are grouped by PyTorch version and further separated by Python version and
processor type.

Each Docker image used to run training and inference jobs is built from a corresponding
pair of "base" and "final" Dockerfiles.

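Based on the paths referenced in the build commands below, the layout of the ``docker/``
directory looks roughly like this (shown for PyTorch 1.0.0 only):

::

    docker/
    └── 1.0.0/
        ├── base/
        │   ├── Dockerfile.cpu
        │   └── Dockerfile.gpu
        └── final/
            ├── Dockerfile.cpu
            └── Dockerfile.gpu
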
Base Images
~~~~~~~~~~~

The "base" Dockerfiles cover the installation of the framework and all of the dependencies
it needs.

The tagging scheme is <PyTorch_version>-<processor>-py<python_version> (e.g. 1.0.0-cpu-py3).

All "final" Dockerfiles build images using base images that follow the tagging scheme
above.

If you want to build your base Docker image, then use:

::

    # All build instructions assume you're building from the root directory of the sagemaker-pytorch-container.

    # CPU
    docker build -t pytorch-base:<PyTorch_version>-cpu-py<python_version> -f docker/<PyTorch_version>/base/Dockerfile.cpu --build-arg py_version=<python_version> .

    # GPU
    docker build -t pytorch-base:<PyTorch_version>-gpu-py<python_version> -f docker/<PyTorch_version>/base/Dockerfile.gpu --build-arg py_version=<python_version> .

::

    # Example

    # CPU
    docker build -t pytorch-base:1.0.0-cpu-py3 -f docker/1.0.0/base/Dockerfile.cpu --build-arg py_version=3 .

    # GPU
    docker build -t pytorch-base:1.0.0-gpu-py3 -f docker/1.0.0/base/Dockerfile.gpu --build-arg py_version=3 .

Final Images
~~~~~~~~~~~~

The "final" Dockerfiles cover the installation of the SageMaker-specific support code.

All "final" Dockerfiles use `base images for building <https://github.com/aws/sagemaker-pytorch-container/blob/master/docker/1.0.0/final/Dockerfile.cpu#L2>`__.

These "base" images are specified with the naming convention
pytorch-base:<PyTorch_version>-<processor>-py<python_version>.

Before building "final" images:

Build your "base" image. Make sure it is named and tagged in accordance with your "final"
Dockerfile.

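For example, if your "final" Dockerfile starts from ``pytorch-base:1.0.0-cpu-py3``, you can
check that this tag exists locally and retag a differently named image if necessary (a sketch
reusing the example tag from above):

::

    # List local base images and confirm the expected tag is present.
    docker images pytorch-base

    # Retag an existing image if its name or tag does not match your "final" Dockerfile.
    docker tag <existing_image>:<existing_tag> pytorch-base:1.0.0-cpu-py3
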

::

    # Create the SageMaker PyTorch Serving Container Python package.
    cd sagemaker-pytorch-container
    python setup.py bdist_wheel

If you want to build "final" Docker images, then use:

::

    # All build instructions assume you're building from the root directory of the sagemaker-pytorch-container.

    # CPU
    docker build -t <image_name>:<tag> -f docker/<PyTorch_version>/final/Dockerfile.cpu --build-arg py_version=<python_version> .

    # GPU
    docker build -t <image_name>:<tag> -f docker/<PyTorch_version>/final/Dockerfile.gpu --build-arg py_version=<python_version> .

::

    # Example

    # CPU
    docker build -t preprod-pytorch:1.0.0-cpu-py3 -f docker/1.0.0/final/Dockerfile.cpu --build-arg py_version=3 .

    # GPU
    docker build -t preprod-pytorch:1.0.0-gpu-py3 -f docker/1.0.0/final/Dockerfile.gpu --build-arg py_version=3 .

Running the Tests
-----------------

Running the tests requires installation of the SageMaker PyTorch Serving Container code and its test
dependencies.

::

    git clone https://github.com/aws/sagemaker-pytorch-container.git
    cd sagemaker-pytorch-container
    pip install -e .[test]

Tests are defined in
`test/ <https://github.com/aws/sagemaker-pytorch-container/tree/master/test>`__
and include unit, local integration, and SageMaker integration tests.

Unit Tests
~~~~~~~~~~

If you want to run unit tests, then use:

::

    # All test instructions should be run from the top level directory

    pytest test/unit

    # or you can use tox to run unit tests as well as flake8 and code coverage

    tox

Local Integration Tests
~~~~~~~~~~~~~~~~~~~~~~~

Running local integration tests requires `Docker <https://www.docker.com/>`__ and `AWS
credentials <https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/setup-credentials.html>`__,
because the local integration tests make calls to a couple of AWS services. The local integration tests and
SageMaker integration tests require configurations specified within their respective
`conftest.py <https://github.com/aws/sagemaker-pytorch-container/blob/master/test/conftest.py>`__.

Local integration tests on GPU require `Nvidia-Docker <https://github.com/NVIDIA/nvidia-docker>`__.

Before running local integration tests:

#. Build your Docker image.
#. Pass in the correct pytest arguments to run tests against your Docker image.

If you want to run local integration tests, then use:

::

    # Required arguments for integration tests are found in test/conftest.py

    pytest test/integration/local --docker-base-name <your_docker_image> \
                                  --tag <your_docker_image_tag> \
                                  --py-version <2_or_3> \
                                  --framework-version <PyTorch_version> \
                                  --processor <cpu_or_gpu>

::

    # Example
    pytest test/integration/local --docker-base-name preprod-pytorch \
                                  --tag 1.0 \
                                  --py-version 3 \
                                  --framework-version 1.0.0 \
                                  --processor cpu

SageMaker Integration Tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~

SageMaker integration tests require your Docker image to be available in an
`Amazon ECR repository <https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ECS_Console_Repositories.html>`__.

The Docker base name is your `ECR repository
namespace <https://docs.aws.amazon.com/AmazonECR/latest/userguide/Repositories.html>`__.

The instance type is the `Amazon SageMaker instance type
<https://aws.amazon.com/sagemaker/pricing/instance-types/>`__ that the SageMaker integration tests will run on.

Before running SageMaker integration tests:

#. Build your Docker image.
#. Push the image to your ECR repository (see the sketch below).
#. Pass in the correct pytest arguments to run tests on SageMaker against the image within your ECR repository.

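A minimal sketch of pushing a locally built image to ECR, assuming AWS CLI v2 and using
placeholder account, region, and repository values (adjust them to match your setup):

::

    # Authenticate Docker with your ECR registry.
    aws ecr get-login-password --region us-west-2 | \
        docker login --username AWS --password-stdin <your_aws_id>.dkr.ecr.us-west-2.amazonaws.com

    # Create the repository once, then tag and push the image built earlier.
    aws ecr create-repository --repository-name preprod-pytorch --region us-west-2
    docker tag preprod-pytorch:1.0.0-cpu-py3 <your_aws_id>.dkr.ecr.us-west-2.amazonaws.com/preprod-pytorch:1.0.0-cpu-py3
    docker push <your_aws_id>.dkr.ecr.us-west-2.amazonaws.com/preprod-pytorch:1.0.0-cpu-py3
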
If you want to run an end-to-end SageMaker integration test on `Amazon
SageMaker <https://aws.amazon.com/sagemaker/>`__, then use:

::

    # Required arguments for integration tests are found in test/conftest.py

    pytest test/integration/sagemaker --aws-id <your_aws_id> \
                                      --docker-base-name <your_docker_image> \
                                      --instance-type <amazon_sagemaker_instance_type> \
                                      --tag <your_docker_image_tag>

::

    # Example
    pytest test/integration/sagemaker --aws-id 12345678910 \
                                      --docker-base-name preprod-pytorch \
                                      --instance-type ml.m4.xlarge \
                                      --tag 1.0

Contributing
------------

Please read
`CONTRIBUTING.md <https://github.com/aws/sagemaker-pytorch-container/blob/master/CONTRIBUTING.md>`__
for details on our code of conduct, and the process for submitting pull
requests to us.

License
-------

SageMaker PyTorch Serving Container is licensed under the Apache 2.0 License. It is copyright 2018
Amazon.com, Inc. or its affiliates. All Rights Reserved. The license is available at:
http://aws.amazon.com/apache2.0/