Commit 2206a21
Author: Dan Choi
Add disclaimer and local gpu
1 parent 960f0ef, commit 2206a21

File tree

1 file changed: +26 additions, -10 deletions


advanced_functionality/pytorch_extending_our_containers/pytorch_extending_our_containers.ipynb

@@ -46,14 +46,13 @@
 "\n",
 "Even if there is direct SDK support for your environment or framework, you may want to add additional functionality or configure your container environment differently while utilizing our container to use on SageMaker.\n",
 "\n",
-"All of our deep learning framework containers are open source on GitHub. Based on your use case it you can easily modify the container code and build a modified version of our containers.\n",
+"**Some of the reasons to extend a SageMaker deep learning framework container are:**\n",
+"1. Install additional dependencies, while utilizing the same training and hosting solution.\n",
+"2. Configure your environment.\n",
 "\n",
-"* [SageMaker TensorFlow Container](https://github.com/aws/sagemaker-tensorflow-container)\n",
-"* [SageMaker MXNet Container](https://github.com/aws/sagemaker-mxnet-container)\n",
-"* [SageMaker PyTorch Container](https://github.com/aws/sagemaker-pytorch-container) \n",
-"* [SageMaker Chainer Container](https://github.com/aws/sagemaker-chainer-container)\n",
+"**Although it is possible to extend any of our framework containers as a parent image, the example this notebook covers is currently only intended to work with our PyTorch (0.4.0+) and Chainer (4.1.0+) containers.**\n",
 "\n",
-"This walkthrough shows that it is quite straightforward to extend one of our containers to build your own custom container.\n",
+"This walkthrough shows that it is quite straightforward to extend one of our containers to build your own custom container for PyTorch or Chainer.\n",
 "\n",
 "## Permissions\n",
 "\n",
@@ -84,7 +83,7 @@
 "\n",
 "If you're familiar with Docker already, you can skip ahead to the next section.\n",
 "\n",
-"For many data scientists, Docker containers are a new technology. But they are not difficult and can significantly simply the deployment of your software packages. \n",
+"For many data scientists, Docker containers are a new technology. But they are not difficult and can significantly simplify the deployment of your software packages. \n",
 "\n",
 "Docker provides a simple way to package arbitrary code into an _image_ that is totally self-contained. Once you have an image, you can use Docker to run a _container_ based on that image. Running a container is just like running a program on the machine except that the container creates a fully self-contained environment for the program to run. Containers are isolated from each other and from the host environment, so the way your program is set up is the way it runs, no matter where you run it.\n",
 "\n",
@@ -250,6 +249,7 @@
 "\r\n",
 "# For more information on creating a Dockerfile\r\n",
 "# https://docs.docker.com/compose/gettingstarted/#step-2-create-a-dockerfile\r\n",
+"# https://github.com/awslabs/amazon-sagemaker-examples/master/advanced_functionality/pytorch_extending_our_containers/pytorch_extending_our_containers.ipynb\r\n",
 "# SageMaker PyTorch image\r\n",
 "FROM 520713654638.dkr.ecr.us-west-2.amazonaws.com/sagemaker-pytorch:0.4.0-cpu-py3\r\n",
 "\r\n",
@@ -375,7 +375,7 @@
 "To represent our training, we use the Estimator class, which needs to be configured in five steps. \n",
 "1. IAM role - our AWS execution role\n",
 "2. train_instance_count - number of instances to use for training.\n",
-"3. train_instance_type - type of instance to use for training. For training locally, we specify `local`.\n",
+"3. train_instance_type - type of instance to use for training. For training locally, we specify `local` or `local_gpu`.\n",
 "4. image_name - our custom PyTorch Docker image we created.\n",
 "5. hyperparameters - hyperparameters we want to pass.\n",
 "\n",
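The five Estimator settings above can be sketched outside the notebook as a plain mapping. This is a minimal illustration, not the committed code: the role ARN and image name are placeholder values, and the commented-out `Estimator` call assumes the SageMaker Python SDK is installed.

```python
# Hypothetical values mirroring the five configuration steps above.
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder role ARN

estimator_kwargs = {
    "role": role,                        # 1. IAM execution role
    "train_instance_count": 1,           # 2. number of training instances
    "train_instance_type": "local",      # 3. 'local' or 'local_gpu' for local mode
    "image_name": "my-custom-pytorch",   # 4. custom Docker image (placeholder name)
    "hyperparameters": {"epochs": 1},    # 5. hyperparameters passed to training
}

# With the SageMaker SDK available, these settings would be applied as:
# from sagemaker.estimator import Estimator
# estimator = Estimator(**estimator_kwargs)
print("train_instance_type =", estimator_kwargs["train_instance_type"])
```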
@@ -420,6 +420,24 @@
 "!/bin/bash ./utils/setup.sh"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"import os\n",
+"import subprocess\n",
+"\n",
+"instance_type = 'local'\n",
+"\n",
+"if subprocess.call('nvidia-smi') == 0:\n",
+"    ## Set type to GPU if one is present\n",
+"    instance_type = 'local_gpu'\n",
+"    \n",
+"print(\"Instance type = \" + instance_type)"
+]
+},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -430,8 +448,6 @@
 "\n",
 "hyperparameters = {'epochs': 1}\n",
 "\n",
-"instance_type = 'local'\n",
-"\n",
 "estimator = Estimator(role=role,\n",
 "                      train_instance_count=1,\n",
 "                      train_instance_type=instance_type,\n",

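The detection cell added by this commit calls `subprocess.call('nvidia-smi')` directly, which raises `FileNotFoundError` on machines where the binary is not installed at all. A hardened sketch of the same idea (an assumption about intent, not part of the commit) first checks that `nvidia-smi` exists on the PATH:

```python
import shutil
import subprocess

def detect_instance_type():
    """Return 'local_gpu' if nvidia-smi is present and exits cleanly, else 'local'."""
    # shutil.which avoids the FileNotFoundError a bare subprocess.call
    # would raise when nvidia-smi is absent.
    if shutil.which("nvidia-smi") is None:
        return "local"
    try:
        ok = subprocess.call(
            ["nvidia-smi"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        ) == 0
    except OSError:
        ok = False
    return "local_gpu" if ok else "local"

print("Instance type =", detect_instance_type())
```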