Merge pull request aws#420 from apacker/master

djarpin · web-flow · commit 330ca29df6d4 · 2018-09-26T21:23:52.000-07:00
Update KMS encryption notebook to include volume encryption
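
The change boils down to passing the full key ARN wherever SageMaker encrypts a volume, and deriving the bare key id only for the S3 upload. Below is a minimal sketch of that pattern, not the notebook's actual code. The helper names (`key_id_from_arn`, `training_resource_config`, `transform_resources`) and the example ARN are hypothetical. The dictionary keys mirror the request parameters touched in this diff.

```python
# Hypothetical helpers illustrating the pattern in this commit; they only
# build request fragments and make no AWS calls.

def key_id_from_arn(kms_key_arn):
    """Return the key-id part of arn:aws:kms:region:acct-id:key/key-id."""
    marker = ":key/"
    if marker not in kms_key_arn:
        raise ValueError("not a KMS key ARN: {!r}".format(kms_key_arn))
    return kms_key_arn.split(marker, 1)[1]


def training_resource_config(kms_key_arn, instance_type="ml.m4.4xlarge", volume_gb=5):
    """ResourceConfig fragment for a training job with an encrypted volume."""
    return {
        "InstanceCount": 1,
        "InstanceType": instance_type,
        "VolumeSizeInGB": volume_gb,
        "VolumeKmsKeyId": kms_key_arn,
    }


def transform_resources(kms_key_arn, instance_type="ml.c4.xlarge"):
    """TransformResources fragment for a batch transform job."""
    return {
        "InstanceCount": 1,
        "InstanceType": instance_type,
        "VolumeKmsKeyId": kms_key_arn,
    }


# Placeholder ARN for illustration only; substitute your own key.
example_arn = "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
example_key_id = key_id_from_arn(example_arn)
```

Keeping the ARN as the single configured value and deriving the key id from it avoids maintaining two settings that can drift apart.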
diff --git a/advanced_functionality/handling_kms_encrypted_data/handling_kms_encrypted_data.ipynb b/advanced_functionality/handling_kms_encrypted_data/handling_kms_encrypted_data.ipynb
@@ -5,9 +5,7 @@
    "metadata": {},
    "source": [
     "# SageMaker and AWS KMS–Managed Keys\n",
-    "_**Handling KMS encrypted data with SageMaker model training and encrypting the generated model artifacts**_\n",
-    "\n",
-    "---\n",
+    "_**End-to-end encryption using SageMaker and KMS-managed keys**_\n",
     "\n",
     "---\n",
     "\n",
@@ -19,12 +17,13 @@
     "1. [Training the XGBoost model](#Training-the-XGBoost-model)\n",
     "1. [Set up hosting for the model](#Set-up-hosting-for-the-model)\n",
     "1. [Validate the model for use](#Validate-the-model-for-use)\n",
+    "1. [Run batch prediction using batch transform](#Run-batch-prediction-using-batch-transform)\n",
     "\n",
     "---\n",
     "## Background\n",
     "\n",
     "AWS Key Management Service ([AWS KMS](http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingKMSEncryption.html)) enables \n",
-    "Server-side encryption to protect your data at rest. Amazon SageMaker training works with KMS encrypted data if the IAM role used for S3 access has permissions to encrypt and decrypt data with the KMS key. Further, a KMS key can also be used to encrypt the model artifacts at rest using Amazon S3 server-side encryption. In this notebook, we demonstrate SageMaker training with KMS encrypted data. \n",
+    "server-side encryption to protect your data at rest. Amazon SageMaker training works with KMS-encrypted data if the IAM role used for S3 access has permissions to encrypt and decrypt data with the KMS key. A KMS key can also be used to encrypt the model artifacts at rest using Amazon S3 server-side encryption, and to encrypt the storage volume attached to training, endpoint, and transform instances. In this notebook, we demonstrate SageMaker encryption capabilities using KMS-managed keys. \n",
     "\n",
     "---\n",
     "\n",
@@ -36,13 +35,15 @@
     "\n",
     "1. Have an existing KMS key from AWS IAM console or create one ([learn more](http://docs.aws.amazon.com/kms/latest/developerguide/create-keys.html)).\n",
     "2. Allow the IAM role used for SageMaker to encrypt and decrypt data with this key from within applications and when using AWS services integrated with KMS ([learn more](http://docs.aws.amazon.com/console/kms/key-users)).\n",
+    "3. Allow the IAM role for this notebook to create grants with this key ([learn more](https://docs.aws.amazon.com/sagemaker/latest/dg/api-permissions-reference.html)).\n",
     "\n",
     "We use the `key-id` from the KMS key ARN `arn:aws:kms:region:acct-id:key/key-id`.\n",
     "\n",
     "### General Setup\n",
     "Let's start by specifying:\n",
     "* AWS region.\n",
     "* The IAM role arn used to give learning and hosting access to your data. See the documentation for how to specify these.\n",
+    "* The KMS key arn that you want to use for encryption.\n",
     "* The S3 bucket that you want to use for training and model data."
    ]
   },
@@ -68,7 +69,7 @@
     "\n",
     "role = get_execution_role()\n",
     "\n",
-    "kms_key_id = '<your-kms-key-id>'\n",
+    "kms_key_arn = '<your-kms-key-arn>'\n",
     "\n",
     "bucket='<s3-bucket>' # put your s3 bucket name here, and create s3 bucket\n",
     "prefix = 'sagemaker/DEMO-kms'\n",
@@ -174,7 +175,7 @@
     "\n",
     "data_train = open(train_file, 'rb')\n",
     "key_train = '{}/train/{}'.format(prefix,train_file)\n",
-    "\n",
+    "kms_key_id = kms_key_arn.split(':key/')[1]\n",
     "\n",
     "print(\"Put object...\")\n",
     "s3.put_object(Bucket=bucket,\n",
@@ -215,7 +216,7 @@
    "source": [
     "## Training the SageMaker XGBoost model\n",
     "\n",
-    "Now that we have our data in S3, we can begin training. We'll use Amazon SageMaker XGboost algorithm as an example to demonstrate model training. Note that nothing needs to be changed in the way you'd call the training algorithm. The only requirement for training to succeed is that the IAM role (`role`) used for S3 access has permissions to encrypt and decrypt data with the KMS key (`kms_key_id`). You can set these permissions using the instructions [here](http://docs.aws.amazon.com/kms/latest/developerguide/key-policies.html#key-policy-default-allow-users). If the permissions aren't set, you'll get the `Data download failed` error."
+    "Now that we have our data in S3, we can begin training. We'll use the Amazon SageMaker XGBoost algorithm as an example to demonstrate model training. Note that nothing needs to be changed in the way you'd call the training algorithm. The only requirement for training to succeed is that the IAM role (`role`) used for S3 access has permissions to encrypt and decrypt data with the KMS key (`kms_key_arn`). You can set these permissions using the instructions [here](http://docs.aws.amazon.com/kms/latest/developerguide/key-policies.html#key-policy-default-allow-users). If the permissions aren't set, you'll get the `Data download failed` error. Specify a `VolumeKmsKeyId` in the training job parameters to have the volume attached to the ML compute instance encrypted using the key provided."
    ]
   },
   {
@@ -254,7 +255,8 @@
     "    \"ResourceConfig\": {\n",
     "        \"InstanceCount\": 1,\n",
     "        \"InstanceType\": \"ml.m4.4xlarge\",\n",
-    "        \"VolumeSizeInGB\": 5\n",
+    "        \"VolumeSizeInGB\": 5,\n",
+    "        \"VolumeKmsKeyId\": kms_key_arn\n",
     "    },\n",
     "    \"TrainingJobName\": job_name,\n",
     "    \"HyperParameters\": {\n",
@@ -362,7 +364,7 @@
    "source": [
     "### Create endpoint configuration\n",
     "\n",
-    "SageMaker supports configuring REST endpoints in hosting with multiple models, e.g. for A/B testing purposes. In order to support this, customers create an endpoint configuration, that describes the distribution of traffic across the models, whether split, shadowed, or sampled in some way. In addition, the endpoint configuration describes the instance type required for model deployment."
+    "SageMaker supports configuring REST endpoints in hosting with multiple models, e.g. for A/B testing purposes. To support this, customers create an endpoint configuration that describes the distribution of traffic across the models, whether split, shadowed, or sampled in some way. In addition, the endpoint configuration describes the instance type required for model deployment and the key used to encrypt the volume attached to the endpoint instance."
    ]
   },
   {
@@ -377,6 +379,7 @@
     "print(endpoint_config_name)\n",
     "create_endpoint_config_response = client.create_endpoint_config(\n",
     "    EndpointConfigName = endpoint_config_name,\n",
+    "    KmsKeyId = kms_key_arn,\n",
     "    ProductionVariants=[{\n",
     "        'InstanceType':'ml.m4.xlarge',\n",
     "        'InitialVariantWeight':1,\n",
@@ -499,7 +502,7 @@
    "metadata": {},
    "source": [
     "## Run batch prediction using batch transform\n",
-    "Create a transform job to do batch prediction using the trained model. Similar to the training section above, the execution role assumed by this notebook must have permissions to encrypt and decrypt data with the KMS key (`kms_key_id`) used for S3 server-side encryption."
+    "Create a transform job to do batch prediction using the trained model. As in the training section above, the execution role assumed by this notebook must have permissions to encrypt and decrypt data with the KMS key (`kms_key_arn`) used for S3 server-side encryption. Also as in training, specify a `VolumeKmsKeyId` so that the volume attached to the transform instance is encrypted using the key provided."
    ]
   },
   {
@@ -532,7 +535,8 @@
     "    },\n",
     "    \"TransformResources\": {\n",
     "        \"InstanceCount\": 1,\n",
-    "        \"InstanceType\": \"ml.c4.xlarge\"\n",
+    "        \"InstanceType\": \"ml.c4.xlarge\",\n",
+    "        \"VolumeKmsKeyId\": kms_key_arn\n",
     "    }\n",
     "}\n",
     "\n",
@@ -547,7 +551,8 @@
     "        print(\"Transform job completed!\")\n",
     "        break\n",
     "    else:\n",
-    "        print(\"Unexpected transform job status: \" + status)"
+    "        print(\"Unexpected transform job status: \" + status)\n",
+    "        break"
    ]
   },
   {