Amazon SageMaker Components for Kubeflow Pipelines
==================================================

This document outlines how to use Amazon SageMaker Components
for Kubeflow Pipelines (KFP). With these pipeline components, you can
create and monitor training, tuning, endpoint deployment, and batch
transform jobs in Amazon SageMaker. By running Kubeflow Pipeline jobs on
Amazon SageMaker, you move data processing and training jobs from the
Kubernetes cluster to Amazon SageMaker’s machine learning-optimized
managed service. This document assumes prior knowledge of Kubernetes and
Kubeflow.

What is Kubeflow Pipelines?
---------------------------

Kubeflow Pipelines (KFP) is a platform for building and deploying
portable, scalable machine learning (ML) workflows based on Docker
containers. The Kubeflow Pipelines platform consists of the following:

- A user interface (UI) for managing and tracking experiments, jobs,
  and runs.

- An engine (Argo) for scheduling multi-step ML workflows.

- A Python SDK for defining and manipulating pipelines and components.

- Notebooks for interacting with the system using the SDK.

A pipeline is a description of an ML workflow expressed as a directed
acyclic `graph <https://www.kubeflow.org/docs/pipelines/concepts/graph/>`__
as shown in the following diagram. Every step in the workflow is
expressed as a Kubeflow Pipeline
`component <https://www.kubeflow.org/docs/pipelines/overview/concepts/component/>`__,
which is a Python module.

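As a sketch of how such a graph is defined in the Python SDK, the following hypothetical one-step pipeline loads the SageMaker training component from the Kubeflow Pipelines repository and invokes it. The component URL, parameter names, algorithm image, bucket paths, and role ARN are illustrative assumptions; check the ``component.yaml`` of the component version you use.

.. code:: python

    # Minimal sketch (not an official sample): a one-step pipeline that runs an
    # Amazon SageMaker training job. Image, S3 paths, and ARNs are illustrative.
    import json

    from kfp import components, dsl


    @dsl.pipeline(name="sagemaker-train-demo",
                  description="Run one SageMaker training job")
    def train_pipeline(
        role_arn: str = "arn:aws:iam::123456789012:role/kfp-example-sagemaker-execution-role",
    ):
        # Load the SageMaker training component definition when the pipeline
        # graph is built (deferred here so importing this module stays cheap).
        sagemaker_train_op = components.load_component_from_url(
            "https://raw.githubusercontent.com/kubeflow/pipelines/master/"
            "components/aws/sagemaker/train/component.yaml"
        )
        sagemaker_train_op(
            region="us-east-1",
            image="382416733822.dkr.ecr.us-east-1.amazonaws.com/kmeans:1",
            instance_type="ml.m5.xlarge",
            instance_count=1,
            hyperparameters=json.dumps({"k": "10", "feature_dim": "784"}),
            channels=json.dumps([{
                "ChannelName": "train",
                "DataSource": {"S3DataSource": {
                    "S3Uri": "s3://my-bucket/mnist/train",
                    "S3DataType": "S3Prefix",
                    "S3DataDistributionType": "FullyReplicated",
                }},
            }]),
            model_artifact_path="s3://my-bucket/mnist/model",
            role=role_arn,
        )

Compiling this function with the KFP SDK produces the workflow definition that the Argo engine executes; the training step itself runs in Amazon SageMaker, not on the cluster.
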
If your data has been preprocessed, the standard pipeline takes a subset
of the data and runs hyperparameter optimization of the model. The
pipeline then trains a model with the full dataset using the optimal
hyperparameters. This model is used for both batch inference and
endpoint creation.

For more information on Kubeflow Pipelines, see the `Kubeflow
Pipelines documentation <https://www.kubeflow.org/docs/pipelines/>`__.

Kubeflow Pipeline components
----------------------------

A Kubeflow Pipeline component is a set of code used to execute one step
in a Kubeflow pipeline. Components are represented by a Python module
that is converted into a Docker image. These components make it fast and
easy to write pipelines for experimentation and production environments
without having to interact with the underlying Kubernetes
infrastructure.

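To make that idea concrete, here is a hedged sketch of a custom component built from a plain Python function with ``func_to_container_op`` from the KFP SDK (v1); the SDK packages the function so it runs inside the given container image. The function body and base image are illustrative, not part of any official component.

.. code:: python

    # Sketch of a "lightweight" KFP component: a plain Python function packaged
    # to run in a container. Function and base image are illustrative.
    from kfp.components import func_to_container_op


    def normalize(value: float, scale: float = 10.0) -> float:
        """One pipeline step: divide a value by a scale factor."""
        return value / scale


    # Wrap the function as a pipeline component backed by a container image.
    normalize_op = func_to_container_op(normalize, base_image="python:3.8")

Inside a pipeline function, calling ``normalize_op(...)`` adds this step to the workflow graph just like any prebuilt component.
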
What do Amazon SageMaker Components for Kubeflow Pipelines provide?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Amazon SageMaker Components for Kubeflow Pipelines offer an alternative
way to launch your compute-intensive jobs in Amazon SageMaker. These
components integrate Amazon SageMaker with the portability and
orchestration of Kubeflow Pipelines. Using the Amazon SageMaker
components, each job in the pipeline workflow runs on Amazon
SageMaker instead of on the local Kubernetes cluster. The job parameters,
status, logs, and outputs from Amazon SageMaker are still accessible
from the Kubeflow Pipelines UI. The following Amazon SageMaker
components have been created to integrate six key Amazon SageMaker
features into your ML workflows. You can create a Kubeflow Pipeline
built entirely from these components, or integrate individual
components into your workflow as needed.

There is no additional charge for using Amazon SageMaker Components for
Kubeflow Pipelines. You incur charges only for the Amazon SageMaker
resources you use through these components.

Training components
^^^^^^^^^^^^^^^^^^^

**Training**

The Training component allows you to submit Amazon SageMaker training
jobs directly from a Kubeflow Pipelines workflow. For more information,
see `SageMaker Training Kubeflow Pipelines
component <https://github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker/train>`__.

**Hyperparameter Optimization**

The Hyperparameter Optimization component enables you to submit
hyperparameter tuning jobs to Amazon SageMaker directly from a Kubeflow
Pipelines workflow. For more information, see `SageMaker
hyperparameter optimization Kubeflow Pipeline
component <https://github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker/hyperparameter_tuning>`__.

Inference components
^^^^^^^^^^^^^^^^^^^^

**Hosting Deploy**

The Deploy component enables you to deploy a model in Amazon SageMaker
Hosting from a Kubeflow Pipelines workflow. For more information,
see `SageMaker Hosting Services - Create Endpoint Kubeflow Pipeline
component <https://github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker/deploy>`__.

**Batch Transform**

The Batch Transform component enables you to run inference jobs for an
entire dataset in Amazon SageMaker from a Kubeflow Pipelines workflow.
For more information, see `SageMaker Batch Transform Kubeflow Pipeline
component <https://github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker/batch_transform>`__.

Ground Truth components
^^^^^^^^^^^^^^^^^^^^^^^

**Ground Truth**

The Ground Truth component enables you to submit Amazon SageMaker
Ground Truth labeling jobs directly from a Kubeflow Pipelines workflow.
For more information, see `SageMaker Ground Truth Kubeflow Pipelines
component <https://github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker/ground_truth>`__.

**Workteam**

The Workteam component enables you to create Amazon SageMaker private
workteam jobs directly from a Kubeflow Pipelines workflow. For more
information, see `SageMaker create private workteam Kubeflow Pipelines
component <https://github.com/kubeflow/pipelines/tree/master/components/aws/sagemaker/workteam>`__.

IAM permissions
---------------

Deploying Kubeflow Pipelines with Amazon SageMaker components requires
the following three levels of IAM permissions:

- An IAM user/role to access your AWS account (**your_credentials**).
  Note: You don’t need this at all if you already have access to the KFP
  web UI and have your input data in Amazon S3, or if you already have
  an Amazon Elastic Kubernetes Service (Amazon EKS) cluster with KFP.

  You use this user/role from your gateway node, which can be your
  local machine or a remote instance, to:

  - Create an Amazon EKS cluster and install KFP

  - Create IAM roles/users

  - Create S3 buckets for your sample input data

  The IAM user/role needs the following permissions:

  - CloudWatchLogsFullAccess

  - `AWSCloudFormationFullAccess <https://console.aws.amazon.com/iam/home?region=us-east-1#/policies/arn%3Aaws%3Aiam%3A%3Aaws%3Apolicy%2FAWSCloudFormationFullAccess>`__

  - IAMFullAccess

  - AmazonS3FullAccess

  - AmazonEC2FullAccess

  - AmazonEKSAdminPolicy - Create this policy using the schema
    from `Amazon EKS Identity-Based Policy
    Examples <https://docs.aws.amazon.com/eks/latest/userguide/security_iam_id-based-policy-examples.html>`__

- An IAM role used by KFP pods to access Amazon SageMaker
  (**kfp-example-pod-role**). The KFP pods use this role to create
  Amazon SageMaker jobs from KFP components. Note: If you want to limit
  the permissions granted to the KFP pods, create your own custom policy
  and attach it instead.

  The role needs the following permission:

  - AmazonSageMakerFullAccess

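As an illustrative sketch of setting up this role with ``boto3`` (the role name matches the example above, but the trust policy is an assumption: it uses the EC2 service principal as for EKS node instance roles, and your cluster setup, e.g. IAM Roles for Service Accounts, may require a different principal):

.. code:: python

    # Hedged sketch: create the role KFP pods use and attach the managed
    # SageMaker policy. The trust-policy principal is an assumption.
    import json

    import boto3


    def create_kfp_pod_role(role_name: str = "kfp-example-pod-role") -> None:
        iam = boto3.client("iam")
        trust_policy = {
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }],
        }
        iam.create_role(
            RoleName=role_name,
            AssumeRolePolicyDocument=json.dumps(trust_policy),
        )
        iam.attach_role_policy(
            RoleName=role_name,
            PolicyArn="arn:aws:iam::aws:policy/AmazonSageMakerFullAccess",
        )
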
- An IAM role used by Amazon SageMaker jobs to access resources such as
  Amazon S3 and Amazon ECR (**kfp-example-sagemaker-execution-role**).

  Your Amazon SageMaker jobs use this role to:

  - Access Amazon SageMaker resources

  - Fetch input data from Amazon S3

  - Store your output model in Amazon S3

  The role needs the following permissions:

  - AmazonSageMakerFullAccess

  - AmazonS3FullAccess

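A hedged ``boto3`` sketch of creating this execution role; here the trust policy lets the SageMaker service (``sagemaker.amazonaws.com``) assume the role, which is the standard principal for SageMaker execution roles, and the role name matches the example above:

.. code:: python

    # Hedged sketch: create the SageMaker execution role and attach the two
    # managed policies listed above. Role name is the example from this doc.
    import json

    import boto3


    def create_sagemaker_execution_role(
        role_name: str = "kfp-example-sagemaker-execution-role",
    ) -> None:
        iam = boto3.client("iam")
        trust_policy = {
            "Version": "2012-10-17",
            "Statement": [{
                "Effect": "Allow",
                "Principal": {"Service": "sagemaker.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }],
        }
        iam.create_role(
            RoleName=role_name,
            AssumeRolePolicyDocument=json.dumps(trust_policy),
        )
        for policy in ("AmazonSageMakerFullAccess", "AmazonS3FullAccess"):
            iam.attach_role_policy(
                RoleName=role_name,
                PolicyArn=f"arn:aws:iam::aws:policy/{policy}",
            )

Pass the resulting role ARN to the SageMaker components as their ``role`` parameter.
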
These are all the IAM users/roles you need to run KFP components for
Amazon SageMaker.

After you have run the components and created the Amazon SageMaker
endpoint, you also need a role with the ``sagemaker:InvokeEndpoint``
permission to query inference endpoints.

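For example, a caller whose credentials carry that permission could query the endpoint with ``boto3`` as sketched below; the endpoint name and the CSV payload format are assumptions about your deployed model:

.. code:: python

    # Hedged sketch: query a deployed SageMaker endpoint. Endpoint name and
    # CSV content type are assumptions about the model you deployed.
    import boto3


    def query_endpoint(endpoint_name: str, payload: str,
                       region: str = "us-east-1") -> str:
        runtime = boto3.client("sagemaker-runtime", region_name=region)
        response = runtime.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="text/csv",
            Body=payload.encode("utf-8"),
        )
        # The response body is a stream; read and decode the prediction.
        return response["Body"].read().decode("utf-8")
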
Converting Pipelines to use Amazon SageMaker
--------------------------------------------

You can convert an existing pipeline to use Amazon SageMaker by porting
your generic Python `processing
containers <https://docs.aws.amazon.com/sagemaker/latest/dg/amazon-sagemaker-containers.html>`__
and `training
containers <https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-training-algo.html>`__.
If you are using Amazon SageMaker for inference, you also need to attach
IAM permissions to your cluster and convert an artifact to a model.