From c35eab27aa14dd55653f465b32cfa711668680d9 Mon Sep 17 00:00:00 2001
From: Slesar
Date: Mon, 11 Mar 2019 15:11:14 -0700
Subject: [PATCH 1/6] remove duplicate content from main, tf, and sklearn readmes and add links to readthedocs

---
 CHANGELOG.rst                       |    4 +
 README.rst                          |  721 +---------------------------
 src/sagemaker/sklearn/README.rst    |  585 +---------------------
 src/sagemaker/tensorflow/README.rst |  428 +----------------
 4 files changed, 21 insertions(+), 1717 deletions(-)

diff --git a/CHANGELOG.rst b/CHANGELOG.rst
index a80788661a..9db07a57b6 100644
--- a/CHANGELOG.rst
+++ b/CHANGELOG.rst
@@ -2,6 +2,10 @@ CHANGELOG
 =========

+1.18.5dev
+=========
+
+* doc-fix: Remove duplicate content from main README.rst, /tensorflow/README.rst, and /sklearn/README.rst and add links to readthedocs content

 1.18.4
 ======

diff --git a/README.rst b/README.rst
index 91d8bda716..cb761d069b 100644
--- a/README.rst
+++ b/README.rst
@@ -27,24 +27,24 @@ Table of Contents
 -----------------

 1. `Installing SageMaker Python SDK <#installing-the-sagemaker-python-sdk>`__
-2. `SageMaker Python SDK Overview <#sagemaker-python-sdk-overview>`__
+2. `Using the SageMaker Python SDK `__
 3. `MXNet SageMaker Estimators <#mxnet-sagemaker-estimators>`__
 4. `TensorFlow SageMaker Estimators <#tensorflow-sagemaker-estimators>`__
 5. `Chainer SageMaker Estimators <#chainer-sagemaker-estimators>`__
 6. `PyTorch SageMaker Estimators <#pytorch-sagemaker-estimators>`__
 7. `Scikit-learn SageMaker Estimators <#scikit-learn-sagemaker-estimators>`__
 8. `SageMaker Reinforcement Learning Estimators <#sagemaker-reinforcement-learning-estimators>`__
 9. `SageMaker SparkML Serving <#sagemaker-sparkml-serving>`__
 10. `AWS SageMaker Estimators <#aws-sagemaker-estimators>`__
-11. `Using SageMaker AlgorithmEstimators <#using-sagemaker-algorithmestimators>`__
-12. `Consuming SageMaker Model Packages <#consuming-sagemaker-model-packages>`__
-13. `BYO Docker Containers with SageMaker Estimators <#byo-docker-containers-with-sagemaker-estimators>`__
-14. `SageMaker Automatic Model Tuning <#sagemaker-automatic-model-tuning>`__
-15. `SageMaker Batch Transform <#sagemaker-batch-transform>`__
-16. `Secure Training and Inference with VPC <#secure-training-and-inference-with-vpc>`__
-17. `BYO Model <#byo-model>`__
-18. `Inference Pipelines <#inference-pipelines>`__
-19. `SageMaker Workflow <#sagemaker-workflow>`__
+11. `Using SageMaker AlgorithmEstimators `__
+12. `Consuming SageMaker Model Packages `__
+13. `BYO Docker Containers with SageMaker Estimators `__
+14. `SageMaker Automatic Model Tuning `__
+15. `SageMaker Batch Transform `__
+16. `Secure Training and Inference with VPC `__
+17. `BYO Model `__
+18. `Inference Pipelines `__
+19. `SageMaker Workflow `__


 Installing the SageMaker Python SDK
@@ -151,287 +151,6 @@ Building Sphinx docs

You can edit the templates for any of the pages in the docs by editing the .rst files in the "doc" directory and then running "``make html``" again.


SageMaker Python SDK Overview
-----------------------------

SageMaker Python SDK provides several high-level abstractions for working with Amazon SageMaker. These are:

- **Estimators**: Encapsulate training on SageMaker.
- **Models**: Encapsulate built ML models.
- **Predictors**: Provide real-time inference and transformation using Python data-types against a SageMaker endpoint.
- **Session**: Provides a collection of methods for working with SageMaker resources.
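As a quick illustration of the ``Session`` abstraction described above, the following minimal sketch uploads a local directory to the default SageMaker S3 bucket. The local path and key prefix are placeholder values, not part of the original README:

.. code:: python

    import sagemaker

    # A Session wraps the AWS clients the SDK uses to talk to SageMaker and S3.
    session = sagemaker.Session()

    # Upload local files to the session's default bucket; returns the S3 URI,
    # e.g. 's3://sagemaker-<region>-<account-id>/my-training-data'.
    train_input = session.upload_data(path='data/train', key_prefix='my-training-data')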
- -``Estimator`` and ``Model`` implementations for MXNet, TensorFlow, Chainer, PyTorch, and Amazon ML algorithms are included. -There's also an ``Estimator`` that runs SageMaker compatible custom Docker containers, enabling you to run your own ML algorithms by using the SageMaker Python SDK. - -The following sections of this document explain how to use the different estimators and models: - -* `MXNet SageMaker Estimators and Models <#mxnet-sagemaker-estimators>`__ -* `TensorFlow SageMaker Estimators and Models <#tensorflow-sagemaker-estimators>`__ -* `Chainer SageMaker Estimators and Models <#chainer-sagemaker-estimators>`__ -* `PyTorch SageMaker Estimators <#pytorch-sagemaker-estimators>`__ -* `Scikit-learn SageMaker Estimators and Models <#scikit-learn-sagemaker-estimators>`__ -* `SageMaker Reinforcement Learning Estimators <#sagemaker-reinforcement-learning-estimators>`__ -* `AWS SageMaker Estimators and Models <#aws-sagemaker-estimators>`__ -* `Custom SageMaker Estimators and Models <#byo-docker-containers-with-sagemaker-estimators>`__ - - -Using Estimators ----------------- - -Here is an end to end example of how to use a SageMaker Estimator: - -.. code:: python - - from sagemaker.mxnet import MXNet - - # Configure an MXNet Estimator (no training happens yet) - mxnet_estimator = MXNet('train.py', - role='SageMakerRole', - train_instance_type='ml.p2.xlarge', - train_instance_count=1, - framework_version='1.2.1') - - # Starts a SageMaker training job and waits until completion. - mxnet_estimator.fit('s3://my_bucket/my_training_data/') - - # Deploys the model that was generated by fit() to a SageMaker endpoint - mxnet_predictor = mxnet_estimator.deploy(initial_instance_count=1, instance_type='ml.p2.xlarge') - - # Serializes data and makes a prediction request to the SageMaker endpoint - response = mxnet_predictor.predict(data) - - # Tears down the SageMaker endpoint and endpoint configuration - mxnet_predictor.delete_endpoint() - - # Deletes the SageMaker model - mxnet_predictor.delete_model() - -The example above will eventually delete both the SageMaker endpoint and endpoint configuration through `delete_endpoint()`. If you want to keep your SageMaker endpoint configuration, use the value False for the `delete_endpoint_config` parameter, as shown below. - -.. code:: python - - # Only delete the SageMaker endpoint, while keeping the corresponding endpoint configuration. - mxnet_predictor.delete_endpoint(delete_endpoint_config=False) - -Additionally, it is possible to deploy a different endpoint configuration, which links to your model, to an already existing SageMaker endpoint. -This can be done by specifying the existing endpoint name for the ``endpoint_name`` parameter along with the ``update_endpoint`` parameter as ``True`` within your ``deploy()`` call. -For more `information `__. - -.. code:: python - - from sagemaker.mxnet import MXNet - - # Configure an MXNet Estimator (no training happens yet) - mxnet_estimator = MXNet('train.py', - role='SageMakerRole', - train_instance_type='ml.p2.xlarge', - train_instance_count=1, - framework_version='1.2.1') - - # Starts a SageMaker training job and waits until completion. 
- mxnet_estimator.fit('s3://my_bucket/my_training_data/') - - # Deploys the model that was generated by fit() to an existing SageMaker endpoint - mxnet_predictor = mxnet_estimator.deploy(initial_instance_count=1, - instance_type='ml.p2.xlarge', - update_endpoint=True, - endpoint_name='existing-endpoint') - - # Serializes data and makes a prediction request to the SageMaker endpoint - response = mxnet_predictor.predict(data) - - # Tears down the SageMaker endpoint and endpoint configuration - mxnet_predictor.delete_endpoint() - - # Deletes the SageMaker model - mxnet_predictor.delete_model() - -Training Metrics -~~~~~~~~~~~~~~~~ -The SageMaker Python SDK allows you to specify a name and a regular expression for metrics you want to track for training. -A regular expression (regex) matches what is in the training algorithm logs, like a search function. -Here is an example of how to define metrics: - -.. code:: python - - # Configure an BYO Estimator with metric definitions (no training happens yet) - byo_estimator = Estimator(image_name=image_name, - role='SageMakerRole', train_instance_count=1, - train_instance_type='ml.c4.xlarge', - sagemaker_session=sagemaker_session, - metric_definitions=[{'Name': 'test:msd', 'Regex': '#quality_metric: host=\S+, test msd =(\S+)'}, - {'Name': 'test:ssd', 'Regex': '#quality_metric: host=\S+, test ssd =(\S+)'}]) - -All Amazon SageMaker algorithms come with built-in support for metrics. -You can go to `the AWS documentation `__ for more details about built-in metrics of each Amazon SageMaker algorithm. - -Local Mode -~~~~~~~~~~ - -The SageMaker Python SDK supports local mode, which allows you to create estimators and deploy them to your local environment. -This is a great way to test your deep learning scripts before running them in SageMaker's managed training or hosting environments. -Local Mode is supported for only frameworks (e.g. TensorFlow, MXNet) and images you supply yourself. - -We can take the example in `Using Estimators <#using-estimators>`__ , and use either ``local`` or ``local_gpu`` as the instance type. - -.. code:: python - - from sagemaker.mxnet import MXNet - - # Configure an MXNet Estimator (no training happens yet) - mxnet_estimator = MXNet('train.py', - role='SageMakerRole', - train_instance_type='local', - train_instance_count=1, - framework_version='1.2.1') - - # In Local Mode, fit will pull the MXNet container Docker image and run it locally - mxnet_estimator.fit('s3://my_bucket/my_training_data/') - - # Alternatively, you can train using data in your local file system. This is only supported in Local mode. - mxnet_estimator.fit('file:///tmp/my_training_data') - - # Deploys the model that was generated by fit() to local endpoint in a container - mxnet_predictor = mxnet_estimator.deploy(initial_instance_count=1, instance_type='local') - - # Serializes data and makes a prediction request to the local endpoint - response = mxnet_predictor.predict(data) - - # Tears down the endpoint container and deletes the corresponding endpoint configuration - mxnet_predictor.delete_endpoint() - - # Deletes the model - mxnet_predictor.delete_model() - - -If you have an existing model and want to deploy it locally, don't specify a sagemaker_session argument to the ``MXNetModel`` constructor. -The correct session is generated when you call ``model.deploy()``. - -Here is an end-to-end example: - -.. 
code:: python

    import numpy
    from sagemaker.mxnet import MXNetModel

    model_location = 's3://mybucket/my_model.tar.gz'
    code_location = 's3://mybucket/sourcedir.tar.gz'
    s3_model = MXNetModel(model_data=model_location, role='SageMakerRole',
                          entry_point='mnist.py', source_dir=code_location)

    predictor = s3_model.deploy(initial_instance_count=1, instance_type='local')
    data = numpy.zeros(shape=(1, 1, 28, 28))
    predictor.predict(data)

    # Tear down the endpoint container and delete the corresponding endpoint configuration
    predictor.delete_endpoint()

    # Deletes the model
    predictor.delete_model()


If you don't want to deploy your model locally, you can also choose to perform a Local Batch Transform Job. This is
useful if you want to test your container before creating a SageMaker Batch Transform Job. Note that the performance
will not match Batch Transform Jobs hosted on SageMaker, but it is still a useful tool for making sure you have everything
right, or for when you are not dealing with huge amounts of data.

Here is an end-to-end example:

.. code:: python

    from sagemaker.mxnet import MXNet

    mxnet_estimator = MXNet('train.py',
                            train_instance_type='local',
                            train_instance_count=1,
                            framework_version='1.2.1')

    mxnet_estimator.fit('file:///tmp/my_training_data')
    transformer = mxnet_estimator.transformer(1, 'local', assemble_with='Line', max_payload=1)
    transformer.transform('s3://my/transform/data', content_type='text/csv', split_type='Line')
    transformer.wait()

    # Deletes the SageMaker model
    transformer.delete_model()


For detailed examples of running Docker in local mode, see:

- `TensorFlow local mode example notebook `__.
- `MXNet local mode example notebook `__.

A few important notes:

- Only one local mode endpoint can be running at a time.
- If you are using S3 data as input, it is pulled from S3 to your local environment. Ensure you have sufficient space to store the data locally.
- If you run into problems, it is often due to conflicting Docker containers. Killing these containers and re-running often solves the problem.
- Local Mode requires Docker Compose and `nvidia-docker2 `__ for ``local_gpu``.
- Distributed training is not yet supported for ``local_gpu``.

Incremental Training
~~~~~~~~~~~~~~~~~~~~

Incremental training allows you to bring a pre-trained model into a SageMaker training job and use it as a starting point for a new model.
There are several situations where you might want to do this:

- You want to perform additional training on a model to improve its fit on your data set.
- You want to import a pre-trained model and fit it to your data.
- You want to resume a training job that you previously stopped.

To use incremental training with SageMaker algorithms, you need model artifacts compressed into a ``tar.gz`` file. These
artifacts are passed to a training job via an input channel configured with the pre-defined settings Amazon SageMaker algorithms require.

To use model files with a SageMaker estimator, you can use the following parameters:

* ``model_uri``: points to the location of a model tarball, either in S3 or locally. Specifying a local path only works in local mode.
* ``model_channel_name``: name of the channel SageMaker will use to download the tarball specified in ``model_uri``. Defaults to 'model'.

This is converted into an input channel with the specifications mentioned above once you call ``fit()`` on the estimator.
-In bring-your-own cases, ``model_channel_name`` can be overriden if you require to change the name of the channel while using -the same settings. - -If your bring-your-own case requires different settings, you can create your own ``s3_input`` object with the settings you require. - -Here's an example of how to use incremental training: - -.. code:: python - - # Configure an estimator - estimator = sagemaker.estimator.Estimator(training_image, - role, - train_instance_count=1, - train_instance_type='ml.p2.xlarge', - train_volume_size=50, - train_max_run=360000, - input_mode='File', - output_path=s3_output_location) - - # Start a SageMaker training job and waits until completion. - estimator.fit('s3://my_bucket/my_training_data/') - - # Create a new estimator using the previous' model artifacts - incr_estimator = sagemaker.estimator.Estimator(training_image, - role, - train_instance_count=1, - train_instance_type='ml.p2.xlarge', - train_volume_size=50, - train_max_run=360000, - input_mode='File', - output_path=s3_output_location, - model_uri=estimator.model_data) - - # Start a SageMaker training job using the original model for incremental training - incr_estimator.fit('s3://my_bucket/my_training_data/') - -Currently, the following algorithms support incremental training: - -- Image Classification -- Object Detection -- Semantic Segmentation - - MXNet SageMaker Estimators -------------------------- @@ -459,9 +178,9 @@ Supported versions of TensorFlow for Elastic Inference: ``1.11.0``, ``1.12.0``. We recommend that you use the latest supported version, because that's where we focus most of our development efforts. -For more information, see `TensorFlow SageMaker Estimators and Models`_. +For more information, see `Using TensorFlow with the SageMaker Python SDK`_. -.. _TensorFlow SageMaker Estimators and Models: src/sagemaker/tensorflow/README.rst +.. _Using TensorFlow with the SageMaker Python SDK: https://sagemaker.readthedocs.io/en/stable/using_tf.html Chainer SageMaker Estimators @@ -507,9 +226,9 @@ We recommend that you use the latest supported version, because that's where we For more information about Scikit-learn, see https://scikit-learn.org/stable/ -For more information about Scikit-learn SageMaker ``Estimators``, see `Scikit-learn SageMaker Estimators and Models`_. +For more information about Scikit-learn SageMaker ``Estimators``, see `Using Scikit-learn with the SageMaker Python SDK`_. -.. _Scikit-learn SageMaker Estimators and Models: src/sagemaker/sklearn/README.rst +.. _Using Scikit-learn with the SageMaker Python SDK: https://sagemaker.readthedocs.io/en/stable/using_sklearn.html SageMaker Reinforcement Learning Estimators @@ -574,416 +293,6 @@ For more information, see `AWS SageMaker Estimators and Models`_. .. _AWS SageMaker Estimators and Models: src/sagemaker/amazon/README.rst -Using SageMaker AlgorithmEstimators ------------------------------------ - -With the SageMaker Algorithm entities, you can create training jobs with just an ``algorithm_arn`` instead of -a training image. There is a dedicated ``AlgorithmEstimator`` class that accepts ``algorithm_arn`` as a -parameter, the rest of the arguments are similar to the other Estimator classes. This class also allows you to -consume algorithms that you have subscribed to in the AWS Marketplace. The AlgorithmEstimator performs -client-side validation on your inputs based on the algorithm's properties. - -Here is an example: - -.. 
code:: python - - import sagemaker - - algo = sagemaker.AlgorithmEstimator( - algorithm_arn='arn:aws:sagemaker:us-west-2:1234567:algorithm/some-algorithm', - role='SageMakerRole', - train_instance_count=1, - train_instance_type='ml.c4.xlarge') - - train_input = algo.sagemaker_session.upload_data(path='/path/to/your/data') - - algo.fit({'training': train_input}) - algo.deploy(1, 'ml.m4.xlarge') - - # When you are done using your endpoint - algo.delete_endpoint() - - -Consuming SageMaker Model Packages ----------------------------------- - -SageMaker Model Packages are a way to specify and share information for how to create SageMaker Models. -With a SageMaker Model Package that you have created or subscribed to in the AWS Marketplace, -you can use the specified serving image and model data for Endpoints and Batch Transform jobs. - -To work with a SageMaker Model Package, use the ``ModelPackage`` class. - -Here is an example: - -.. code:: python - - import sagemaker - - model = sagemaker.ModelPackage( - role='SageMakerRole', - model_package_arn='arn:aws:sagemaker:us-west-2:123456:model-package/my-model-package') - model.deploy(1, 'ml.m4.xlarge', endpoint_name='my-endpoint') - - # When you are done using your endpoint - model.sagemaker_session.delete_endpoint('my-endpoint') - - -BYO Docker Containers with SageMaker Estimators ------------------------------------------------ - -To use a Docker image that you created and use the SageMaker SDK for training, the easiest way is to use the dedicated ``Estimator`` class. -You can create an instance of the ``Estimator`` class with desired Docker image and use it as described in previous sections. - -Please refer to the full example in the examples repo: - -:: - - git clone https://github.com/awslabs/amazon-sagemaker-examples.git - - -The example notebook is located here: -``advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb`` - - -SageMaker Automatic Model Tuning --------------------------------- - -All of the estimators can be used with SageMaker Automatic Model Tuning, which performs hyperparameter tuning jobs. -A hyperparameter tuning job finds the best version of a model by running many training jobs on your dataset using the algorithm with different values of hyperparameters within ranges -that you specify. It then chooses the hyperparameter values that result in a model that performs the best, as measured by a metric that you choose. -If you're not using an Amazon SageMaker built-in algorithm, then the metric is defined by a regular expression (regex) you provide. -The hyperparameter tuning job parses the training job's logs to find metrics that match the regex you defined. -For more information about SageMaker Automatic Model Tuning, see `AWS documentation `__. - -The SageMaker Python SDK contains a ``HyperparameterTuner`` class for creating and interacting with hyperparameter training jobs. -Here is a basic example of how to use it: - -.. 
code:: python

    from sagemaker.tuner import HyperparameterTuner, ContinuousParameter

    # Configure HyperparameterTuner
    my_tuner = HyperparameterTuner(estimator=my_estimator,  # previously-configured Estimator object
                                   objective_metric_name='validation-accuracy',
                                   hyperparameter_ranges={'learning-rate': ContinuousParameter(0.05, 0.06)},
                                   metric_definitions=[{'Name': 'validation-accuracy', 'Regex': 'validation-accuracy=(\d\.\d+)'}],
                                   max_jobs=100,
                                   max_parallel_jobs=10)

    # Start hyperparameter tuning job
    my_tuner.fit({'train': 's3://my_bucket/my_training_data', 'test': 's3://my_bucket/my_testing_data'})

    # Deploy best model
    my_predictor = my_tuner.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')

    # Make a prediction against the SageMaker endpoint
    response = my_predictor.predict(my_prediction_data)

    # Tear down the SageMaker endpoint
    my_tuner.delete_endpoint()

This example shows a hyperparameter tuning job that creates up to 100 training jobs, running up to 10 training jobs at a time.
Each training job's learning rate is a value between 0.05 and 0.06, but this value will differ between training jobs.
You can read more about how these values are chosen in the `AWS documentation `__.

A hyperparameter range can be one of three types: continuous, integer, or categorical.
The SageMaker Python SDK provides corresponding classes for defining these different types.
You can define up to 20 hyperparameters to search over, but each value of a categorical hyperparameter range counts against that limit.

By default, training job early stopping is turned off. To enable early stopping for the tuning job, you need to set the ``early_stopping_type`` parameter to ``Auto``:

.. code:: python

    # Enable early stopping
    my_tuner = HyperparameterTuner(estimator=my_estimator,  # previously-configured Estimator object
                                   objective_metric_name='validation-accuracy',
                                   hyperparameter_ranges={'learning-rate': ContinuousParameter(0.05, 0.06)},
                                   metric_definitions=[{'Name': 'validation-accuracy', 'Regex': 'validation-accuracy=(\d\.\d+)'}],
                                   max_jobs=100,
                                   max_parallel_jobs=10,
                                   early_stopping_type='Auto')

When early stopping is turned on, Amazon SageMaker will automatically stop a training job if it appears unlikely to produce a model of better quality than other jobs.
If you are not using a built-in Amazon SageMaker algorithm, note that the objective metric should be emitted at the epoch level for early stopping to be effective.

If you are using an Amazon SageMaker built-in algorithm, you don't need to pass in anything for ``metric_definitions``.
In addition, the ``fit()`` call uses a list of ``RecordSet`` objects instead of a dictionary:

.. code:: python

    # Create RecordSet object for each data channel
    train_records = RecordSet(...)
    test_records = RecordSet(...)

    # Start hyperparameter tuning job
    my_tuner.fit([train_records, test_records])

To help attach a previously-started hyperparameter tuning job to a ``HyperparameterTuner`` instance,
``fit()`` adds the module path of the class used to create the hyperparameter tuner to the list of static hyperparameters by default.
If you are using your own custom estimator class (i.e. not one provided in this SDK) and want that class to be used when attaching a hyperparameter tuning job,
set ``include_cls_metadata`` to ``True`` when you call ``fit`` to add the module path as static hyperparameters.
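As a minimal sketch of attaching itself, the following re-attaches to a tuning job that was started earlier (in another session, for example) and deploys the best model found; ``'my-tuning-job'`` is a placeholder for your own tuning job name:

.. code:: python

    from sagemaker.tuner import HyperparameterTuner

    # Attach to a previously-started hyperparameter tuning job by name.
    attached_tuner = HyperparameterTuner.attach('my-tuning-job')

    # Deploy the best model found by the tuning job.
    predictor = attached_tuner.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')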
- -There is also an analytics object associated with each ``HyperparameterTuner`` instance that contains useful information about the hyperparameter tuning job. -For example, the ``dataframe`` method gets a pandas dataframe summarizing the associated training jobs: - -.. code:: python - - # Retrieve analytics object - my_tuner_analytics = my_tuner.analytics() - - # Look at summary of associated training jobs - my_dataframe = my_tuner_analytics.dataframe() - -For more detailed examples of running hyperparameter tuning jobs, see: - -- `Using the TensorFlow estimator with hyperparameter tuning `__ -- `Bringing your own estimator for hyperparameter tuning `__ -- `Analyzing results `__ - -For more detailed explanations of the classes that this library provides for automatic model tuning, see: - -- `API docs for HyperparameterTuner and parameter range classes `__ -- `API docs for analytics classes `__ - - -SageMaker Batch Transform -------------------------- - -After you train a model, you can use Amazon SageMaker Batch Transform to perform inferences with the model. -Batch Transform manages all necessary compute resources, including launching instances to deploy endpoints and deleting them afterward. -You can read more about SageMaker Batch Transform in the `AWS documentation `__. - -If you trained the model using a SageMaker Python SDK estimator, -you can invoke the estimator's ``transformer()`` method to create a transform job for a model based on the training job: - -.. code:: python - - transformer = estimator.transformer(instance_count=1, instance_type='ml.m4.xlarge') - -Alternatively, if you already have a SageMaker model, you can create an instance of the ``Transformer`` class by calling its constructor: - -.. code:: python - - transformer = Transformer(model_name='my-previously-trained-model', - instance_count=1, - instance_type='ml.m4.xlarge') - -For a full list of the possible options to configure by using either of these methods, see the API docs for `Estimator `__ or `Transformer `__. - -After you create a ``Transformer`` object, you can invoke ``transform()`` to start a batch transform job with the S3 location of your data. -You can also specify other attributes of your data, such as the content type. - -.. code:: python - - transformer.transform('s3://my-bucket/batch-transform-input') - -For more details about what can be specified here, see `API docs `__. - - -Secure Training and Inference with VPC --------------------------------------- - -Amazon SageMaker allows you to control network traffic to and from model container instances using Amazon Virtual Private Cloud (VPC). -You can configure SageMaker to use your own private VPC in order to further protect and monitor traffic. - -For more information about Amazon SageMaker VPC features, and guidelines for configuring your VPC, -see the following documentation: - -- `Protect Training Jobs by Using an Amazon Virtual Private Cloud `__ -- `Protect Endpoints by Using an Amazon Virtual Private Cloud `__ -- `Protect Data in Batch Transform Jobs by Using an Amazon Virtual Private Cloud `__ -- `Working with VPCs and Subnets `__ - -You can also reference or reuse the example VPC created for integration tests: `tests/integ/vpc_test_utils.py `__ - -To train a model using your own VPC, set the optional parameters ``subnets`` and ``security_group_ids`` on an ``Estimator``: - -.. 
code:: python - - from sagemaker.mxnet import MXNet - - # Configure an MXNet Estimator with subnets and security groups from your VPC - mxnet_vpc_estimator = MXNet('train.py', - train_instance_type='ml.p2.xlarge', - train_instance_count=1, - framework_version='1.2.1', - subnets=['subnet-1', 'subnet-2'], - security_group_ids=['sg-1']) - - # SageMaker Training Job will set VpcConfig and container instances will run in your VPC - mxnet_vpc_estimator.fit('s3://my_bucket/my_training_data/') - -To train a model with the inter-container traffic encrypted, set the optional parameters ``subnets`` and ``security_group_ids`` and -the flag ``encrypt_inter_container_traffic`` as ``True`` on an Estimator (Note: This flag can be used only if you specify that the training -job runs in a VPC): - -.. code:: python - - from sagemaker.mxnet import MXNet - - # Configure an MXNet Estimator with subnets and security groups from your VPC - mxnet_vpc_estimator = MXNet('train.py', - train_instance_type='ml.p2.xlarge', - train_instance_count=1, - framework_version='1.2.1', - subnets=['subnet-1', 'subnet-2'], - security_group_ids=['sg-1'], - encrypt_inter_container_traffic=True) - - # The SageMaker training job sets the VpcConfig, and training container instances run in your VPC with traffic between the containers encrypted - mxnet_vpc_estimator.fit('s3://my_bucket/my_training_data/') - -When you create a ``Predictor`` from the ``Estimator`` using ``deploy()``, the same VPC configurations will be set on the SageMaker Model: - -.. code:: python - - # Creates a SageMaker Model and Endpoint using the same VpcConfig - # Endpoint container instances will run in your VPC - mxnet_vpc_predictor = mxnet_vpc_estimator.deploy(initial_instance_count=1, - instance_type='ml.p2.xlarge') - - # You can also set ``vpc_config_override`` to use a different VpcConfig - other_vpc_config = {'Subnets': ['subnet-3', 'subnet-4'], - 'SecurityGroupIds': ['sg-2']} - mxnet_predictor_other_vpc = mxnet_vpc_estimator.deploy(initial_instance_count=1, - instance_type='ml.p2.xlarge', - vpc_config_override=other_vpc_config) - - # Setting ``vpc_config_override=None`` will disable VpcConfig - mxnet_predictor_no_vpc = mxnet_vpc_estimator.deploy(initial_instance_count=1, - instance_type='ml.p2.xlarge', - vpc_config_override=None) - -Likewise, when you create ``Transformer`` from the ``Estimator`` using ``transformer()``, the same VPC configurations will be set on the SageMaker Model: - -.. code:: python - - # Creates a SageMaker Model using the same VpcConfig - mxnet_vpc_transformer = mxnet_vpc_estimator.transformer(instance_count=1, - instance_type='ml.p2.xlarge') - - # Transform Job container instances will run in your VPC - mxnet_vpc_transformer.transform('s3://my-bucket/batch-transform-input') - - -FAQ ---- - -I want to train a SageMaker Estimator with local data, how do I do this? -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Upload the data to S3 before training. You can use the AWS Command Line Tool (the aws cli) to achieve this. - -If you don't have the aws cli, you can install it using pip: - -:: - - pip install awscli --upgrade --user - -If you don't have pip or want to learn more about installing the aws cli, see the official `Amazon aws cli installation guide `__. 
- -After you install the AWS cli, you can upload a directory of files to S3 with the following command: - -:: - - aws s3 cp /tmp/foo/ s3://bucket/path - -For more information about using the aws cli for manipulating S3 resources, see `AWS cli command reference `__. - - -How do I make predictions against an existing endpoint? -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Create a ``Predictor`` object and provide it with your endpoint name, -then call its ``predict()`` method with your input. - -You can use either the generic ``RealTimePredictor`` class, which by default does not perform any serialization/deserialization transformations on your input, -but can be configured to do so through constructor arguments: -http://sagemaker.readthedocs.io/en/stable/predictors.html - -Or you can use the TensorFlow / MXNet specific predictor classes, which have default serialization/deserialization logic: -http://sagemaker.readthedocs.io/en/stable/sagemaker.tensorflow.html#tensorflow-predictor -http://sagemaker.readthedocs.io/en/stable/sagemaker.mxnet.html#mxnet-predictor - -Example code using the TensorFlow predictor: - -:: - - from sagemaker.tensorflow import TensorFlowPredictor - - predictor = TensorFlowPredictor('myexistingendpoint') - result = predictor.predict(['my request body']) - - -BYO Model ---------- -You can also create an endpoint from an existing model rather than training one. -That is, you can bring your own model: - -First, package the files for the trained model into a ``.tar.gz`` file, and upload the archive to S3. - -Next, create a ``Model`` object that corresponds to the framework that you are using: `MXNetModel `__ or `TensorFlowModel `__. - -Example code using ``MXNetModel``: - -.. code:: python - - from sagemaker.mxnet.model import MXNetModel - - sagemaker_model = MXNetModel(model_data='s3://path/to/model.tar.gz', - role='arn:aws:iam::accid:sagemaker-role', - entry_point='entry_point.py') - -After that, invoke the ``deploy()`` method on the ``Model``: - -.. code:: python - - predictor = sagemaker_model.deploy(initial_instance_count=1, - instance_type='ml.m4.xlarge') - -This returns a predictor the same way an ``Estimator`` does when ``deploy()`` is called. You can now get inferences just like with any other model deployed on Amazon SageMaker. - -A full example is available in the `Amazon SageMaker examples repository `__. - - -Inference Pipelines -------------------- -You can create a Pipeline for realtime or batch inference comprising of one or multiple model containers. This will help -you to deploy an ML pipeline behind a single endpoint and you can have one API call perform pre-processing, model-scoring -and post-processing on your data before returning it back as the response. - -For this, you have to create a ``PipelineModel`` which will take a list of ``Model`` objects. Calling ``deploy()`` on the -``PipelineModel`` will provide you with an endpoint which can be invoked to perform the prediction on a data point against -the ML Pipeline. - -.. 
code:: python - - xgb_image = get_image_uri(sess.boto_region_name, 'xgboost', repo_version="latest") - xgb_model = Model(model_data='s3://path/to/model.tar.gz', image=xgb_image) - sparkml_model = SparkMLModel(model_data='s3://path/to/model.tar.gz', env={'SAGEMAKER_SPARKML_SCHEMA': schema}) - - model_name = 'inference-pipeline-model' - endpoint_name = 'inference-pipeline-endpoint' - sm_model = PipelineModel(name=model_name, role=sagemaker_role, models=[sparkml_model, xgb_model]) - -This will define a ``PipelineModel`` consisting of SparkML model and an XGBoost model stacked sequentially. For more -information about how to train an XGBoost model, please refer to the XGBoost notebook here_. - -.. _here: https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html#xgboost-sample-notebooks - -.. code:: python - - sm_model.deploy(initial_instance_count=1, instance_type='ml.c5.xlarge', endpoint_name=endpoint_name) - -This returns a predictor the same way an ``Estimator`` does when ``deploy()`` is called. Whenever you make an inference -request using this predictor, you should pass the data that the first container expects and the predictor will return the -output from the last container. - -For comprehensive examples on how to use Inference Pipelines please refer to the following notebooks: - -- `inference_pipeline_sparkml_xgboost_abalone.ipynb `__ -- `inference_pipeline_sparkml_blazingtext_dbpedia.ipynb `__ - - SageMaker Workflow ------------------ diff --git a/src/sagemaker/sklearn/README.rst b/src/sagemaker/sklearn/README.rst index c0ee4fd7b6..893ee823de 100644 --- a/src/sagemaker/sklearn/README.rst +++ b/src/sagemaker/sklearn/README.rst @@ -8,590 +8,7 @@ Supported versions of Scikit-learn: ``0.20.0`` You can visit the Scikit-learn repository at https://github.com/scikit-learn/scikit-learn. -Table of Contents ------------------ - -1. `Training with Scikit-learn <#training-with-scikit-learn>`__ -2. `Scikit-learn Estimators <#scikit-learn-estimators>`__ -3. `Saving models <#saving-models>`__ -4. `Deploying Scikit-learn models <#deploying-scikit-learn-models>`__ -5. `SageMaker Scikit-learn Model Server <#sagemaker-scikit-learn-model-server>`__ -6. `Working with Existing Model Data and Training Jobs <#working-with-existing-model-data-and-training-jobs>`__ -7. `Scikit-learn Training Examples <#scikit-learn-training-examples>`__ -8. `SageMaker Scikit-learn Docker Containers <#sagemaker-scikit-learn-docker-containers>`__ - - -Training with Scikit-learn -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Training Scikit-learn models using ``SKLearn`` Estimators is a two-step process: - -1. Prepare a Scikit-learn script to run on SageMaker -2. Run this script on SageMaker via a ``SKLearn`` Estimator. - - -First, you prepare your training script, then second, you run this on SageMaker via a ``SKLearn`` Estimator. -You should prepare your script in a separate source file than the notebook, terminal session, or source file you're -using to submit the script to SageMaker via a ``SKLearn`` Estimator. - -Suppose that you already have an Scikit-learn training script called -``sklearn-train.py``. You can run this script in SageMaker as follows: - -.. code:: python - - from sagemaker.sklearn import SKLearn - sklearn_estimator = SKLearn(entry_point='sklearn-train.py', - role='SageMakerRole', - train_instance_type='ml.m4.xlarge', - framework_version='0.20.0') - sklearn_estimator.fit('s3://bucket/path/to/training/data') - -Where the S3 URL is a path to your training data, within Amazon S3. 
The constructor keyword arguments define how -SageMaker runs your training script and are discussed in detail in a later section. - -In the following sections, we'll discuss how to prepare a training script for execution on SageMaker, -then how to run that script on SageMaker using a ``SKLearn`` Estimator. - -Preparing the Scikit-learn training script -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Your Scikit-learn training script must be a Python 2.7 or 3.5 compatible source file. - -The training script is very similar to a training script you might run outside of SageMaker, but you -can access useful properties about the training environment through various environment variables, such as - -* ``SM_MODEL_DIR``: A string representing the path to the directory to write model artifacts to. - These artifacts are uploaded to S3 for model hosting. -* ``SM_OUTPUT_DATA_DIR``: A string representing the filesystem path to write output artifacts to. Output artifacts may - include checkpoints, graphs, and other files to save, not including model artifacts. These artifacts are compressed - and uploaded to S3 to the same S3 prefix as the model artifacts. - -Supposing two input channels, 'train' and 'test', were used in the call to the Scikit-learn estimator's ``fit()`` method, -the following will be set, following the format "SM_CHANNEL_[channel_name]": - -* ``SM_CHANNEL_TRAIN``: A string representing the path to the directory containing data in the 'train' channel -* ``SM_CHANNEL_TEST``: Same as above, but for the 'test' channel. - -A typical training script loads data from the input channels, configures training with hyperparameters, trains a model, -and saves a model to model_dir so that it can be hosted later. Hyperparameters are passed to your script as arguments -and can be retrieved with an argparse.ArgumentParser instance. For example, a training script might start -with the following: - -.. code:: python - - import argparse - import os - - if __name__ =='__main__': - - parser = argparse.ArgumentParser() - - # hyperparameters sent by the client are passed as command-line arguments to the script. - parser.add_argument('--epochs', type=int, default=50) - parser.add_argument('--batch-size', type=int, default=64) - parser.add_argument('--learning-rate', type=float, default=0.05) - - # Data, model, and output directories - parser.add_argument('--output-data-dir', type=str, default=os.environ.get('SM_OUTPUT_DATA_DIR')) - parser.add_argument('--model-dir', type=str, default=os.environ.get('SM_MODEL_DIR')) - parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN')) - parser.add_argument('--test', type=str, default=os.environ.get('SM_CHANNEL_TEST')) - - args, _ = parser.parse_known_args() - - # ... load from args.train and args.test, train a model, write model to args.model_dir. - -Because the SageMaker imports your training script, you should put your training code in a main guard -(``if __name__=='__main__':``) if you are using the same script to host your model, so that SageMaker does not -inadvertently run your training code at the wrong point in execution. - -For more on training environment variables, please visit https://github.com/aws/sagemaker-containers. - -Running a Scikit-learn training script in SageMaker -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -You run Scikit-learn training scripts on SageMaker by creating ``SKLearn`` Estimators. -SageMaker training of your script is invoked when you call ``fit`` on a ``SKLearn`` Estimator. 
-The following code sample shows how you train a custom Scikit-learn script "sklearn-train.py", passing -in three hyperparameters ('epochs', 'batch-size', and 'learning-rate'), and using two input channel -directories ('train' and 'test'). - -.. code:: python - - sklearn_estimator = SKLearn('sklearn-train.py', - train_instance_type='ml.m4.xlarge', - framework_version='0.20.0', - hyperparameters = {'epochs': 20, 'batch-size': 64, 'learning-rate': 0.1}) - sklearn_estimator.fit({'train': 's3://my-data-bucket/path/to/my/training/data', - 'test': 's3://my-data-bucket/path/to/my/test/data'}) - - -Scikit-learn Estimators -^^^^^^^^^^^^^^^^^^^^^^^ - -The `SKLearn` constructor takes both required and optional arguments. - -Required arguments -'''''''''''''''''' - -The following are required arguments to the ``SKLearn`` constructor. When you create a Scikit-learn object, you must -include these in the constructor, either positionally or as keyword arguments. - -- ``entry_point`` Path (absolute or relative) to the Python file which - should be executed as the entry point to training. -- ``role`` An AWS IAM role (either name or full ARN). The Amazon - SageMaker training jobs and APIs that create Amazon SageMaker - endpoints use this role to access training data and model artifacts. - After the endpoint is created, the inference code might use the IAM - role, if accessing AWS resource. -- ``train_instance_type`` Type of EC2 instance to use for training, for - example, 'ml.m4.xlarge'. Please note that Scikit-learn does not have GPU support. - -Optional arguments -'''''''''''''''''' - -The following are optional arguments. When you create a ``SKLearn`` object, you can specify these as keyword arguments. - -- ``source_dir`` Path (absolute or relative) to a directory with any - other training source code dependencies including the entry point - file. Structure within this directory will be preserved when training - on SageMaker. -- ``hyperparameters`` Hyperparameters that will be used for training. - Will be made accessible as a dict[str, str] to the training code on - SageMaker. For convenience, accepts other types besides str, but - str() will be called on keys and values to convert them before - training. -- ``py_version`` Python version you want to use for executing your - model training code. -- ``train_volume_size`` Size in GB of the EBS volume to use for storing - input data during training. Must be large enough to store training - data if input_mode='File' is used (which is the default). -- ``train_max_run`` Timeout in seconds for training, after which Amazon - SageMaker terminates the job regardless of its current status. -- ``input_mode`` The input mode that the algorithm supports. Valid - modes: 'File' - Amazon SageMaker copies the training dataset from the - s3 location to a directory in the Docker container. 'Pipe' - Amazon - SageMaker streams data directly from s3 to the container via a Unix - named pipe. -- ``output_path`` s3 location where you want the training result (model - artifacts and optional output files) saved. If not specified, results - are stored to a default bucket. If the bucket with the specific name - does not exist, the estimator creates the bucket during the fit() - method execution. -- ``output_kms_key`` Optional KMS key ID to optionally encrypt training - output with. -- ``job_name`` Name to assign for the training job that the fit() - method launches. 
If not specified, the estimator generates a default job name, based on the training image name and the current timestamp.
- ``image_name`` An alternative docker image to use for training and
  serving. If specified, the estimator will use this image for training and
  hosting, instead of selecting the appropriate SageMaker official image based on
  framework_version and py_version. Refer to: `SageMaker Scikit-learn Docker Containers
  <#sagemaker-scikit-learn-docker-containers>`_ for details on what the official images support
  and where to find the source code to build your custom image.


Calling fit
^^^^^^^^^^^

You start your training script by calling ``fit`` on a ``SKLearn`` Estimator. ``fit`` takes both required and optional
arguments.

Required arguments
''''''''''''''''''

- ``inputs``: This can take one of the following forms: A string
  s3 URI, for example ``s3://my-bucket/my-training-data``. In this
  case, the s3 objects rooted at the ``my-training-data`` prefix will
  be available in the default ``train`` channel. A dict from
  string channel names to s3 URIs. In this case, the objects rooted at
  each s3 prefix will be available as files in each channel directory.

For example:

.. code:: python

   {'train':'s3://my-bucket/my-training-data',
    'eval':'s3://my-bucket/my-evaluation-data'}

.. optional-arguments-1:

Optional arguments
''''''''''''''''''

- ``wait``: Defaults to True, whether to block and wait for the
  training script to complete before returning.
- ``logs``: Defaults to True, whether to show logs produced by training
  job in the Python session. Only meaningful when wait is True.

Saving models
~~~~~~~~~~~~~

In order to save your trained Scikit-learn model for deployment on SageMaker, your training script should save your
model to a certain filesystem path called ``model_dir``. This value is accessible through the environment variable
``SM_MODEL_DIR``. The following code demonstrates how to save a trained Scikit-learn model named ``clf`` as
``model.joblib`` at the end of training:

.. code:: python

    from sklearn.externals import joblib
    import argparse
    import os

    if __name__=='__main__':
        parser = argparse.ArgumentParser()

        # default to the value in environment variable `SM_MODEL_DIR`. Using args makes the script more portable.
        parser.add_argument('--model-dir', type=str, default=os.environ['SM_MODEL_DIR'])
        args, _ = parser.parse_known_args()

        # ... train classifier `clf`, then save it to `model_dir` as file 'model.joblib'
        joblib.dump(clf, os.path.join(args.model_dir, "model.joblib"))

After your training job is complete, SageMaker will compress and upload the serialized model to S3, and your model data
will be available in the s3 ``output_path`` you specified when you created the Scikit-learn Estimator.

Deploying Scikit-learn models
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After a Scikit-learn Estimator has been fit, you can host the newly created model in SageMaker.

After calling ``fit``, you can call ``deploy`` on a ``SKLearn`` Estimator to create a SageMaker Endpoint.
The Endpoint runs a SageMaker-provided Scikit-learn model server and hosts the model produced by your training script,
which was run when you called ``fit``. This was the model you saved to ``model_dir``.

``deploy`` returns a ``Predictor`` object, which you can use to do inference on the Endpoint hosting your Scikit-learn
model.
Each ``Predictor`` provides a ``predict`` method which can do inference with numpy arrays or Python lists.
Inference arrays or lists are serialized and sent to the Scikit-learn model server by an ``InvokeEndpoint`` SageMaker
operation.

``predict`` returns the result of inference against your model. By default, the inference result is a NumPy array.

.. code:: python

    # Train my estimator
    sklearn_estimator = SKLearn(entry_point='train_and_deploy.py',
                                train_instance_type='ml.m4.xlarge',
                                framework_version='0.20.0')
    sklearn_estimator.fit('s3://my_bucket/my_training_data/')

    # Deploy my estimator to a SageMaker Endpoint and get a Predictor
    predictor = sklearn_estimator.deploy(instance_type='ml.m4.xlarge',
                                         initial_instance_count=1)

    # `data` is a NumPy array or a Python list.
    # `response` is a NumPy array.
    response = predictor.predict(data)

You use the SageMaker Scikit-learn model server to host your Scikit-learn model when you call ``deploy``
on an ``SKLearn`` Estimator. The model server runs inside a SageMaker Endpoint, which your call to ``deploy`` creates.
You can access the name of the Endpoint through the ``name`` property on the returned ``Predictor``.


SageMaker Scikit-learn Model Server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Scikit-learn Endpoint you create with ``deploy`` runs a SageMaker Scikit-learn model server.
The model server loads the model that was saved by your training script and performs inference on the model in response
to SageMaker InvokeEndpoint API calls.

You can configure two components of the SageMaker Scikit-learn model server: Model loading and model serving.
Model loading is the process of deserializing your saved model back into a Scikit-learn model.
Serving is the process of translating InvokeEndpoint requests to inference calls on the loaded model.

You configure the Scikit-learn model server by defining functions in the Python source file you passed to the
Scikit-learn constructor.

Model loading
^^^^^^^^^^^^^

Before a model can be served, it must be loaded. The SageMaker Scikit-learn model server loads your model by invoking a
``model_fn`` function that you must provide in your script. The ``model_fn`` should have the following signature:

.. code:: python

    def model_fn(model_dir)

SageMaker will inject the directory where your model files and sub-directories, saved by your training script, have been mounted.
Your model function should return a model object that can be used for model serving.

The following code snippet shows an example ``model_fn`` implementation.
It loads and returns a Scikit-learn classifier from a ``model.joblib`` file in the SageMaker model directory
``model_dir``.

.. code:: python

    from sklearn.externals import joblib
    import os

    def model_fn(model_dir):
        clf = joblib.load(os.path.join(model_dir, "model.joblib"))
        return clf

Model serving
^^^^^^^^^^^^^

After the SageMaker model server has loaded your model by calling ``model_fn``, SageMaker will serve your model.
Model serving is the process of responding to inference requests received by SageMaker InvokeEndpoint API calls.
-The SageMaker Scikit-learn model server breaks request handling into three steps: - - -- input processing, -- prediction, and -- output processing. - -In a similar way to model loading, you configure these steps by defining functions in your Python source file. - -Each step involves invoking a python function, with information about the request and the return-value from the previous -function in the chain. Inside the SageMaker Scikit-learn model server, the process looks like: - -.. code:: python - - # Deserialize the Invoke request body into an object we can perform prediction on - input_object = input_fn(request_body, request_content_type) - - # Perform prediction on the deserialized object, with the loaded model - prediction = predict_fn(input_object, model) - - # Serialize the prediction result into the desired response content type - output = output_fn(prediction, response_content_type) - -The above code-sample shows the three function definitions: - -- ``input_fn``: Takes request data and deserializes the data into an - object for prediction. -- ``predict_fn``: Takes the deserialized request object and performs - inference against the loaded model. -- ``output_fn``: Takes the result of prediction and serializes this - according to the response content type. - -The SageMaker Scikit-learn model server provides default implementations of these functions. -You can provide your own implementations for these functions in your hosting script. -If you omit any definition then the SageMaker Scikit-learn model server will use its default implementation for that -function. - -The ``RealTimePredictor`` used by Scikit-learn in the SageMaker Python SDK serializes NumPy arrays to the `NPY `_ format -by default, with Content-Type ``application/x-npy``. The SageMaker Scikit-learn model server can deserialize NPY-formatted -data (along with JSON and CSV data). - -If you rely solely on the SageMaker Scikit-learn model server defaults, you get the following functionality: - -- Prediction on models that implement the ``__call__`` method -- Serialization and deserialization of NumPy arrays. - -The default ``input_fn`` and ``output_fn`` are meant to make it easy to predict on NumPy arrays. If your model expects -a NumPy array and returns a NumPy array, then these functions do not have to be overridden when sending NPY-formatted -data. - -In the following sections we describe the default implementations of input_fn, predict_fn, and output_fn. -We describe the input arguments and expected return types of each, so you can define your own implementations. - -Input processing -'''''''''''''''' - -When an InvokeEndpoint operation is made against an Endpoint running a SageMaker Scikit-learn model server, -the model server receives two pieces of information: - -- The request Content-Type, for example "application/x-npy" -- The request data body, a byte array - -The SageMaker Scikit-learn model server will invoke an "input_fn" function in your hosting script, -passing in this information. If you define an ``input_fn`` function definition, -it should return an object that can be passed to ``predict_fn`` and have the following signature: - -.. code:: python - - def input_fn(request_body, request_content_type) - -Where ``request_body`` is a byte buffer and ``request_content_type`` is a Python string - -The SageMaker Scikit-learn model server provides a default implementation of ``input_fn``. -This function deserializes JSON, CSV, or NPY encoded data into a NumPy array. 
Default NPY deserialization requires ``request_body`` to follow the `NPY `_ format. For Scikit-learn, the Python SDK
defaults to sending prediction requests with this format.

Default json deserialization requires ``request_body`` contain a single json list.
Sending multiple json objects within the same ``request_body`` is not supported.
The list must have a dimensionality compatible with the model loaded in ``model_fn``.
The list's shape must be identical to the model's input shape, for all dimensions after the first (the first
dimension is the batch size).

Default csv deserialization requires ``request_body`` contain one or more lines of CSV numerical data.
The data is loaded into a two-dimensional array, where each line break defines the boundaries of the first dimension.

The example below shows a custom ``input_fn`` for preparing pickled NumPy arrays.

.. code:: python

    import numpy as np
    from io import BytesIO

    def input_fn(request_body, request_content_type):
        """An input_fn that loads a pickled numpy array"""
        if request_content_type == "application/python-pickle":
            # request_body is a byte buffer, so wrap it in a binary stream for np.load
            array = np.load(BytesIO(request_body))
            return array
        else:
            # Handle other content-types here or raise an Exception
            # if the content type is not supported.
            pass



Prediction
''''''''''

After the inference request has been deserialized by ``input_fn``, the SageMaker Scikit-learn model server invokes
``predict_fn`` on the return value of ``input_fn``.

As with ``input_fn``, you can define your own ``predict_fn`` or use the SageMaker Scikit-learn model server default.

The ``predict_fn`` function has the following signature:

.. code:: python

    def predict_fn(input_object, model)

Where ``input_object`` is the object returned from ``input_fn`` and
``model`` is the model loaded by ``model_fn``.

The default implementation of ``predict_fn`` invokes the loaded model's ``__call__`` function on ``input_object``,
and returns the resulting value. The return-type should be a NumPy array to be compatible with the default
``output_fn``.

The example below shows an overridden ``predict_fn`` for a Logistic Regression classifier. It returns a NumPy array
containing the predictions and prediction probabilities from the model.
This ``predict_fn`` can rely on the default ``input_fn`` and ``output_fn`` because ``input_data`` is a NumPy array,
and the return value of this function is a NumPy array.

.. code:: python

    import sklearn
    import numpy as np

    def predict_fn(input_data, model):
        prediction = model.predict(input_data)
        pred_prob = model.predict_proba(input_data)
        return np.array([prediction, pred_prob])

If you implement your own prediction function, you should take care to ensure that:

- The first argument is expected to be the return value from input_fn.
  If you use the default input_fn, this will be a NumPy array.
- The second argument is the loaded model.
- The return value should be of the correct type to be passed as the
  first argument to ``output_fn``. If you use the default
  ``output_fn``, this should be a NumPy array.

Output processing
'''''''''''''''''

After invoking ``predict_fn``, the model server invokes ``output_fn``, passing in the return-value from ``predict_fn``
and the InvokeEndpoint requested response content-type.

The ``output_fn`` has the following signature:
code:: python
-
-    def output_fn(prediction, content_type)
-
-Where ``prediction`` is the result of invoking ``predict_fn`` and
-``content_type`` is the InvokeEndpoint requested response content type.
-The function should return a byte array of data serialized to ``content_type``.
-
-The default implementation expects ``prediction`` to be a NumPy array and can serialize the result to JSON, CSV, or NPY.
-It accepts response content types of "application/json", "text/csv", and "application/x-npy".
-
-Working with existing model data and training jobs
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Attaching to existing training jobs
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-You can attach a Scikit-learn Estimator to an existing training job using the
-``attach`` method.
-
-.. code:: python
-
-    my_training_job_name = "MyAwesomeSKLearnTrainingJob"
-    sklearn_estimator = SKLearn.attach(my_training_job_name)
-
-After attaching, if the training job is in a Complete status, it can be
-``deploy``\ ed to create a SageMaker Endpoint and return a
-``Predictor``. If the training job is in progress,
-``attach`` will block and display log messages from the training job, until the training job completes.
-
-The ``attach`` method accepts the following arguments:
-
-- ``training_job_name (str):`` The name of the training job to attach
-  to.
-- ``sagemaker_session (sagemaker.Session or None):`` The Session used
-  to interact with SageMaker.
-
-Deploying Endpoints from model data
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-As well as attaching to existing training jobs, you can deploy models directly from model data in S3.
-The following code sample shows how to do this, using the ``SKLearnModel`` class.
-
-.. code:: python
-
-    sklearn_model = SKLearnModel(model_data="s3://bucket/model.tar.gz", role="SageMakerRole",
-                                 entry_point="transform_script.py")
-
-    predictor = sklearn_model.deploy(instance_type="ml.c4.xlarge", initial_instance_count=1)
-
-The ``SKLearnModel`` constructor takes the following arguments:
-
-- ``model_data (str):`` An S3 location of a SageMaker model data
-  .tar.gz file
-- ``image (str):`` A Docker image URI
-- ``role (str):`` An IAM role name or ARN for SageMaker to access AWS
-  resources on your behalf.
-- ``predictor_cls (callable[string,sagemaker.Session]):`` A function to
-  call to create a predictor. If not None, ``deploy`` will return the
-  result of invoking this function on the created endpoint name.
-- ``env (dict[string,string]):`` Environment variables to run with
-  ``image`` when hosted in SageMaker.
-- ``name (str):`` The model name. If None, a default model name will be
-  selected on each ``deploy``.
-- ``entry_point (str):`` Path (absolute or relative) to the Python file
-  which should be executed as the entry point to model hosting.
-- ``source_dir (str):`` Optional. Path (absolute or relative) to a
-  directory with any other training source code dependencies including
-  the entry point file. Structure within this directory will be
-  preserved when training on SageMaker.
-- ``enable_cloudwatch_metrics (boolean):`` Optional. If true, training
-  and hosting containers will generate CloudWatch metrics under the
-  AWS/SageMakerContainer namespace.
-- ``container_log_level (int):`` Log level to use within the container.
-  Valid values are defined in the Python logging module.
-- ``code_location (str):`` Optional. Name of the S3 bucket where your
-  custom code will be uploaded to. If not specified, will use the
-  SageMaker default bucket created by sagemaker.Session.
-- ``sagemaker_session (sagemaker.Session):`` The SageMaker Session
-  object, used for SageMaker interaction.
-
-Your model data must be a .tar.gz file in S3. SageMaker Training Job model data is saved to .tar.gz files in S3.
-However, if you have local data you want to deploy, you can prepare the data yourself.
-
-Assuming you have a local directory containing your model data named "my_model", you can tar and gzip compress the
-directory and upload it to S3 using the following commands:
-
-::
-
-    tar -czf model.tar.gz my_model
-    aws s3 cp model.tar.gz s3://my-bucket/my-path/model.tar.gz
-
-This compresses the contents of my_model into a gzipped tar file and uploads it to S3 in the bucket "my-bucket", with the key
-"my-path/model.tar.gz".
-
-To run this command, you'll need the aws cli tool installed. Please refer to our `FAQ <#FAQ>`__ for more information on
-installing this.
+For information about using Scikit-learn with the SageMaker Python SDK, see https://sagemaker.readthedocs.io/en/stable/using_sklearn.html.
 
 Scikit-learn Training Examples
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/src/sagemaker/tensorflow/README.rst b/src/sagemaker/tensorflow/README.rst
index cc930de463..bfcde6cb84 100644
--- a/src/sagemaker/tensorflow/README.rst
+++ b/src/sagemaker/tensorflow/README.rst
@@ -22,433 +22,7 @@ Documentation of the previous Legacy Mode versions: `1.4.1 `__.
-
-A typical training script loads data from the input channels, configures training with hyperparameters, trains a model, and saves the model to ``SM_MODEL_DIR`` so that it can be deployed for inference later.
-Hyperparameters are passed to your script as arguments and can be retrieved with an ``argparse.ArgumentParser`` instance.
-For example, a training script might start with the following:
-
-.. code:: python
-
-    import argparse
-    import os
-
-    if __name__ == '__main__':
-
-        parser = argparse.ArgumentParser()
-
-        # hyperparameters sent by the client are passed as command-line arguments to the script.
-        parser.add_argument('--epochs', type=int, default=10)
-        parser.add_argument('--batch_size', type=int, default=100)
-        parser.add_argument('--learning_rate', type=float, default=0.1)
-
-        # input data and model directories
-        parser.add_argument('--model_dir', type=str)
-        parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))
-        parser.add_argument('--test', type=str, default=os.environ.get('SM_CHANNEL_TEST'))
-
-        args, _ = parser.parse_known_args()
-
-        # ... load from args.train and args.test, train a model, write model to args.model_dir.
-
-Because SageMaker imports your training script, putting your training launch code in a main guard (``if __name__=='__main__':``)
-is good practice.
-
-Note that SageMaker doesn't support argparse actions.
-If you want to use, for example, boolean hyperparameters, you need to specify ``type`` as ``bool`` in your script and provide an explicit ``True`` or ``False`` value for this hyperparameter when instantiating your TensorFlow estimator.
-
-Adapting your local TensorFlow script
-'''''''''''''''''''''''''''''''''''''
-
-If you have a TensorFlow training script that runs outside of SageMaker, follow these directions:
-
-1. Make sure your script can handle ``--model_dir`` as an additional command line argument. If you did not specify a
-location when constructing the TensorFlow estimator, an S3 location under the default training job bucket will be passed
-in here. 
Distributed training with parameter servers requires you to use the ``tf.estimator.train_and_evaluate`` API and
-an S3 location is needed as the model directory during training. Here is an example:
-
-.. code:: python
-
-    estimator = tf.estimator.Estimator(model_fn=my_model_fn, model_dir=args.model_dir)
-    ...
-    train_spec = tf.estimator.TrainSpec(train_input_fn, max_steps=1000)
-    eval_spec = tf.estimator.EvalSpec(eval_input_fn)
-    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
-
-2. Load input data from the input channels. The input channels are defined when ``fit`` is called. For example:
-
-.. code:: python
-
-    estimator.fit({'train':'s3://my-bucket/my-training-data',
-                   'eval':'s3://my-bucket/my-evaluation-data'})
-
-In your training script the channels will be stored in environment variables ``SM_CHANNEL_TRAIN`` and
-``SM_CHANNEL_EVAL``. You can add them to your argument parsing logic like this:
-
-.. code:: python
-
-    parser = argparse.ArgumentParser()
-    parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))
-    parser.add_argument('--eval', type=str, default=os.environ.get('SM_CHANNEL_EVAL'))
-
-3. Export your final model to the path stored in the environment variable ``SM_MODEL_DIR``, which should always be
-   ``/opt/ml/model``. At the end of training, SageMaker will upload the model file under ``/opt/ml/model`` to
-   ``output_path``.
-
-
-Training with TensorFlow estimator
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Calling fit
-'''''''''''
-
-To use Script Mode, set at least one of these args:
-
-- ``py_version='py3'``
-- ``script_mode=True``
-
-Please note that when using Script Mode, your training script needs to accept the following arg:
-
-- ``model_dir``
-
-Please note that the following args are not permitted when using Script Mode:
-
-- ``checkpoint_path``
-- ``training_steps``
-- ``evaluation_steps``
-- ``requirements_file``
-
-.. code:: python
-
-    from sagemaker.tensorflow import TensorFlow
-
-    tf_estimator = TensorFlow(entry_point='tf-train.py', role='SageMakerRole',
-                              train_instance_count=1, train_instance_type='ml.p2.xlarge',
-                              framework_version='1.12', py_version='py3')
-    tf_estimator.fit('s3://bucket/path/to/training/data')
-
-The S3 URL is the path to your training data within Amazon S3. The
-constructor keyword arguments define how SageMaker runs your training
-script, as discussed earlier.
-
-You start your training script by calling ``fit`` on a ``TensorFlow`` estimator. ``fit`` takes
-both required and optional arguments.
-
-Required argument
-"""""""""""""""""
-
-- ``inputs``: The S3 location(s) of datasets to be used for training. This can take one of the following forms:
-
-  - ``str``: An S3 URI, for example ``s3://my-bucket/my-training-data``, which indicates the dataset's location.
-  - ``dict[str, str]``: A dictionary mapping channel names to S3 locations, for example ``{'train': 's3://my-bucket/my-training-data/train', 'test': 's3://my-bucket/my-training-data/test'}``.
-  - ``sagemaker.session.s3_input``: channel configuration for S3 data sources that can provide additional information as well as the path to the training dataset. See `the API docs `_ for full details.
-
-Optional arguments
-""""""""""""""""""
-
-- ``wait (bool)``: Defaults to True, whether to block and wait for the
-  training script to complete before returning.
-  If set to False, it will return immediately, and can later be attached to.
-- ``logs (bool)``: Defaults to True, whether to show logs produced by the training
-  job in the Python session. 
Only meaningful when ``wait`` is True.
-- ``run_tensorboard_locally (bool)``: Defaults to False. If set to True, a TensorBoard command will be printed out.
-- ``job_name (str)``: Training job name. If not specified, the estimator generates a default job name,
-  based on the training image name and current timestamp.
-
-What happens when fit is called
-"""""""""""""""""""""""""""""""
-
-Calling ``fit`` starts a SageMaker training job. The training job does the following:
-
-- Starts ``train_instance_count`` EC2 instances of the type ``train_instance_type``.
-- On each instance, it performs the following steps:
-
-  - starts a Docker container optimized for TensorFlow.
-  - downloads the dataset.
-  - sets up training-related environment variables.
-  - sets up the distributed training environment if configured to use a parameter server.
-  - starts asynchronous training.
-
-If the ``wait=False`` flag is passed to ``fit``, then it will return immediately. The training job will continue running
-asynchronously. At a later time, a TensorFlow estimator can be obtained by attaching to the existing training job. If
-the training job is not finished, attaching will display the standard output of training and wait until it completes.
-After attaching, the estimator can be deployed as usual.
-
-.. code:: python
-
-    tf_estimator.fit(your_input_data, wait=False)
-    training_job_name = tf_estimator.latest_training_job.name
-
-    # after some time, or in a separate Python notebook, we can attach to it again.
-
-    tf_estimator = TensorFlow.attach(training_job_name=training_job_name)
-
-Distributed Training
-''''''''''''''''''''
-
-To run your training job with multiple instances in a distributed fashion, set ``train_instance_count``
-to a number larger than 1. We support two different types of distributed training: parameter server and Horovod.
-The ``distributions`` parameter is used to configure which distributed training strategy to use.
-
-Training with parameter servers
-"""""""""""""""""""""""""""""""
-
-If you specify ``parameter_server`` as the value of the ``distributions`` parameter, the container launches a parameter server
-thread on each instance in the training cluster, and then executes your training code. You can find more information on
-TensorFlow distributed training at `TensorFlow docs `__.
-To enable parameter server training:
-
-.. code:: python
-
-    from sagemaker.tensorflow import TensorFlow
-
-    tf_estimator = TensorFlow(entry_point='tf-train.py', role='SageMakerRole',
-                              train_instance_count=2, train_instance_type='ml.p2.xlarge',
-                              framework_version='1.11', py_version='py3',
-                              distributions={'parameter_server': {'enabled': True}})
-    tf_estimator.fit('s3://bucket/path/to/training/data')
-
-Training with Horovod
-"""""""""""""""""""""
-
-Horovod is a distributed training framework based on MPI. Horovod is only available with TensorFlow version ``1.12`` or newer.
-You can find more details at `Horovod README `__.
-
-The container sets up the MPI environment and executes the ``mpirun`` command, enabling you to run any Horovod
-training script with Script Mode.
-
-Training with ``MPI`` is configured by specifying the following fields in ``distributions``:
-
-- ``enabled (bool)``: If set to ``True``, the MPI setup is performed and the ``mpirun`` command is executed.
-- ``processes_per_host (int)``: Number of processes MPI should launch on each host. Note, this should not be
-  greater than the available slots on the selected instance type. This flag should be set for multi-CPU/GPU
-  training.
-- ``custom_mpi_options (str)``: Any ``mpirun`` flags passed in this field are added to the ``mpirun``
-  command executed by SageMaker to launch distributed Horovod training.
-
-
-In the example below, we create an estimator to launch Horovod distributed training with 2 processes on one host:
-
-.. code:: python
-
-    from sagemaker.tensorflow import TensorFlow
-
-    tf_estimator = TensorFlow(entry_point='tf-train.py', role='SageMakerRole',
-                              train_instance_count=1, train_instance_type='ml.p2.xlarge',
-                              framework_version='1.12', py_version='py3',
-                              distributions={
-                                  'mpi': {
-                                      'enabled': True,
-                                      'processes_per_host': 2,
-                                      'custom_mpi_options': '--NCCL_DEBUG INFO'
-                                  }
-                              })
-    tf_estimator.fit('s3://bucket/path/to/training/data')
-
-sagemaker.tensorflow.TensorFlow class
-'''''''''''''''''''''''''''''''''''''
-
-The ``TensorFlow`` constructor takes both required and optional arguments.
-
-Required:
-
-- ``entry_point (str)`` Path (absolute or relative) to the Python file which
-  should be executed as the entry point to training.
-- ``role (str)`` An AWS IAM role (either name or full ARN). The Amazon
-  SageMaker training jobs and APIs that create Amazon SageMaker
-  endpoints use this role to access training data and model artifacts.
-  After the endpoint is created, the inference code might use the IAM
-  role, if it accesses AWS resources.
-- ``train_instance_count (int)`` Number of Amazon EC2 instances to use for
-  training.
-- ``train_instance_type (str)`` Type of EC2 instance to use for training, for
-  example, 'ml.c4.xlarge'.
-
-Optional:
-
-- ``source_dir (str)`` Path (absolute or relative) to a directory with any
-  other training source code dependencies including the entry point
-  file. Structure within this directory will be preserved when training
-  on SageMaker.
-- ``dependencies (list[str])`` A list of paths to directories (absolute or relative) with
-  any additional libraries that will be exported to the container (default: ``[]``).
-  The library folders will be copied to SageMaker in the same folder where the entrypoint is copied.
-  If the ``source_dir`` points to S3, code will be uploaded and the S3 location will be used
-  instead. Example:
-
-  The following call
-
-  >>> TensorFlow(entry_point='train.py', dependencies=['my/libs/common', 'virtual-env'])
-
-  results in the following inside the container:
-
-  >>> opt/ml/code
-  >>> ├── train.py
-  >>> ├── common
-  >>> └── virtual-env
-
-- ``hyperparameters (dict[str, ANY])`` Hyperparameters that will be used for training.
-  Will be made accessible as command line arguments.
-- ``train_volume_size (int)`` Size in GB of the EBS volume to use for storing
-  input data during training. Must be large enough to store the training
-  data.
-- ``train_max_run (int)`` Timeout in seconds for training, after which Amazon
-  SageMaker terminates the job regardless of its current status.
-- ``output_path (str)`` S3 location where you want the training result (model
-  artifacts and optional output files) saved. If not specified, results
-  are stored to a default bucket. If the bucket with the specific name
-  does not exist, the estimator creates the bucket during the ``fit``
-  method execution.
-- ``output_kms_key`` Optional KMS key ID used to encrypt training
-  output.
-- ``base_job_name`` Name to assign to the training job that the ``fit``
-  method launches. If not specified, the estimator generates a default
-  job name, based on the training image name and current timestamp.
-- ``image_name`` An alternative Docker image to use for training and
-  serving. If specified, the estimator will use this image for training and
-  hosting, instead of selecting the appropriate SageMaker official image based on
-  ``framework_version`` and ``py_version``. Refer to `SageMaker TensorFlow Docker Containers
-  <#sagemaker-tensorflow-docker-containers>`_ for details on what the official images support
-  and where to find the source code to build your custom image.
-- ``script_mode (bool)`` Whether to use Script Mode. Script Mode is the only available training mode in Python 3;
-  setting ``py_version`` to ``py3`` automatically sets ``script_mode`` to True.
-- ``model_dir (str)`` Location where model data, checkpoint data, and TensorBoard checkpoints should be saved during training.
-  If not specified, an S3 location will be generated under the training job's default bucket, and ``model_dir`` will be
-  passed to your training script as one of the command line arguments.
-- ``distributions (dict)`` Configure your distribution strategy with this argument.
-
-Training with Pipe Mode using PipeModeDataset
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Amazon SageMaker allows users to create training jobs using Pipe input mode.
-With Pipe input mode, your dataset is streamed directly to your training instances instead of being downloaded first.
-This means that your training jobs start sooner, finish quicker, and need less disk space.
-
-SageMaker TensorFlow provides an implementation of ``tf.data.Dataset`` that makes it easy to take advantage of Pipe
-input mode in SageMaker. You can replace your ``tf.data.Dataset`` with a ``sagemaker_tensorflow.PipeModeDataset`` to
-read TFRecords as they are streamed to your training instances.
-
-In your ``entry_point`` script, you can use ``PipeModeDataset`` like a ``Dataset``. In this example, we create a
-``PipeModeDataset`` to read TFRecords from the 'training' channel:
-
-
-.. code:: python
-
-    import tensorflow as tf
-
-    from sagemaker_tensorflow import PipeModeDataset
-
-    features = {
-        'data': tf.FixedLenFeature([], tf.string),
-        'labels': tf.FixedLenFeature([], tf.int64),
-    }
-
-    def parse(record):
-        parsed = tf.parse_single_example(record, features)
-        return ({
-            'data': tf.decode_raw(parsed['data'], tf.float64)
-        }, parsed['labels'])
-
-    def train_input_fn(training_dir, hyperparameters):
-        ds = PipeModeDataset(channel='training', record_format='TFRecord')
-        ds = ds.repeat(20)
-        ds = ds.prefetch(10)
-        ds = ds.map(parse, num_parallel_calls=10)
-        ds = ds.batch(64)
-        return ds
-
-
-To run a training job with Pipe input mode, pass in ``input_mode='Pipe'`` to your TensorFlow Estimator:
-
-
-.. code:: python
-
-    from sagemaker.tensorflow import TensorFlow
-
-    tf_estimator = TensorFlow(entry_point='tf-train-with-pipemodedataset.py', role='SageMakerRole',
-                              training_steps=10000, evaluation_steps=100,
-                              train_instance_count=1, train_instance_type='ml.p2.xlarge',
-                              framework_version='1.10.0', input_mode='Pipe')
-
-    tf_estimator.fit('s3://bucket/path/to/training/data')
-
-
-If your TFRecords are compressed, you can train on Gzipped TF Records by passing in ``compression='Gzip'`` to the call to
-``fit()``, and SageMaker will automatically unzip the records as data is streamed to your training instances:
-
-.. 
code:: python
-
-    from sagemaker.session import s3_input
-
-    train_s3_input = s3_input('s3://bucket/path/to/training/data', compression='Gzip')
-    tf_estimator.fit(train_s3_input)
-
-
-You can learn more about ``PipeModeDataset`` in the sagemaker-tensorflow-extensions repository: https://github.com/aws/sagemaker-tensorflow-extensions
-
-
-Training with MKL-DNN disabled
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-SageMaker TensorFlow CPU images use TensorFlow built with Intel® MKL-DNN optimization.
-
-In certain cases, you might be able to get better performance by disabling this optimization
-(`for example when using small models `_).
-
-You can disable MKL-DNN optimization for TensorFlow ``1.8.0`` and above by setting the following two environment variables:
-
-.. code:: python
-
-    import os
-
-    os.environ['TF_DISABLE_MKL'] = '1'
-    os.environ['TF_DISABLE_POOL_ALLOCATOR'] = '1'
-
-
-Deploying TensorFlow Serving models
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-After a TensorFlow estimator has been fit, it saves a TensorFlow SavedModel in
-the S3 location defined by ``output_path``. You can call ``deploy`` on a TensorFlow
-estimator to create a SageMaker Endpoint.
-
-SageMaker provides two different options for deploying TensorFlow models to a SageMaker
-Endpoint:
-
-- The first option uses a Python-based server that allows you to specify your own custom
-  input and output handling functions in a Python script. This is the default option.
-
-  See `Deploying to Python-based Endpoints `_ to learn how to use this option.
-
-
-- The second option uses a TensorFlow Serving-based server to provide a superset of the
-  `TensorFlow Serving REST API `_. This option
-  does not require (or allow) a custom Python script.
-
-  See `Deploying to TensorFlow Serving Endpoints `_ to learn how to use this option.
-
+For information about using TensorFlow with the SageMaker Python SDK, see https://sagemaker.readthedocs.io/en/stable/using_tf.html.
 
 SageMaker TensorFlow Docker containers
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From deecd668954c9f6219b492d9f1da05844b8e7adf Mon Sep 17 00:00:00 2001
From: Slesar
Date: Mon, 11 Mar 2019 16:56:22 -0700
Subject: [PATCH 2/6] fix additional merge conflict

---
 README.rst | 453 -----------------------------------------------------
 1 file changed, 453 deletions(-)

diff --git a/README.rst b/README.rst
index 3d79dc86c0..36ee8969d6 100644
--- a/README.rst
+++ b/README.rst
@@ -293,456 +293,3 @@ For more information, see `AWS SageMaker Estimators and Models`_.
 
 .. _AWS SageMaker Estimators and Models: src/sagemaker/amazon/README.rst
 
-<<<<<<< HEAD
-=======
-Using SageMaker AlgorithmEstimators
------------------------------------
-
-With the SageMaker Algorithm entities, you can create training jobs with just an ``algorithm_arn`` instead of
-a training image. There is a dedicated ``AlgorithmEstimator`` class that accepts ``algorithm_arn`` as a
-parameter; the rest of the arguments are similar to those of the other Estimator classes. This class also allows you to
-consume algorithms that you have subscribed to in the AWS Marketplace. The ``AlgorithmEstimator`` performs
-client-side validation on your inputs based on the algorithm's properties.
-
-Here is an example:
-
-.. 
code:: python
-
-    import sagemaker
-
-    algo = sagemaker.AlgorithmEstimator(
-        algorithm_arn='arn:aws:sagemaker:us-west-2:1234567:algorithm/some-algorithm',
-        role='SageMakerRole',
-        train_instance_count=1,
-        train_instance_type='ml.c4.xlarge')
-
-    train_input = algo.sagemaker_session.upload_data(path='/path/to/your/data')
-
-    algo.fit({'training': train_input})
-    algo.deploy(1, 'ml.m4.xlarge')
-
-    # When you are done using your endpoint
-    algo.delete_endpoint()
-
-
-Consuming SageMaker Model Packages
-----------------------------------
-
-SageMaker Model Packages are a way to specify and share information about how to create SageMaker Models.
-With a SageMaker Model Package that you have created or subscribed to in the AWS Marketplace,
-you can use the specified serving image and model data for Endpoints and Batch Transform jobs.
-
-To work with a SageMaker Model Package, use the ``ModelPackage`` class.
-
-Here is an example:
-
-.. code:: python
-
-    import sagemaker
-
-    model = sagemaker.ModelPackage(
-        role='SageMakerRole',
-        model_package_arn='arn:aws:sagemaker:us-west-2:123456:model-package/my-model-package')
-    model.deploy(1, 'ml.m4.xlarge', endpoint_name='my-endpoint')
-
-    # When you are done using your endpoint
-    model.sagemaker_session.delete_endpoint('my-endpoint')
-
-
-BYO Docker Containers with SageMaker Estimators
------------------------------------------------
-
-To use a Docker image that you created for training with the SageMaker SDK, the easiest way is to use the dedicated ``Estimator`` class.
-You can create an instance of the ``Estimator`` class with the desired Docker image and use it as described in previous sections.
-
-Please refer to the full example in the examples repo:
-
-::
-
-    git clone https://github.com/awslabs/amazon-sagemaker-examples.git
-
-
-The example notebook is located here:
-``advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb``
-
-
-SageMaker Automatic Model Tuning
---------------------------------
-
-All of the estimators can be used with SageMaker Automatic Model Tuning, which performs hyperparameter tuning jobs.
-A hyperparameter tuning job finds the best version of a model by running many training jobs on your dataset using the algorithm with different hyperparameter values within ranges
-that you specify. It then chooses the hyperparameter values that result in a model that performs the best, as measured by a metric that you choose.
-If you're not using an Amazon SageMaker built-in algorithm, then the metric is defined by a regular expression (regex) you provide.
-The hyperparameter tuning job parses the training job's logs to find metrics that match the regex you defined.
-For more information about SageMaker Automatic Model Tuning, see `AWS documentation `__.
-
-The SageMaker Python SDK contains a ``HyperparameterTuner`` class for creating and interacting with hyperparameter training jobs.
-Here is a basic example of how to use it:
-
-.. 
code:: python
-
-    from sagemaker.tuner import HyperparameterTuner, ContinuousParameter
-
-    # Configure HyperparameterTuner
-    my_tuner = HyperparameterTuner(estimator=my_estimator,  # previously-configured Estimator object
-                                   objective_metric_name='validation-accuracy',
-                                   hyperparameter_ranges={'learning-rate': ContinuousParameter(0.05, 0.06)},
-                                   metric_definitions=[{'Name': 'validation-accuracy', 'Regex': 'validation-accuracy=(\d\.\d+)'}],
-                                   max_jobs=100,
-                                   max_parallel_jobs=10)
-
-    # Start hyperparameter tuning job
-    my_tuner.fit({'train': 's3://my_bucket/my_training_data', 'test': 's3://my_bucket/my_testing_data'})
-
-    # Deploy best model
-    my_predictor = my_tuner.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')
-
-    # Make a prediction against the SageMaker endpoint
-    response = my_predictor.predict(my_prediction_data)
-
-    # Tear down the SageMaker endpoint
-    my_tuner.delete_endpoint()
-
-This example shows a hyperparameter tuning job that creates up to 100 training jobs, running up to 10 training jobs at a time.
-Each training job's learning rate is a value between 0.05 and 0.06, but this value will differ between training jobs.
-You can read more about how these values are chosen in the `AWS documentation `__.
-
-A hyperparameter range can be one of three types: continuous, integer, or categorical.
-The SageMaker Python SDK provides corresponding classes for defining these different types.
-You can define up to 20 hyperparameters to search over, but each value of a categorical hyperparameter range counts against that limit.
-
-By default, training job early stopping is turned off. To enable early stopping for the tuning job, you need to set the ``early_stopping_type`` parameter to ``Auto``:
-
-.. code:: python
-
-    # Enable early stopping
-    my_tuner = HyperparameterTuner(estimator=my_estimator,  # previously-configured Estimator object
-                                   objective_metric_name='validation-accuracy',
-                                   hyperparameter_ranges={'learning-rate': ContinuousParameter(0.05, 0.06)},
-                                   metric_definitions=[{'Name': 'validation-accuracy', 'Regex': 'validation-accuracy=(\d\.\d+)'}],
-                                   max_jobs=100,
-                                   max_parallel_jobs=10,
-                                   early_stopping_type='Auto')
-
-When early stopping is turned on, Amazon SageMaker will automatically stop a training job if it appears unlikely to produce a model of better quality than other jobs.
-If not using built-in Amazon SageMaker algorithms, note that, for early stopping to be effective, the objective metric should be emitted at epoch level.
-
-If you are using an Amazon SageMaker built-in algorithm, you don't need to pass in anything for ``metric_definitions``.
-In addition, the ``fit()`` call uses a list of ``RecordSet`` objects instead of a dictionary:
-
-.. code:: python
-
-    # Create RecordSet object for each data channel
-    train_records = RecordSet(...)
-    test_records = RecordSet(...)
-
-    # Start hyperparameter tuning job
-    my_tuner.fit([train_records, test_records])
-
-To help attach a previously-started hyperparameter tuning job to a ``HyperparameterTuner`` instance,
-``fit()`` adds the module path of the class used to create the hyperparameter tuner to the list of static hyperparameters by default.
-If you are using your own custom estimator class (i.e., not one provided in this SDK) and want that class to be used when attaching a hyperparameter tuning job,
-set ``include_cls_metadata`` to ``True`` when you call ``fit`` to add the module path as static hyperparameters.
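-
-For example, attaching to an existing tuning job and deploying its best model might look like the
-following sketch (the job name is a placeholder):
-
-.. code:: python
-
-    from sagemaker.tuner import HyperparameterTuner
-
-    # Attach to a hyperparameter tuning job that was started earlier
-    my_tuner = HyperparameterTuner.attach('my-tuning-job-name')
-
-    # Deploy the best model found by the tuning job
-    my_predictor = my_tuner.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')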
-
-There is also an analytics object associated with each ``HyperparameterTuner`` instance that contains useful information about the hyperparameter tuning job.
-For example, the ``dataframe`` method gets a pandas dataframe summarizing the associated training jobs:
-
-.. code:: python
-
-    # Retrieve analytics object
-    my_tuner_analytics = my_tuner.analytics()
-
-    # Look at summary of associated training jobs
-    my_dataframe = my_tuner_analytics.dataframe()
-
-For more detailed examples of running hyperparameter tuning jobs, see:
-
-- `Using the TensorFlow estimator with hyperparameter tuning `__
-- `Bringing your own estimator for hyperparameter tuning `__
-- `Analyzing results `__
-
-For more detailed explanations of the classes that this library provides for automatic model tuning, see:
-
-- `API docs for HyperparameterTuner and parameter range classes `__
-- `API docs for analytics classes `__
-
-
-SageMaker Batch Transform
--------------------------
-
-After you train a model, you can use Amazon SageMaker Batch Transform to perform inferences with the model.
-Batch Transform manages all necessary compute resources, including launching instances and deleting them after the job completes.
-You can read more about SageMaker Batch Transform in the `AWS documentation `__.
-
-If you trained the model using a SageMaker Python SDK estimator,
-you can invoke the estimator's ``transformer()`` method to create a transform job for a model based on the training job:
-
-.. code:: python
-
-    transformer = estimator.transformer(instance_count=1, instance_type='ml.m4.xlarge')
-
-Alternatively, if you already have a SageMaker model, you can create an instance of the ``Transformer`` class by calling its constructor:
-
-.. code:: python
-
-    transformer = Transformer(model_name='my-previously-trained-model',
-                              instance_count=1,
-                              instance_type='ml.m4.xlarge')
-
-For a full list of the possible options to configure by using either of these methods, see the API docs for `Estimator `__ or `Transformer `__.
-
-After you create a ``Transformer`` object, you can invoke ``transform()`` to start a batch transform job with the S3 location of your data.
-You can also specify other attributes of your data, such as the content type.
-
-.. code:: python
-
-    transformer.transform('s3://my-bucket/batch-transform-input')
-
-For more details about what can be specified here, see `API docs `__.
-
-
-Secure Training and Inference with VPC
---------------------------------------
-
-Amazon SageMaker allows you to control network traffic to and from model container instances using Amazon Virtual Private Cloud (VPC).
-You can configure SageMaker to use your own private VPC in order to further protect and monitor traffic.
-
-For more information about Amazon SageMaker VPC features, and guidelines for configuring your VPC,
-see the following documentation:
-
-- `Protect Training Jobs by Using an Amazon Virtual Private Cloud `__
-- `Protect Endpoints by Using an Amazon Virtual Private Cloud `__
-- `Protect Data in Batch Transform Jobs by Using an Amazon Virtual Private Cloud `__
-- `Working with VPCs and Subnets `__
-
-You can also reference or reuse the example VPC created for integration tests: `tests/integ/vpc_test_utils.py `__
-
-To train a model using your own VPC, set the optional parameters ``subnets`` and ``security_group_ids`` on an ``Estimator``:
-
-.. 
code:: python
-
-    from sagemaker.mxnet import MXNet
-
-    # Configure an MXNet Estimator with subnets and security groups from your VPC
-    mxnet_vpc_estimator = MXNet('train.py',
-                                role='SageMakerRole',
-                                train_instance_type='ml.p2.xlarge',
-                                train_instance_count=1,
-                                framework_version='1.2.1',
-                                subnets=['subnet-1', 'subnet-2'],
-                                security_group_ids=['sg-1'])
-
-    # SageMaker Training Job will set VpcConfig and container instances will run in your VPC
-    mxnet_vpc_estimator.fit('s3://my_bucket/my_training_data/')
-
-To train a model with the inter-container traffic encrypted, set the optional parameters ``subnets`` and ``security_group_ids`` and
-the flag ``encrypt_inter_container_traffic`` as ``True`` on an Estimator (Note: This flag can be used only if you specify that the training
-job runs in a VPC):
-
-.. code:: python
-
-    from sagemaker.mxnet import MXNet
-
-    # Configure an MXNet Estimator with subnets and security groups from your VPC
-    mxnet_vpc_estimator = MXNet('train.py',
-                                role='SageMakerRole',
-                                train_instance_type='ml.p2.xlarge',
-                                train_instance_count=1,
-                                framework_version='1.2.1',
-                                subnets=['subnet-1', 'subnet-2'],
-                                security_group_ids=['sg-1'],
-                                encrypt_inter_container_traffic=True)
-
-    # The SageMaker training job sets the VpcConfig, and training container instances run in your VPC with traffic between the containers encrypted
-    mxnet_vpc_estimator.fit('s3://my_bucket/my_training_data/')
-
-When you create a ``Predictor`` from the ``Estimator`` using ``deploy()``, the same VPC configurations will be set on the SageMaker Model:
-
-.. code:: python
-
-    # Creates a SageMaker Model and Endpoint using the same VpcConfig
-    # Endpoint container instances will run in your VPC
-    mxnet_vpc_predictor = mxnet_vpc_estimator.deploy(initial_instance_count=1,
-                                                     instance_type='ml.p2.xlarge')
-
-    # You can also set ``vpc_config_override`` to use a different VpcConfig
-    other_vpc_config = {'Subnets': ['subnet-3', 'subnet-4'],
-                        'SecurityGroupIds': ['sg-2']}
-    mxnet_predictor_other_vpc = mxnet_vpc_estimator.deploy(initial_instance_count=1,
-                                                           instance_type='ml.p2.xlarge',
-                                                           vpc_config_override=other_vpc_config)
-
-    # Setting ``vpc_config_override=None`` will disable VpcConfig
-    mxnet_predictor_no_vpc = mxnet_vpc_estimator.deploy(initial_instance_count=1,
-                                                        instance_type='ml.p2.xlarge',
-                                                        vpc_config_override=None)
-
-Likewise, when you create a ``Transformer`` from the ``Estimator`` using ``transformer()``, the same VPC configurations will be set on the SageMaker Model:
-
-.. code:: python
-
-    # Creates a SageMaker Model using the same VpcConfig
-    mxnet_vpc_transformer = mxnet_vpc_estimator.transformer(instance_count=1,
-                                                            instance_type='ml.p2.xlarge')
-
-    # Transform Job container instances will run in your VPC
-    mxnet_vpc_transformer.transform('s3://my-bucket/batch-transform-input')
-
-
-FAQ
----
-
-I want to train a SageMaker Estimator with local data, how do I do this?
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Upload the data to S3 before training. You can use the AWS Command Line Tool (the aws cli) to achieve this.
-
-If you don't have the aws cli, you can install it using pip:
-
-::
-
-    pip install awscli --upgrade --user
-
-If you don't have pip or want to learn more about installing the aws cli, see the official `Amazon aws cli installation guide `__.
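-
-As an alternative to the aws cli, the SDK's ``Session.upload_data`` method can stage local data for
-you. A minimal sketch, where the local path and key prefix are placeholders:
-
-.. code:: python
-
-    import sagemaker
-
-    # Uploads ./my_training_data to the SageMaker default bucket and returns the S3 URI,
-    # which can then be passed directly to an estimator's fit() call
-    inputs = sagemaker.Session().upload_data(path='my_training_data', key_prefix='my-prefix')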
-
-After you install the AWS cli, you can upload a directory of files to S3 with the following command:
-
-::
-
-    aws s3 cp /tmp/foo/ s3://bucket/path
-
-For more information about using the aws cli for manipulating S3 resources, see `AWS cli command reference `__.
-
-
-How do I make predictions against an existing endpoint?
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Create a ``Predictor`` object and provide it with your endpoint name,
-then call its ``predict()`` method with your input.
-
-You can use the generic ``RealTimePredictor`` class, which by default does not perform any serialization/deserialization transformations on your input,
-but can be configured to do so through constructor arguments:
-http://sagemaker.readthedocs.io/en/stable/predictors.html
-
-Alternatively, you can use the TensorFlow / MXNet-specific predictor classes, which have default serialization/deserialization logic:
-http://sagemaker.readthedocs.io/en/stable/sagemaker.tensorflow.html#tensorflow-predictor
-http://sagemaker.readthedocs.io/en/stable/sagemaker.mxnet.html#mxnet-predictor
-
-Example code using the TensorFlow predictor:
-
-::
-
-    from sagemaker.tensorflow import TensorFlowPredictor
-
-    predictor = TensorFlowPredictor('myexistingendpoint')
-    result = predictor.predict(['my request body'])
-
-
-BYO Model
----------
-You can also create an endpoint from an existing model rather than training one.
-That is, you can bring your own model:
-
-First, package the files for the trained model into a ``.tar.gz`` file, and upload the archive to S3.
-
-Next, create a ``Model`` object that corresponds to the framework that you are using: `MXNetModel `__ or `TensorFlowModel `__.
-
-Example code using ``MXNetModel``:
-
-.. code:: python
-
-    from sagemaker.mxnet.model import MXNetModel
-
-    sagemaker_model = MXNetModel(model_data='s3://path/to/model.tar.gz',
-                                 role='arn:aws:iam::accid:sagemaker-role',
-                                 entry_point='entry_point.py')
-
-After that, invoke the ``deploy()`` method on the ``Model``:
-
-.. code:: python
-
-    predictor = sagemaker_model.deploy(initial_instance_count=1,
-                                       instance_type='ml.m4.xlarge')
-
-This returns a predictor the same way an ``Estimator`` does when ``deploy()`` is called. You can now get inferences just like with any other model deployed on Amazon SageMaker.
-
-A full example is available in the `Amazon SageMaker examples repository `__.
-
-
-Inference Pipelines
--------------------
-You can create a Pipeline for realtime or batch inference comprising one or more model containers. This helps
-you deploy an ML pipeline behind a single endpoint, so that one API call performs pre-processing, model scoring,
-and post-processing on your data before returning the response.
-
-To do this, create a ``PipelineModel``, which takes a list of ``Model`` objects. Calling ``deploy()`` on the
-``PipelineModel`` provides you with an endpoint that can be invoked to perform predictions on a data point against
-the ML Pipeline.
-
-.. 
code:: python
-
-    xgb_image = get_image_uri(sess.boto_region_name, 'xgboost', repo_version="latest")
-    xgb_model = Model(model_data='s3://path/to/model.tar.gz', image=xgb_image)
-    sparkml_model = SparkMLModel(model_data='s3://path/to/model.tar.gz', env={'SAGEMAKER_SPARKML_SCHEMA': schema})
-
-    model_name = 'inference-pipeline-model'
-    endpoint_name = 'inference-pipeline-endpoint'
-    sm_model = PipelineModel(name=model_name, role=sagemaker_role, models=[sparkml_model, xgb_model])
-
-This will define a ``PipelineModel`` consisting of a SparkML model and an XGBoost model stacked sequentially.
-
-For more information about how to train an XGBoost model, please refer to the XGBoost notebook here_.
-
-.. _here: https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html#xgboost-sample-notebooks
-
-.. code:: python
-
-    sm_model.deploy(initial_instance_count=1, instance_type='ml.c5.xlarge', endpoint_name=endpoint_name)
-
-This returns a predictor the same way an ``Estimator`` does when ``deploy()`` is called. Whenever you make an inference
-request using this predictor, you should pass the data that the first container expects and the predictor will return the
-output from the last container.
-
-You can also use a ``PipelineModel`` to create Transform Jobs for batch transformations. Using the same ``PipelineModel`` ``sm_model`` as above:
-
-.. code:: python
-
-    # Only instance_type and instance_count are required.
-    transformer = sm_model.transformer(instance_type='ml.c5.xlarge',
-                                       instance_count=1,
-                                       strategy='MultiRecord',
-                                       max_payload=6,
-                                       max_concurrent_transforms=8,
-                                       accept='text/csv',
-                                       assemble_with='Line',
-                                       output_path='s3://my-output-bucket/path/to/my/output/data/')
-    # Only data is required.
-    transformer.transform(data='s3://my-input-bucket/path/to/my/csv/data',
-                          content_type='text/csv',
-                          split_type='Line')
-    # Waits for the Pipeline Transform Job to finish.
-    transformer.wait()
-
-This runs a transform job against all the files under ``s3://my-input-bucket/path/to/my/csv/data``, transforming the input
-data in order with each model container in the pipeline. For each input file that was successfully transformed, one output file in ``s3://my-output-bucket/path/to/my/output/data/``
-will be created with the same name, appended with '.out'.
-
-This transform job will split CSV files by newline separators, which is especially useful if the input files are large. The Transform Job will
-assemble the outputs with line separators when writing each input file's corresponding output file.
-
-Each payload entering the first model container will be up to six megabytes, and up to eight inference requests will be sent at the
-same time to the first model container. Since each payload will consist of a mini-batch of multiple CSV records, the model
-containers will transform each mini-batch of records.
-
-For comprehensive examples on how to use Inference Pipelines, please refer to the following notebooks:
-
-- `inference_pipeline_sparkml_xgboost_abalone.ipynb `__
-- `inference_pipeline_sparkml_blazingtext_dbpedia.ipynb `__
-
-
->>>>>>> 1787f783e2f9fc4f2144bd4b4f90281a2bb018b5
-SageMaker Workflow
-------------------
-
-You can use Apache Airflow to author, schedule and monitor SageMaker workflow.
-
-For more information, see `SageMaker Workflow in Apache Airflow`_.
-
-.. 
_SageMaker Workflow in Apache Airflow: src/sagemaker/workflow/README.rst From 09685191b8963d1b4dc259f0213bac9172b8efc1 Mon Sep 17 00:00:00 2001 From: Slesar Date: Mon, 11 Mar 2019 17:43:46 -0700 Subject: [PATCH 3/6] fix broken workflow link in main readme --- README.rst | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/README.rst b/README.rst index 36ee8969d6..4d2ba88f80 100644 --- a/README.rst +++ b/README.rst @@ -44,7 +44,7 @@ Table of Contents 16. `Secure Training and Inference with VPC `__ 17. `BYO Model `__ 18. `Inference Pipelines `__ -19. `SageMaker Workflow `__ +19. `SageMaker Workflow <#sagemaker-workflow>`__ Installing the SageMaker Python SDK @@ -293,3 +293,11 @@ For more information, see `AWS SageMaker Estimators and Models`_. .. _AWS SageMaker Estimators and Models: src/sagemaker/amazon/README.rst +SageMaker Workflow +------------------ + +You can use Apache Airflow to author, schedule and monitor SageMaker workflow. + +For more information, see `SageMaker Workflow in Apache Airflow`_. + +.. _SageMaker Workflow in Apache Airflow: src/sagemaker/workflow/README.rst From 0b305a58ad5f7acefac3858280dd87b1761007d0 Mon Sep 17 00:00:00 2001 From: Slesar Date: Mon, 11 Mar 2019 17:52:14 -0700 Subject: [PATCH 4/6] remove workflow from overview sphinx topic --- doc/overview.rst | 9 --------- 1 file changed, 9 deletions(-) diff --git a/doc/overview.rst b/doc/overview.rst index 7256c5a800..44f2ca30f2 100644 --- a/doc/overview.rst +++ b/doc/overview.rst @@ -676,12 +676,3 @@ For comprehensive examples on how to use Inference Pipelines please refer to the - `inference_pipeline_sparkml_xgboost_abalone.ipynb `__ - `inference_pipeline_sparkml_blazingtext_dbpedia.ipynb `__ - -SageMaker Workflow ------------------- - -You can use Apache Airflow to author, schedule and monitor SageMaker workflow. - -For more information, see `SageMaker Workflow in Apache Airflow`_. - -.. _SageMaker Workflow in Apache Airflow: src/sagemaker/workflow/README.rst \ No newline at end of file From bec5019c66fa2d599842956436ea1eb9f20b60a2 Mon Sep 17 00:00:00 2001 From: Slesar Date: Wed, 13 Mar 2019 09:52:46 -0700 Subject: [PATCH 5/6] fixed broken link to chainer in main readme --- README.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.rst b/README.rst index 4d2ba88f80..40073f1f17 100644 --- a/README.rst +++ b/README.rst @@ -30,7 +30,7 @@ Table of Contents 2. `Using the SageMaker Python SDK `__ 3. `MXNet SageMaker Estimators <#mxnet-sagemaker-estimators>`__ 4. `TensorFlow SageMaker Estimators <#tensorflow-sagemaker-estimators>`__ -5. `Chainer SageMaker Estimators <#chainer-sagemaker-estimators <`__ +5. `Chainer SageMaker Estimators <#chainer-sagemaker-estimators>`__ 6. `PyTorch SageMaker Estimators <#pytorch-sagemaker-estimators>`__ 7. `Scikit-learn SageMaker Estimators <#scikit-learn-sagemaker-estimators>`__ 8. 
`SageMaker Reinforcement Learning Estimators <#sagemaker-reinforcement-learning-estimators>`__ From 2ccfe70dc1e3f543a0a733b1e3eeffba8d8f17ff Mon Sep 17 00:00:00 2001 From: Slesar Date: Wed, 13 Mar 2019 17:11:42 -0700 Subject: [PATCH 6/6] added workflow back in to overview.rst --- doc/overview.rst | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/doc/overview.rst b/doc/overview.rst index 44f2ca30f2..d94a1946d0 100644 --- a/doc/overview.rst +++ b/doc/overview.rst @@ -676,3 +676,11 @@ For comprehensive examples on how to use Inference Pipelines please refer to the - `inference_pipeline_sparkml_xgboost_abalone.ipynb `__ - `inference_pipeline_sparkml_blazingtext_dbpedia.ipynb `__ +SageMaker Workflow +------------------ + +You can use Apache Airflow to author, schedule and monitor SageMaker workflow. + +For more information, see `SageMaker Workflow in Apache Airflow`_. + +.. _SageMaker Workflow in Apache Airflow: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/workflow/README.rst \ No newline at end of file