
Remove duplicate content and add links to readthedocs #694

Merged: 10 commits, Mar 14, 2019
3 changes: 2 additions & 1 deletion CHANGELOG.rst
@@ -3,8 +3,9 @@ CHANGELOG
=========

1.18.5dev
======
=========

* doc-fix: Remove duplicate content from main README.rst, /tensorflow/README.rst, and /sklearn/README.rst and add links to readthedocs content
* feature: ``PipelineModel``: Create a Transformer from a PipelineModel

1.18.4
314 changes: 18 additions & 296 deletions README.rst
@@ -27,24 +27,24 @@ Table of Contents
-----------------

1. `Installing SageMaker Python SDK <#installing-the-sagemaker-python-sdk>`__
2. `SageMaker Python SDK Overview <#sagemaker-python-sdk-overview>`__
2. `Using the SageMaker Python SDK <https://sagemaker.readthedocs.io/en/stable/overview.html>`__
3. `MXNet SageMaker Estimators <#mxnet-sagemaker-estimators>`__
4. `TensorFlow SageMaker Estimators <#tensorflow-sagemaker-estimators>`__
5. `Chainer SageMaker Estimators <#chainer-sagemaker-estimators>`__
6. `PyTorch SageMaker Estimators <#pytorch-sagemaker-estimators>`__
7. `Scikit-learn SageMaker Estimators <#scikit-learn-sagemaker-estimators>`__
8. `SageMaker Reinforcement Learning Estimators <#sagemaker-reinforcement-learning-estimators>`__
9. `SageMaker SparkML Serving <#sagemaker-sparkml-serving>`__
10. `AWS SageMaker Estimators <#aws-sagemaker-estimators>`__
11. `Using SageMaker AlgorithmEstimators <#using-sagemaker-algorithmestimators>`__
12. `Consuming SageMaker Model Packages <#consuming-sagemaker-model-packages>`__
13. `BYO Docker Containers with SageMaker Estimators <#byo-docker-containers-with-sagemaker-estimators>`__
14. `SageMaker Automatic Model Tuning <#sagemaker-automatic-model-tuning>`__
15. `SageMaker Batch Transform <#sagemaker-batch-transform>`__
16. `Secure Training and Inference with VPC <#secure-training-and-inference-with-vpc>`__
17. `BYO Model <#byo-model>`__
18. `Inference Pipelines <#inference-pipelines>`__
19. `SageMaker Workflow <#sagemaker-workflow>`__
11. `Using SageMaker AlgorithmEstimators <https://sagemaker.readthedocs.io/en/stable/overview.html#using-sagemaker-algorithmestimators>`__
12. `Consuming SageMaker Model Packages <https://sagemaker.readthedocs.io/en/stable/overview.html#consuming-sagemaker-model-packages>`__
13. `BYO Docker Containers with SageMaker Estimators <https://sagemaker.readthedocs.io/en/stable/overview.html#byo-docker-containers-with-sagemaker-estimators>`__
14. `SageMaker Automatic Model Tuning <https://sagemaker.readthedocs.io/en/stable/overview.html#sagemaker-automatic-model-tuning>`__
15. `SageMaker Batch Transform <https://sagemaker.readthedocs.io/en/stable/overview.html#sagemaker-batch-transform>`__
16. `Secure Training and Inference with VPC <https://sagemaker.readthedocs.io/en/stable/overview.html#secure-training-and-inference-with-vpc>`__
17. `BYO Model <https://sagemaker.readthedocs.io/en/stable/overview.html#byo-model>`__
18. `Inference Pipelines <https://sagemaker.readthedocs.io/en/stable/overview.html#inference-pipelines>`__
19. `SageMaker Workflow <https://sagemaker.readthedocs.io/en/stable/overview.html#sagemaker-workflow>`__


Installing the SageMaker Python SDK
@@ -151,287 +151,6 @@ Building Sphinx docs

You can edit the templates for any of the pages in the docs by editing the .rst files in the "doc" directory and then running "``make html``" again.


SageMaker Python SDK Overview
-----------------------------

SageMaker Python SDK provides several high-level abstractions for working with Amazon SageMaker. These are:

- **Estimators**: Encapsulate training on SageMaker.
- **Models**: Encapsulate built ML models.
- **Predictors**: Provide real-time inference and transformation using Python data-types against a SageMaker endpoint.
- **Session**: Provides a collection of methods for working with SageMaker resources.

``Estimator`` and ``Model`` implementations for MXNet, TensorFlow, Chainer, PyTorch, and Amazon ML algorithms are included.
There's also an ``Estimator`` that runs SageMaker-compatible custom Docker containers, enabling you to run your own ML algorithms by using the SageMaker Python SDK.
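
As a quick illustration of the ``Session`` abstraction listed above, here is a minimal sketch of creating a session and uploading local training data to S3; the local path and key prefix are hypothetical placeholders.

.. code:: python

    import sagemaker

    # Create a Session using your default AWS credentials and region
    session = sagemaker.Session()

    # Default S3 bucket that the session creates (or reuses) for this account and region
    bucket = session.default_bucket()

    # Upload local data to S3 and get back its S3 URI
    # ('data/train.csv' and 'my-training-data' are placeholder names)
    train_data_uri = session.upload_data(path='data/train.csv', key_prefix='my-training-data')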

The following sections of this document explain how to use the different estimators and models:

* `MXNet SageMaker Estimators and Models <#mxnet-sagemaker-estimators>`__
* `TensorFlow SageMaker Estimators and Models <#tensorflow-sagemaker-estimators>`__
* `Chainer SageMaker Estimators and Models <#chainer-sagemaker-estimators>`__
* `PyTorch SageMaker Estimators <#pytorch-sagemaker-estimators>`__
* `Scikit-learn SageMaker Estimators and Models <#scikit-learn-sagemaker-estimators>`__
* `SageMaker Reinforcement Learning Estimators <#sagemaker-reinforcement-learning-estimators>`__
* `AWS SageMaker Estimators and Models <#aws-sagemaker-estimators>`__
* `Custom SageMaker Estimators and Models <#byo-docker-containers-with-sagemaker-estimators>`__


Using Estimators
----------------

Here is an end-to-end example of how to use a SageMaker Estimator:

.. code:: python

    from sagemaker.mxnet import MXNet

    # Configure an MXNet Estimator (no training happens yet)
    mxnet_estimator = MXNet('train.py',
                            role='SageMakerRole',
                            train_instance_type='ml.p2.xlarge',
                            train_instance_count=1,
                            framework_version='1.2.1')

    # Starts a SageMaker training job and waits until completion.
    mxnet_estimator.fit('s3://my_bucket/my_training_data/')

    # Deploys the model that was generated by fit() to a SageMaker endpoint
    mxnet_predictor = mxnet_estimator.deploy(initial_instance_count=1, instance_type='ml.p2.xlarge')

    # Serializes data and makes a prediction request to the SageMaker endpoint
    response = mxnet_predictor.predict(data)

    # Tears down the SageMaker endpoint and endpoint configuration
    mxnet_predictor.delete_endpoint()

    # Deletes the SageMaker model
    mxnet_predictor.delete_model()

The example above deletes both the SageMaker endpoint and the endpoint configuration through ``delete_endpoint()``. If you want to keep your SageMaker endpoint configuration, set the ``delete_endpoint_config`` parameter to ``False``, as shown below.

.. code:: python

    # Only delete the SageMaker endpoint, while keeping the corresponding endpoint configuration.
    mxnet_predictor.delete_endpoint(delete_endpoint_config=False)

Additionally, you can deploy a different endpoint configuration, which links to your model, to an already existing SageMaker endpoint.
To do this, specify the existing endpoint's name for the ``endpoint_name`` parameter and set the ``update_endpoint`` parameter to ``True`` in your ``deploy()`` call.
For more information, see `update_endpoint <https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.update_endpoint>`__ in the Boto 3 documentation.

.. code:: python

    from sagemaker.mxnet import MXNet

    # Configure an MXNet Estimator (no training happens yet)
    mxnet_estimator = MXNet('train.py',
                            role='SageMakerRole',
                            train_instance_type='ml.p2.xlarge',
                            train_instance_count=1,
                            framework_version='1.2.1')

    # Starts a SageMaker training job and waits until completion.
    mxnet_estimator.fit('s3://my_bucket/my_training_data/')

    # Deploys the model that was generated by fit() to an existing SageMaker endpoint
    mxnet_predictor = mxnet_estimator.deploy(initial_instance_count=1,
                                             instance_type='ml.p2.xlarge',
                                             update_endpoint=True,
                                             endpoint_name='existing-endpoint')

    # Serializes data and makes a prediction request to the SageMaker endpoint
    response = mxnet_predictor.predict(data)

    # Tears down the SageMaker endpoint and endpoint configuration
    mxnet_predictor.delete_endpoint()

    # Deletes the SageMaker model
    mxnet_predictor.delete_model()

Training Metrics
~~~~~~~~~~~~~~~~
The SageMaker Python SDK lets you specify a name and a regular expression for each metric you want to track during training.
The regular expression (regex) is matched against the training algorithm's logs, like a search function.
Here is an example of how to define metrics:

.. code:: python

    # Configure a BYO Estimator with metric definitions (no training happens yet)
    byo_estimator = Estimator(image_name=image_name,
                              role='SageMakerRole', train_instance_count=1,
                              train_instance_type='ml.c4.xlarge',
                              sagemaker_session=sagemaker_session,
                              metric_definitions=[{'Name': 'test:msd', 'Regex': '#quality_metric: host=\S+, test msd <loss>=(\S+)'},
                                                  {'Name': 'test:ssd', 'Regex': '#quality_metric: host=\S+, test ssd <loss>=(\S+)'}])

All Amazon SageMaker algorithms come with built-in support for metrics.
See `the AWS documentation <https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html>`__ for more details about the built-in metrics of each Amazon SageMaker algorithm.
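
As a rough sketch of how you might read tracked metrics back after training, the snippet below uses ``TrainingJobAnalytics`` from the SDK's analytics module; the training job name and metric name are hypothetical placeholders, and ``dataframe()`` requires pandas.

.. code:: python

    from sagemaker.analytics import TrainingJobAnalytics

    # Retrieve the values captured by the 'test:msd' metric definition for a completed job
    # ('my-training-job' is a placeholder name)
    analytics = TrainingJobAnalytics(training_job_name='my-training-job',
                                     metric_names=['test:msd'])

    # Returns a pandas DataFrame with one row per metric observation
    df = analytics.dataframe()
    print(df)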

Local Mode
~~~~~~~~~~

The SageMaker Python SDK supports local mode, which allows you to create estimators and deploy them to your local environment.
This is a great way to test your deep learning scripts before running them in SageMaker's managed training or hosting environments.
Local Mode is supported only for frameworks (e.g., TensorFlow, MXNet) and images you supply yourself.

We can take the example in `Using Estimators <#using-estimators>`__ and use either ``local`` or ``local_gpu`` as the instance type.

.. code:: python

    from sagemaker.mxnet import MXNet

    # Configure an MXNet Estimator (no training happens yet)
    mxnet_estimator = MXNet('train.py',
                            role='SageMakerRole',
                            train_instance_type='local',
                            train_instance_count=1,
                            framework_version='1.2.1')

    # In Local Mode, fit will pull the MXNet container Docker image and run it locally
    mxnet_estimator.fit('s3://my_bucket/my_training_data/')

    # Alternatively, you can train using data in your local file system. This is only supported in Local Mode.
    mxnet_estimator.fit('file:///tmp/my_training_data')

    # Deploys the model that was generated by fit() to a local endpoint in a container
    mxnet_predictor = mxnet_estimator.deploy(initial_instance_count=1, instance_type='local')

    # Serializes data and makes a prediction request to the local endpoint
    response = mxnet_predictor.predict(data)

    # Tears down the endpoint container and deletes the corresponding endpoint configuration
    mxnet_predictor.delete_endpoint()

    # Deletes the model
    mxnet_predictor.delete_model()


If you have an existing model and want to deploy it locally, don't specify a ``sagemaker_session`` argument to the ``MXNetModel`` constructor.
The correct session is generated when you call ``model.deploy()``.

Here is an end-to-end example:

.. code:: python

    import numpy
    from sagemaker.mxnet import MXNetModel

    model_location = 's3://mybucket/my_model.tar.gz'
    code_location = 's3://mybucket/sourcedir.tar.gz'
    s3_model = MXNetModel(model_data=model_location, role='SageMakerRole',
                          entry_point='mnist.py', source_dir=code_location)

    predictor = s3_model.deploy(initial_instance_count=1, instance_type='local')
    data = numpy.zeros(shape=(1, 1, 28, 28))
    predictor.predict(data)

    # Tear down the endpoint container and delete the corresponding endpoint configuration
    predictor.delete_endpoint()

    # Deletes the model
    predictor.delete_model()


If you don't want to deploy your model locally, you can instead run a local batch transform job. This is
useful if you want to test your container before creating a SageMaker batch transform job. Note that the performance
will not match batch transform jobs hosted on SageMaker, but it is still a useful way to make sure everything is set
up correctly, or to work with modest amounts of data.

Here is an end-to-end example:

.. code:: python

    from sagemaker.mxnet import MXNet

    mxnet_estimator = MXNet('train.py',
                            train_instance_type='local',
                            train_instance_count=1,
                            framework_version='1.2.1')

    mxnet_estimator.fit('file:///tmp/my_training_data')
    transformer = mxnet_estimator.transformer(1, 'local', assemble_with='Line', max_payload=1)
    transformer.transform('s3://my/transform/data', content_type='text/csv', split_type='Line')
    transformer.wait()

    # Deletes the SageMaker model
    transformer.delete_model()


For detailed examples of running Docker in local mode, see:

- `TensorFlow local mode example notebook <https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/tensorflow_distributed_mnist/tensorflow_local_mode_mnist.ipynb>`__.
- `MXNet local mode example notebook <https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/mxnet_gluon_mnist/mnist_with_gluon_local_mode.ipynb>`__.

A few important notes:

- Only one local mode endpoint can be running at a time.
- If you are using S3 data as input, it is pulled from S3 to your local environment. Ensure you have sufficient space to store the data locally.
- If you run into problems, it is often due to conflicting Docker containers. Killing these containers and re-running often solves the problem.
- Local Mode requires Docker Compose and `nvidia-docker2 <https://github.com/NVIDIA/nvidia-docker>`__ for ``local_gpu``.
- Distributed training is not yet supported for ``local_gpu``.

Incremental Training
~~~~~~~~~~~~~~~~~~~~

Incremental training allows you to bring a pre-trained model into a SageMaker training job and use it as a starting point for a new model.
There are several situations where you might want to do this:

- You want to perform additional training on a model to improve its fit on your data set.
- You want to import a pre-trained model and fit it to your data.
- You want to resume a training job that you previously stopped.

To use incremental training with SageMaker algorithms, you need model artifacts compressed into a ``tar.gz`` file. These
artifacts are passed to a training job via an input channel configured with the pre-defined settings Amazon SageMaker algorithms require.

To use model files with a SageMaker estimator, you can use the following parameters:

* ``model_uri``: points to the location of a model tarball, either in S3 or locally. Specifying a local path only works in local mode.
* ``model_channel_name``: name of the channel SageMaker will use to download the tarball specified in ``model_uri``. Defaults to 'model'.

This is converted into an input channel with the specifications mentioned above once you call ``fit()`` on the estimator.
In bring-your-own cases, ``model_channel_name`` can be overridden if you need to change the name of the channel while using
the same settings.

If your bring-your-own case requires different settings, you can create your own ``s3_input`` object with the settings you require.
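
For illustration, here is a minimal sketch of constructing such a channel by hand, assuming your model artifacts are already in S3; the S3 paths are placeholders, and the ``content_type`` shown is the one the built-in algorithms expect for model artifacts.

.. code:: python

    from sagemaker.session import s3_input

    # Custom input channel pointing at existing model artifacts (placeholder path)
    model_channel = s3_input('s3://my_bucket/my_model/model.tar.gz',
                             content_type='application/x-sagemaker-model',
                             distribution='FullyReplicated')

    # Pass the channel explicitly, alongside the training data, when calling fit()
    estimator.fit({'train': 's3://my_bucket/my_training_data/', 'model': model_channel})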

Here's an example of how to use incremental training:

.. code:: python

    # Configure an estimator
    estimator = sagemaker.estimator.Estimator(training_image,
                                              role,
                                              train_instance_count=1,
                                              train_instance_type='ml.p2.xlarge',
                                              train_volume_size=50,
                                              train_max_run=360000,
                                              input_mode='File',
                                              output_path=s3_output_location)

    # Starts a SageMaker training job and waits until completion.
    estimator.fit('s3://my_bucket/my_training_data/')

    # Create a new estimator using the previous training job's model artifacts
    incr_estimator = sagemaker.estimator.Estimator(training_image,
                                                   role,
                                                   train_instance_count=1,
                                                   train_instance_type='ml.p2.xlarge',
                                                   train_volume_size=50,
                                                   train_max_run=360000,
                                                   input_mode='File',
                                                   output_path=s3_output_location,
                                                   model_uri=estimator.model_data)

    # Starts a SageMaker training job using the original model for incremental training
    incr_estimator.fit('s3://my_bucket/my_training_data/')

Currently, the following algorithms support incremental training:

- Image Classification
- Object Detection
- Semantic Segmentation


MXNet SageMaker Estimators
--------------------------

@@ -459,9 +178,9 @@ Supported versions of TensorFlow for Elastic Inference: ``1.11.0``, ``1.12.0``.

We recommend that you use the latest supported version, because that's where we focus most of our development efforts.

For more information, see `TensorFlow SageMaker Estimators and Models`_.
For more information, see `Using TensorFlow with the SageMaker Python SDK`_.

.. _TensorFlow SageMaker Estimators and Models: src/sagemaker/tensorflow/README.rst
.. _Using TensorFlow with the SageMaker Python SDK: https://sagemaker.readthedocs.io/en/stable/using_tf.html


Chainer SageMaker Estimators
@@ -507,9 +226,9 @@ We recommend that you use the latest supported version, because that's where we

For more information about Scikit-learn, see https://scikit-learn.org/stable/

For more information about Scikit-learn SageMaker ``Estimators``, see `Scikit-learn SageMaker Estimators and Models`_.
For more information about Scikit-learn SageMaker ``Estimators``, see `Using Scikit-learn with the SageMaker Python SDK`_.

.. _Scikit-learn SageMaker Estimators and Models: src/sagemaker/sklearn/README.rst
.. _Using Scikit-learn with the SageMaker Python SDK: https://sagemaker.readthedocs.io/en/stable/using_sklearn.html


SageMaker Reinforcement Learning Estimators
@@ -574,6 +293,8 @@ For more information, see `AWS SageMaker Estimators and Models`_.

.. _AWS SageMaker Estimators and Models: src/sagemaker/amazon/README.rst

Using SageMaker AlgorithmEstimators
-----------------------------------

@@ -1016,6 +737,7 @@ For comprehensive examples on how to use Inference Pipelines please refer to the
- `inference_pipeline_sparkml_blazingtext_dbpedia.ipynb <https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/inference_pipeline_sparkml_blazingtext_dbpedia/inference_pipeline_sparkml_blazingtext_dbpedia.ipynb>`__


SageMaker Workflow
------------------
