From e6270fe46d30fa517b00fd3c9bd058e933c9b5ee Mon Sep 17 00:00:00 2001
From: Slesar
Date: Fri, 1 Mar 2019 16:40:13 -0800
Subject: [PATCH 1/2] move content from sklearn readme into sphynx project

---
 CHANGELOG.rst         |   1 +
 doc/index.rst         |   5 +
 doc/using_sklearn.rst | 638 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 644 insertions(+)
 create mode 100644 doc/using_sklearn.rst

diff --git a/CHANGELOG.rst b/CHANGELOG.rst
index 5dced081ce..9f1132ec59 100644
--- a/CHANGELOG.rst
+++ b/CHANGELOG.rst
@@ -14,6 +14,7 @@ CHANGELOG
 * doc-fix: move overview content in main README into sphynx project
 * bug-fix: pass accelerator_type in ``deploy`` for REST API TFS ``Model``
 * doc-fix: move content from tf/README.rst into sphynx project
+* doc-fix: move content from slearn/README.rst into sphynx project

 1.18.3.post1
 ============

diff --git a/doc/index.rst b/doc/index.rst
index 309cdb69b3..62e49d76d8 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -65,6 +65,11 @@ Scikit-Learn
 ************
 A managed enrionment for Scikit-learn training and hosting on Amazon SageMaker

+.. toctree::
+    :maxdepth: 1
+
+    using_sklearn
+
 .. toctree::
     :maxdepth: 2

diff --git a/doc/using_sklearn.rst b/doc/using_sklearn.rst
new file mode 100644
index 0000000000..9d80f1fbad
--- /dev/null
+++ b/doc/using_sklearn.rst
@@ -0,0 +1,638 @@
================================================
Using Scikit-learn with the SageMaker Python SDK
================================================

.. contents::

With Scikit-learn Estimators, you can train and host Scikit-learn models on Amazon SageMaker.

Supported versions of Scikit-learn: ``0.20.0``

You can visit the Scikit-learn repository at https://github.com/scikit-learn/scikit-learn.


Training with Scikit-learn
~~~~~~~~~~~~~~~~~~~~~~~~~~

Training Scikit-learn models using ``SKLearn`` Estimators is a two-step process:

1. Prepare a Scikit-learn script to run on SageMaker.
2. Run this script on SageMaker via a ``SKLearn`` Estimator.

Prepare your training script in a separate source file from the notebook, terminal session, or source file you're
using to submit the script to SageMaker via a ``SKLearn`` Estimator.

Suppose that you already have a Scikit-learn training script called
``sklearn-train.py``. You can run this script in SageMaker as follows:

.. code:: python

    from sagemaker.sklearn import SKLearn
    sklearn_estimator = SKLearn(entry_point='sklearn-train.py',
                                role='SageMakerRole',
                                train_instance_type='ml.m4.xlarge',
                                framework_version='0.20.0')
    sklearn_estimator.fit('s3://bucket/path/to/training/data')

The S3 URL passed to ``fit`` is the location of your training data within Amazon S3. The constructor keyword arguments
define how SageMaker runs your training script and are discussed in detail in a later section.

In the following sections, we'll discuss how to prepare a training script for execution on SageMaker,
then how to run that script on SageMaker using a ``SKLearn`` Estimator.

Preparing the Scikit-learn training script
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Your Scikit-learn training script must be a Python 2.7 or 3.5 compatible source file.

The training script is very similar to a training script you might run outside of SageMaker, but you
can access useful properties about the training environment through various environment variables, such as:

* ``SM_MODEL_DIR``: A string representing the path to the directory to write model artifacts to.
  These artifacts are uploaded to S3 for model hosting.
* ``SM_OUTPUT_DATA_DIR``: A string representing the filesystem path to write output artifacts to. Output artifacts may
  include checkpoints, graphs, and other files to save, not including model artifacts. These artifacts are compressed
  and uploaded to S3 to the same S3 prefix as the model artifacts.

Supposing two input channels, 'train' and 'test', were used in the call to the Scikit-learn estimator's ``fit()`` method,
the following environment variables are set, following the format ``SM_CHANNEL_[channel_name]``:

* ``SM_CHANNEL_TRAIN``: A string representing the path to the directory containing data in the 'train' channel.
* ``SM_CHANNEL_TEST``: Same as above, but for the 'test' channel.

A typical training script loads data from the input channels, configures training with hyperparameters, trains a model,
and saves the model to ``model_dir`` so that it can be hosted later. Hyperparameters are passed to your script as arguments
and can be retrieved with an ``argparse.ArgumentParser`` instance. For example, a training script might start
with the following:

.. code:: python

    import argparse
    import os

    if __name__ == '__main__':

        parser = argparse.ArgumentParser()

        # Hyperparameters sent by the client are passed as command-line arguments to the script.
        parser.add_argument('--epochs', type=int, default=50)
        parser.add_argument('--batch-size', type=int, default=64)
        parser.add_argument('--learning-rate', type=float, default=0.05)

        # Data, model, and output directories
        parser.add_argument('--output-data-dir', type=str, default=os.environ.get('SM_OUTPUT_DATA_DIR'))
        parser.add_argument('--model-dir', type=str, default=os.environ.get('SM_MODEL_DIR'))
        parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))
        parser.add_argument('--test', type=str, default=os.environ.get('SM_CHANNEL_TEST'))

        args, _ = parser.parse_known_args()

        # ... load from args.train and args.test, train a model, write the model to args.model_dir.

Because SageMaker imports your training script, you should put your training code in a main guard
(``if __name__ == '__main__':``) if you are using the same script to host your model, so that SageMaker does not
inadvertently run your training code at the wrong point in execution.

For more on training environment variables, please visit https://github.com/aws/sagemaker-containers.
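
Putting these pieces together, a complete training script might look like the following sketch. The use of
``RandomForestClassifier``, the ``n-estimators`` hyperparameter, and the ``train.csv`` file name and column layout are
illustrative assumptions, not requirements of SageMaker:

.. code:: python

    import argparse
    import os

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.externals import joblib

    if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        parser.add_argument('--n-estimators', type=int, default=100)
        parser.add_argument('--model-dir', type=str, default=os.environ.get('SM_MODEL_DIR'))
        parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))
        args, _ = parser.parse_known_args()

        # Assumes a single CSV file in the 'train' channel whose first column is the label.
        train_data = pd.read_csv(os.path.join(args.train, 'train.csv'), header=None)
        y_train = train_data.iloc[:, 0]
        X_train = train_data.iloc[:, 1:]

        clf = RandomForestClassifier(n_estimators=args.n_estimators)
        clf.fit(X_train, y_train)

        # Save the model to SM_MODEL_DIR so SageMaker can package and host it.
        joblib.dump(clf, os.path.join(args.model_dir, 'model.joblib'))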

Running a Scikit-learn training script in SageMaker
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You run Scikit-learn training scripts on SageMaker by creating ``SKLearn`` Estimators.
SageMaker training of your script is invoked when you call ``fit`` on a ``SKLearn`` Estimator.
The following code sample shows how you train a custom Scikit-learn script "sklearn-train.py", passing
in three hyperparameters ('epochs', 'batch-size', and 'learning-rate'), and using two input channel
directories ('train' and 'test').

.. code:: python

    sklearn_estimator = SKLearn('sklearn-train.py',
                                train_instance_type='ml.m4.xlarge',
                                framework_version='0.20.0',
                                hyperparameters={'epochs': 20, 'batch-size': 64, 'learning-rate': 0.1})
    sklearn_estimator.fit({'train': 's3://my-data-bucket/path/to/my/training/data',
                           'test': 's3://my-data-bucket/path/to/my/test/data'})


Scikit-learn Estimators
^^^^^^^^^^^^^^^^^^^^^^^

The ``SKLearn`` constructor takes both required and optional arguments.

Required arguments
''''''''''''''''''

The following are required arguments to the ``SKLearn`` constructor. When you create a ``SKLearn`` object, you must
include these in the constructor, either positionally or as keyword arguments.

- ``entry_point`` Path (absolute or relative) to the Python file which
  should be executed as the entry point to training.
- ``role`` An AWS IAM role (either name or full ARN). The Amazon
  SageMaker training jobs and APIs that create Amazon SageMaker
  endpoints use this role to access training data and model artifacts.
  After the endpoint is created, the inference code might use the IAM
  role, if it accesses AWS resources.
- ``train_instance_type`` Type of EC2 instance to use for training, for
  example, 'ml.m4.xlarge'. Please note that Scikit-learn does not have GPU support.

Optional arguments
''''''''''''''''''

The following are optional arguments. When you create a ``SKLearn`` object, you can specify these as keyword
arguments; a constructor call that uses several of them is shown after the list.

- ``source_dir`` Path (absolute or relative) to a directory with any
  other training source code dependencies including the entry point
  file. Structure within this directory will be preserved when training
  on SageMaker.
- ``hyperparameters`` Hyperparameters that will be used for training.
  Will be made accessible as a dict[str, str] to the training code on
  SageMaker. For convenience, accepts other types besides str, but
  str() will be called on keys and values to convert them before
  training.
- ``py_version`` Python version you want to use for executing your
  model training code.
- ``train_volume_size`` Size in GB of the EBS volume to use for storing
  input data during training. Must be large enough to store training
  data if input_mode='File' is used (which is the default).
- ``train_max_run`` Timeout in seconds for training, after which Amazon
  SageMaker terminates the job regardless of its current status.
- ``input_mode`` The input mode that the algorithm supports. Valid
  modes: 'File' - Amazon SageMaker copies the training dataset from the
  S3 location to a directory in the Docker container. 'Pipe' - Amazon
  SageMaker streams data directly from S3 to the container via a Unix
  named pipe.
- ``output_path`` S3 location where you want the training result (model
  artifacts and optional output files) saved. If not specified, results
  are stored to a default bucket. If the bucket with the specific name
  does not exist, the estimator creates the bucket during the fit()
  method execution.
- ``output_kms_key`` KMS key ID used to encrypt the training output.
- ``job_name`` Name to assign to the training job that the fit()
  method launches. If not specified, the estimator generates a default
  job name, based on the training image name and current timestamp.
- ``image_name`` An alternative Docker image to use for training and
  serving. If specified, the estimator will use this image for training and
  hosting, instead of selecting the appropriate SageMaker official image based on
  framework_version and py_version. Refer to: `SageMaker Scikit-learn Docker Containers
  <#sagemaker-scikit-learn-docker-containers>`_ for details on what the official images support
  and where to find the source code to build your custom image.
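
For example, a constructor call that sets several of these optional arguments might look like the sketch below. The
bucket name, source directory, and hyperparameter values are illustrative assumptions rather than required values:

.. code:: python

    from sagemaker.sklearn import SKLearn

    sklearn_estimator = SKLearn(entry_point='sklearn-train.py',
                                source_dir='my_training_code',
                                role='SageMakerRole',
                                train_instance_type='ml.m4.xlarge',
                                framework_version='0.20.0',
                                py_version='py3',
                                hyperparameters={'epochs': 20, 'batch-size': 64},
                                output_path='s3://my-output-bucket/sklearn-artifacts')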


Calling fit
^^^^^^^^^^^

You start your training job by calling ``fit`` on a ``SKLearn`` Estimator. ``fit`` takes both required and optional
arguments.

Required arguments
''''''''''''''''''

- ``inputs``: This can take one of the following forms: a string
  S3 URI, for example ``s3://my-bucket/my-training-data``, in which
  case the S3 objects rooted at the ``my-training-data`` prefix will
  be available in the default ``train`` channel; or a dict from
  string channel names to S3 URIs, in which case the objects rooted at
  each S3 prefix will be available as files in each channel directory.

For example:

.. code:: python

    {'train': 's3://my-bucket/my-training-data',
     'eval': 's3://my-bucket/my-evaluation-data'}

.. _optional-arguments-1:

Optional arguments
''''''''''''''''''

- ``wait``: Defaults to True, whether to block and wait for the
  training script to complete before returning.
- ``logs``: Defaults to True, whether to show logs produced by the training
  job in the Python session. Only meaningful when ``wait`` is True.


Saving models
~~~~~~~~~~~~~

In order to save your trained Scikit-learn model for deployment on SageMaker, your training script should save your
model to a filesystem path called ``model_dir``. This value is accessible through the environment variable
``SM_MODEL_DIR``. The following code demonstrates how to save a trained Scikit-learn classifier named ``clf`` as
``model.joblib`` at the end of training:

.. code:: python

    from sklearn.externals import joblib
    import argparse
    import os

    if __name__ == '__main__':
        # Default to the value in environment variable `SM_MODEL_DIR`. Using args makes the script more portable.
        parser = argparse.ArgumentParser()
        parser.add_argument('--model-dir', type=str, default=os.environ['SM_MODEL_DIR'])
        args, _ = parser.parse_known_args()

        # ... train classifier `clf`, then save it to `model_dir` as the file 'model.joblib'
        joblib.dump(clf, os.path.join(args.model_dir, "model.joblib"))

After your training job is complete, SageMaker will compress and upload the serialized model to S3, and your model data
will be available in the S3 ``output_path`` you specified when you created the Scikit-learn Estimator.

Deploying Scikit-learn models
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After a Scikit-learn Estimator has been fit, you can host the newly created model in SageMaker.

After calling ``fit``, you can call ``deploy`` on a ``SKLearn`` Estimator to create a SageMaker Endpoint.
The Endpoint runs a SageMaker-provided Scikit-learn model server and hosts the model produced by your training script,
which was run when you called ``fit``. This was the model you saved to ``model_dir``.

``deploy`` returns a ``Predictor`` object, which you can use to do inference on the Endpoint hosting your Scikit-learn
model. Each ``Predictor`` provides a ``predict`` method which can do inference with NumPy arrays or Python lists.
Inference arrays or lists are serialized and sent to the Scikit-learn model server by an ``InvokeEndpoint`` SageMaker
operation.

``predict`` returns the result of inference against your model.
By default, the result of inference is a NumPy array.

.. code:: python

    # Train my estimator
    sklearn_estimator = SKLearn(entry_point='train_and_deploy.py',
                                train_instance_type='ml.m4.xlarge',
                                framework_version='0.20.0')
    sklearn_estimator.fit('s3://my_bucket/my_training_data/')

    # Deploy my estimator to a SageMaker Endpoint and get a Predictor
    predictor = sklearn_estimator.deploy(instance_type='ml.m4.xlarge',
                                         initial_instance_count=1)

    # `data` is a NumPy array or a Python list.
    # `response` is a NumPy array.
    response = predictor.predict(data)

You use the SageMaker Scikit-learn model server to host your Scikit-learn model when you call ``deploy``
on a ``SKLearn`` Estimator. The model server runs inside a SageMaker Endpoint, which your call to ``deploy`` creates.
You can access the name of the Endpoint through the ``name`` property on the returned ``Predictor``.


SageMaker Scikit-learn Model Server
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Scikit-learn Endpoint you create with ``deploy`` runs a SageMaker Scikit-learn model server.
The model server loads the model that was saved by your training script and performs inference on the model in response
to SageMaker InvokeEndpoint API calls.

You can configure two components of the SageMaker Scikit-learn model server: model loading and model serving.
Model loading is the process of deserializing your saved model back into a Scikit-learn model.
Serving is the process of translating InvokeEndpoint requests to inference calls on the loaded model.

You configure the Scikit-learn model server by defining functions in the Python source file you passed to the
Scikit-learn constructor.

Model loading
^^^^^^^^^^^^^

Before a model can be served, it must be loaded. The SageMaker Scikit-learn model server loads your model by invoking a
``model_fn`` function that you must provide in your script. The ``model_fn`` should have the following signature:

.. code:: python

    def model_fn(model_dir)

SageMaker will inject the directory where your model files and sub-directories, saved by your training script, have
been mounted. Your ``model_fn`` should return a model object that can be used for model serving.

The following code snippet shows an example ``model_fn`` implementation.
It loads and returns a Scikit-learn classifier from a ``model.joblib`` file in the SageMaker model directory
``model_dir``.

.. code:: python

    from sklearn.externals import joblib
    import os

    def model_fn(model_dir):
        clf = joblib.load(os.path.join(model_dir, "model.joblib"))
        return clf

Model serving
^^^^^^^^^^^^^

After the SageMaker model server has loaded your model by calling ``model_fn``, SageMaker will serve your model.
Model serving is the process of responding to inference requests received by SageMaker InvokeEndpoint API calls.
The SageMaker Scikit-learn model server breaks request handling into three steps:

- input processing,
- prediction, and
- output processing.

In a similar way to model loading, you configure these steps by defining functions in your Python source file.

Each step involves invoking a Python function, with information about the request and the return value from the previous
function in the chain.
Inside the SageMaker Scikit-learn model server, the process looks like this:

.. code:: python

    # Deserialize the Invoke request body into an object we can perform prediction on
    input_object = input_fn(request_body, request_content_type)

    # Perform prediction on the deserialized object, with the loaded model
    prediction = predict_fn(input_object, model)

    # Serialize the prediction result into the desired response content type
    output = output_fn(prediction, response_content_type)

The above code sample shows the three function definitions:

- ``input_fn``: Takes request data and deserializes the data into an
  object for prediction.
- ``predict_fn``: Takes the deserialized request object and performs
  inference against the loaded model.
- ``output_fn``: Takes the result of prediction and serializes this
  according to the response content type.

The SageMaker Scikit-learn model server provides default implementations of these functions.
You can provide your own implementations for these functions in your hosting script.
If you omit any definition, then the SageMaker Scikit-learn model server will use its default implementation for that
function.

The ``RealTimePredictor`` used by Scikit-learn in the SageMaker Python SDK serializes NumPy arrays to the NPY format
by default, with Content-Type ``application/x-npy``. The SageMaker Scikit-learn model server can deserialize NPY-formatted
data (along with JSON and CSV data).

If you rely solely on the SageMaker Scikit-learn model server defaults, you get the following functionality:

- Prediction on models that implement the ``predict`` method.
- Serialization and deserialization of NumPy arrays.

The default ``input_fn`` and ``output_fn`` are meant to make it easy to predict on NumPy arrays. If your model expects
a NumPy array and returns a NumPy array, then these functions do not have to be overridden when sending NPY-formatted
data.

In the following sections we describe the default implementations of ``input_fn``, ``predict_fn``, and ``output_fn``.
We describe the input arguments and expected return types of each, so you can define your own implementations.

Input processing
''''''''''''''''

When an InvokeEndpoint operation is made against an Endpoint running a SageMaker Scikit-learn model server,
the model server receives two pieces of information:

- The request Content-Type, for example "application/x-npy"
- The request data body, a byte array

The SageMaker Scikit-learn model server will invoke an ``input_fn`` function in your hosting script,
passing in this information. If you provide your own ``input_fn``, it should return an object that can be passed
to ``predict_fn`` and have the following signature:

.. code:: python

    def input_fn(request_body, request_content_type)

Where ``request_body`` is a byte buffer and ``request_content_type`` is a Python string.

The SageMaker Scikit-learn model server provides a default implementation of ``input_fn``.
This function deserializes JSON, CSV, or NPY encoded data into a NumPy array.

Default NPY deserialization requires ``request_body`` to follow the NPY format. For Scikit-learn, the Python SDK
defaults to sending prediction requests with this format.

Default JSON deserialization requires ``request_body`` to contain a single JSON list.
Sending multiple JSON objects within the same ``request_body`` is not supported.
The list must have a dimensionality compatible with the model loaded in ``model_fn``.
The list's shape must be identical to the model's input shape, for all dimensions after the first (the first
dimension is the batch size).

Default CSV deserialization requires ``request_body`` to contain one or more lines of CSV numerical data.
The data is loaded into a two-dimensional array, where each line break defines the boundaries of the first dimension.

The example below shows a custom ``input_fn`` for preparing pickled NumPy arrays.

.. code:: python

    from io import BytesIO

    import numpy as np

    def input_fn(request_body, request_content_type):
        """An input_fn that loads a pickled NumPy array"""
        if request_content_type == "application/python-pickle":
            array = np.load(BytesIO(request_body), allow_pickle=True)
            return array
        else:
            # Handle other content-types here or raise an Exception
            # if the content type is not supported.
            pass


Prediction
''''''''''

After the inference request has been deserialized by ``input_fn``, the SageMaker Scikit-learn model server invokes
``predict_fn`` on the return value of ``input_fn``.

As with ``input_fn``, you can define your own ``predict_fn`` or use the SageMaker Scikit-learn model server default.

The ``predict_fn`` function has the following signature:

.. code:: python

    def predict_fn(input_object, model)

Where ``input_object`` is the object returned from ``input_fn`` and
``model`` is the model loaded by ``model_fn``.

The default implementation of ``predict_fn`` calls the loaded model's ``predict`` method on ``input_object``
and returns the resulting value. The return type should be a NumPy array to be compatible with the default
``output_fn``.

The example below shows an overridden ``predict_fn`` for a logistic regression classifier. It returns both the
predictions and the prediction probabilities from the model in a single NumPy array.
This ``predict_fn`` can rely on the default ``input_fn`` and ``output_fn`` because ``input_data`` is a NumPy array,
and the return value of this function is a NumPy array.

.. code:: python

    import sklearn
    import numpy as np

    def predict_fn(input_data, model):
        prediction = model.predict(input_data)
        pred_prob = model.predict_proba(input_data)
        return np.array([prediction, pred_prob])

If you implement your own prediction function, you should take care to ensure that:

- The first argument is expected to be the return value from ``input_fn``.
  If you use the default ``input_fn``, this will be a NumPy array.
- The second argument is the loaded model.
- The return value should be of the correct type to be passed as the
  first argument to ``output_fn``. If you use the default
  ``output_fn``, this should be a NumPy array.

Output processing
'''''''''''''''''

After invoking ``predict_fn``, the model server invokes ``output_fn``, passing in the return value from ``predict_fn``
and the InvokeEndpoint requested response content type.

The ``output_fn`` has the following signature:

.. code:: python

    def output_fn(prediction, content_type)

Where ``prediction`` is the result of invoking ``predict_fn`` and
``content_type`` is the InvokeEndpoint requested response content type.
The function should return a byte array of data serialized to ``content_type``.

The default implementation expects ``prediction`` to be a NumPy array and can serialize the result to JSON, CSV, or NPY.
It accepts response content types of "application/json", "text/csv", and "application/x-npy".
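
If you need a response format that the defaults do not cover, or you want explicit control over serialization, you can
supply your own ``output_fn``. The sketch below is one possible implementation that follows the contract described
above, assuming ``prediction`` is a NumPy array; it is not the container's built-in implementation.

.. code:: python

    import json

    import numpy as np

    def output_fn(prediction, content_type):
        """Serialize a NumPy prediction into the requested content type."""
        if content_type == "application/json":
            return json.dumps(prediction.tolist())
        if content_type == "text/csv":
            # Write the array as CSV text, one array row per line.
            rows = np.atleast_2d(prediction)
            return "\n".join(",".join(str(value) for value in row) for row in rows)
        raise ValueError("Unsupported content type: {}".format(content_type))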

Working with existing model data and training jobs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Attaching to existing training jobs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can attach a Scikit-learn Estimator to an existing training job using the
``attach`` method.

.. code:: python

    my_training_job_name = "MyAwesomeSKLearnTrainingJob"
    sklearn_estimator = SKLearn.attach(my_training_job_name)

After attaching, if the training job is in a Complete status, it can be
``deploy``\ ed to create a SageMaker Endpoint and return a
``Predictor``. If the training job is in progress,
``attach`` will block and display log messages from the training job until the training job completes.

The ``attach`` method accepts the following arguments:

- ``training_job_name (str):`` The name of the training job to attach
  to.
- ``sagemaker_session (sagemaker.Session or None):`` The Session used
  to interact with SageMaker.

Deploying Endpoints from model data
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As well as attaching to existing training jobs, you can deploy models directly from model data in S3.
The following code sample shows how to do this, using the ``SKLearnModel`` class.

.. code:: python

    from sagemaker.sklearn import SKLearnModel

    sklearn_model = SKLearnModel(model_data="s3://bucket/model.tar.gz", role="SageMakerRole",
                                 entry_point="transform_script.py")

    predictor = sklearn_model.deploy(instance_type="ml.c4.xlarge", initial_instance_count=1)

The ``SKLearnModel`` constructor takes the following arguments:

- ``model_data (str):`` An S3 location of a SageMaker model data
  .tar.gz file.
- ``image (str):`` A Docker image URI.
- ``role (str):`` An IAM role name or ARN for SageMaker to access AWS
  resources on your behalf.
- ``predictor_cls (callable[string, sagemaker.Session]):`` A function to
  call to create a predictor. If not None, ``deploy`` will return the
  result of invoking this function on the created endpoint name.
- ``env (dict[string, string]):`` Environment variables to run with
  ``image`` when hosted in SageMaker.
- ``name (str):`` The model name. If None, a default model name will be
  selected on each ``deploy``.
- ``entry_point (str):`` Path (absolute or relative) to the Python file
  which should be executed as the entry point to model hosting.
- ``source_dir (str):`` Optional. Path (absolute or relative) to a
  directory with any other training source code dependencies including
  the entry point file. Structure within this directory will be
  preserved when training on SageMaker.
- ``enable_cloudwatch_metrics (boolean):`` Optional. If true, training
  and hosting containers will generate CloudWatch metrics under the
  AWS/SageMakerContainer namespace.
- ``container_log_level (int):`` Log level to use within the container.
  Valid values are defined in the Python logging module.
- ``code_location (str):`` Optional. Name of the S3 bucket where your
  custom code will be uploaded to. If not specified, will use the
  SageMaker default bucket created by sagemaker.Session.
- ``sagemaker_session (sagemaker.Session):`` The SageMaker Session
  object, used for SageMaker interaction.
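
The hosting entry point you pass to ``SKLearnModel`` (``transform_script.py`` in the example above) must define a
``model_fn`` so the model server can load your model, and it may also override ``input_fn``, ``predict_fn``, or
``output_fn`` as described earlier. A minimal sketch, assuming the model archive contains a ``model.joblib`` file,
might look like this:

.. code:: python

    # transform_script.py - hosting entry point
    import os

    from sklearn.externals import joblib

    def model_fn(model_dir):
        """Load the model that SageMaker unpacked from the model.tar.gz archive."""
        return joblib.load(os.path.join(model_dir, "model.joblib"))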

Your model data must be a .tar.gz file in S3. SageMaker Training Job model data is saved to .tar.gz files in S3;
however, if you have local model data that you want to deploy, you can prepare the data yourself.

Assuming you have a local directory containing your model data named "my_model", you can tar and gzip compress the
directory and upload it to S3 using the following commands:

::

    tar -czf model.tar.gz my_model
    aws s3 cp model.tar.gz s3://my-bucket/my-path/model.tar.gz

This packages the contents of my_model into a gzip-compressed tar file and uploads it to S3 in the bucket "my-bucket",
with the key "my-path/model.tar.gz".

To run these commands, you'll need the AWS CLI installed. Please refer to our `FAQ <#FAQ>`__ for more information on
installing it.

Scikit-learn Training Examples
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Amazon provides example Jupyter notebooks that demonstrate end-to-end training on Amazon SageMaker using Scikit-learn.
Please refer to:

https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk

These are also available in SageMaker Notebook Instance hosted Jupyter notebooks under the "sample notebooks" folder.


SageMaker Scikit-learn Docker Containers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When you train and deploy training scripts, SageMaker runs your Python script in a Docker container with several
libraries installed. When you create the Estimator and call ``deploy`` to create the SageMaker Endpoint, you can control
the environment your script runs in.

SageMaker runs Scikit-learn Estimator scripts in either Python 2.7 or Python 3.5. You can select the Python version by
passing a ``py_version`` keyword argument to the Scikit-learn Estimator constructor. Setting this to ``py3`` (the default)
will cause your training script to be run on Python 3.5. Setting this to ``py2`` will cause your training script to be run
on Python 2.7. This Python version applies to both the Training Job, created by ``fit``, and the Endpoint, created by
``deploy``.

The Scikit-learn Docker images have the following dependencies installed:

+-----------------------------+----------------+
| Dependencies                | sklearn 0.20.0 |
+-----------------------------+----------------+
| sklearn                     | 0.20.0         |
+-----------------------------+----------------+
| sagemaker                   | 1.11.3         |
+-----------------------------+----------------+
| sagemaker-containers        | 2.2.4          |
+-----------------------------+----------------+
| numpy                       | 1.15.2         |
+-----------------------------+----------------+
| pandas                      | 0.23.4         |
+-----------------------------+----------------+
| Pillow                      | 3.1.2          |
+-----------------------------+----------------+
| Python                      | 2.7 or 3.5     |
+-----------------------------+----------------+

You can see the full list by calling ``pip freeze`` from the running Docker image.

The Docker images extend Ubuntu 16.04.

You can select the version of Scikit-learn by passing a ``framework_version`` keyword argument to the Scikit-learn
Estimator constructor. Currently supported versions are listed in the above table. You can also set ``framework_version``
to specify only the major and minor version, which will cause your training script to be run on the latest supported
patch version of that minor version.

Alternatively, you can build your own image by following the instructions in the SageMaker Scikit-learn containers
repository, and passing ``image_name`` to the Scikit-learn Estimator constructor.
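
For example, the constructor call might look like the sketch below; the ECR image URI is a hypothetical placeholder
for an image you have built and pushed yourself.

.. code:: python

    from sagemaker.sklearn import SKLearn

    sklearn_estimator = SKLearn(entry_point='sklearn-train.py',
                                role='SageMakerRole',
                                train_instance_type='ml.m4.xlarge',
                                image_name='123456789012.dkr.ecr.us-west-2.amazonaws.com/my-sklearn:latest')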

You can visit the SageMaker Scikit-learn containers repository here: https://github.com/aws/sagemaker-scikit-learn-container/

From 60484d4f292cadacce82e80c9918c8670811e19b Mon Sep 17 00:00:00 2001
From: Slesar
Date: Mon, 4 Mar 2019 09:37:00 -0800
Subject: [PATCH 2/2] fixed typo in changelog

---
 CHANGELOG.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGELOG.rst b/CHANGELOG.rst
index 9f1132ec59..8c9e87b355 100644
--- a/CHANGELOG.rst
+++ b/CHANGELOG.rst
@@ -14,7 +14,7 @@ CHANGELOG
 * doc-fix: move overview content in main README into sphynx project
 * bug-fix: pass accelerator_type in ``deploy`` for REST API TFS ``Model``
 * doc-fix: move content from tf/README.rst into sphynx project
-* doc-fix: move content from slearn/README.rst into sphynx project
+* doc-fix: move content from sklearn/README.rst into sphynx project

 1.18.3.post1
 ============