Add model parameters to Estimator, and bump library version to 1.13.0 #450

Merged 3 commits on Nov 1, 2018
6 changes: 6 additions & 0 deletions CHANGELOG.rst
@@ -2,6 +2,12 @@
CHANGELOG
=========

1.13.0
======

* feature: Estimator: add input mode to training channels
* feature: Estimator: add model_uri and model_channel_name parameters

1.12.0
======

60 changes: 60 additions & 0 deletions README.rst
@@ -263,6 +263,66 @@ A few important notes:
- Local Mode requires Docker Compose and `nvidia-docker2 <https://github.com/NVIDIA/nvidia-docker>`__ for ``local_gpu``.
- Distributed training is not yet supported for ``local_gpu``.

Incremental Training
~~~~~~~~~~~~~~~~~~~~

Incremental training allows you to bring a pre-trained model into a SageMaker training job and use it as a starting point for a new model.
There are several situations where you might want to do this:

- You want to perform additional training on a model to improve its fit on your data set.
- You want to import a pre-trained model and fit it to your data.
- You want to resume a training job that you previously stopped.

To use incremental training with SageMaker algorithms, you need model artifacts compressed into a ``tar.gz`` file. These
artifacts are passed to a training job via an input channel configured with the pre-defined settings that Amazon SageMaker algorithms require.
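As a sketch, such a ``tar.gz`` artifact can be assembled with the standard library alone; the file name ``model.bin`` below is only a placeholder for whatever serialized model your algorithm produces:

```python
import os
import tarfile
import tempfile

# Sketch: package model artifacts into the tar.gz layout SageMaker expects.
# 'model.bin' is a placeholder artifact name, not a file SageMaker mandates.
workdir = tempfile.mkdtemp()
artifact = os.path.join(workdir, 'model.bin')
with open(artifact, 'wb') as f:
    f.write(b'\x00' * 16)  # stand-in for serialized model weights

tarball = os.path.join(workdir, 'model.tar.gz')
with tarfile.open(tarball, 'w:gz') as tar:
    # arcname keeps the archive flat; SageMaker extracts it into /opt/ml/model
    tar.add(artifact, arcname='model.bin')
```

The resulting ``model.tar.gz`` can then be uploaded to S3 and referenced via ``model_uri``.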

To use model files with a SageMaker estimator, you can use the following parameters:

* ``model_uri``: points to the location of a model tarball, either in S3 or locally. Specifying a local path only works in local mode.
* ``model_channel_name``: name of the channel SageMaker will use to download the tarball specified in ``model_uri``. Defaults to 'model'.

This is converted into an input channel with the settings described above once you call ``fit()`` on the estimator.
In bring-your-own cases, ``model_channel_name`` can be overridden if you need to change the name of the channel while keeping
the same settings.

If your bring-your-own case requires different settings, you can create your own ``s3_input`` object with the settings you require.
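For illustration, the channel entry that such an ``s3_input`` ultimately produces follows the ``InputDataConfig`` shape of the ``CreateTrainingJob`` API. The helper below is hypothetical; it only mirrors that structure so you can see which settings a custom channel controls:

```python
# Hypothetical helper mirroring the InputDataConfig entry that
# sagemaker.session.s3_input builds; keys follow the CreateTrainingJob API.
def model_channel_config(s3_uri,
                         content_type='application/x-sagemaker-model',
                         input_mode='File'):
    return {
        'DataSource': {
            'S3DataSource': {
                'S3DataType': 'S3Prefix',
                'S3Uri': s3_uri,
                'S3DataDistributionType': 'FullyReplicated',
            }
        },
        'ContentType': content_type,
        'InputMode': input_mode,
    }

cfg = model_channel_config('s3://my_bucket/pretrained/model.tar.gz')
```

In practice you would build the equivalent channel with ``sagemaker.session.s3_input`` and pass it to ``fit()`` under your chosen channel name.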

Here's an example of how to use incremental training:

.. code:: python

    # Configure an estimator
    estimator = sagemaker.estimator.Estimator(training_image,
                                              role,
                                              train_instance_count=1,
                                              train_instance_type='ml.p2.xlarge',
                                              train_volume_size=50,
                                              train_max_run=360000,
                                              input_mode='File',
                                              output_path=s3_output_location)

    # Start a SageMaker training job and wait until completion
    estimator.fit('s3://my_bucket/my_training_data/')

    # Create a new estimator using the previous estimator's model artifacts
    incr_estimator = sagemaker.estimator.Estimator(training_image,
                                                   role,
                                                   train_instance_count=1,
                                                   train_instance_type='ml.p2.xlarge',
                                                   train_volume_size=50,
                                                   train_max_run=360000,
                                                   input_mode='File',
                                                   output_path=s3_output_location,
                                                   model_uri=estimator.model_data)

    # Start a SageMaker training job using the original model for incremental training
    incr_estimator.fit('s3://my_bucket/my_training_data/')

Currently, the following algorithms support incremental training:

- Image Classification
- Object Detection
- Semantic Segmentation


MXNet SageMaker Estimators
--------------------------
2 changes: 1 addition & 1 deletion src/sagemaker/__init__.py
@@ -35,4 +35,4 @@
from sagemaker.session import s3_input # noqa: F401
from sagemaker.session import get_execution_role # noqa: F401

__version__ = '1.12.0'
__version__ = '1.13.0'
6 changes: 4 additions & 2 deletions src/sagemaker/amazon/amazon_estimator.py
@@ -70,17 +70,19 @@ def data_location(self, data_location):
self._data_location = data_location

@classmethod
def _prepare_init_params_from_job_description(cls, job_details):
def _prepare_init_params_from_job_description(cls, job_details, model_channel_name=None):
"""Convert the job description to init params that can be handled by the class constructor
Args:
job_details: the returned job details from a describe_training_job API call.
model_channel_name (str): Name of the channel where pre-trained model data will be downloaded.
Returns:
dictionary: The transformed init_params
"""
init_params = super(AmazonAlgorithmEstimatorBase, cls)._prepare_init_params_from_job_description(job_details)
init_params = super(AmazonAlgorithmEstimatorBase, cls)._prepare_init_params_from_job_description(
job_details, model_channel_name)

# The hyperparam names may not be the same as the class attribute that holds them,
# for instance: local_lloyd_init_method is called local_init_method. We need to map these
5 changes: 3 additions & 2 deletions src/sagemaker/chainer/estimator.py
@@ -134,17 +134,18 @@ def create_model(self, model_server_workers=None, role=None, vpc_config_override
vpc_config=self.get_vpc_config(vpc_config_override))

@classmethod
def _prepare_init_params_from_job_description(cls, job_details):
def _prepare_init_params_from_job_description(cls, job_details, model_channel_name=None):
"""Convert the job description to init params that can be handled by the class constructor

Args:
job_details: the returned job details from a describe_training_job API call.
model_channel_name (str): Name of the channel where pre-trained model data will be downloaded.

Returns:
dictionary: The transformed init_params

"""
init_params = super(Chainer, cls)._prepare_init_params_from_job_description(job_details)
init_params = super(Chainer, cls)._prepare_init_params_from_job_description(job_details, model_channel_name)

for argument in [Chainer._use_mpi, Chainer._num_processes, Chainer._process_slots_per_host,
Chainer._additional_mpi_options]: