Commit ed4ce41

Merge branch 'master' into keras_fn
2 parents 8ca7361 + eefd0c9 commit ed4ce41

23 files changed: 304 additions and 80 deletions

CHANGELOG.rst

Lines changed: 9 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -2,10 +2,17 @@
 CHANGELOG
 =========
 
-1.9.1dev
-========
+1.9.2
+=====
+
+* feature: add support for TensorFlow 1.9
+
+1.9.1
+=====
 
 * bug-fix: Estimators: Fix serialization of single records
+* bug-fix: deprecate enable_cloudwatch_metrics from Framework Estimators.
+* enhancement: Enable VPC config in training job creation
 
 1.9.0
 =====

README.rst

Lines changed: 15 additions & 7 deletions
@@ -53,8 +53,8 @@ You can install from source by cloning this repository and running a pip install
5353
::
5454

5555
git clone https://github.com/aws/sagemaker-python-sdk.git
56-
python setup.py sdist
57-
pip install dist/sagemaker-1.9.0.tar.gz
56+
cd sagemaker-python-sdk
57+
pip install .
5858

5959
Supported Operating Systems
6060
~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -244,6 +244,8 @@ By using MXNet SageMaker ``Estimators``, you can train and host MXNet models on
244244

245245
Supported versions of MXNet: ``1.2.1``, ``1.1.0``, ``1.0.0``, ``0.12.1``.
246246

247+
We recommend that you use the latest supported version, because that's where we focus most of our development efforts.
248+
247249
For more information, see `MXNet SageMaker Estimators and Models`_.
248250

249251
.. _MXNet SageMaker Estimators and Models: src/sagemaker/mxnet/README.rst
@@ -254,7 +256,9 @@ TensorFlow SageMaker Estimators
254256

255257
By using TensorFlow SageMaker ``Estimators``, you can train and host TensorFlow models on Amazon SageMaker.
256258

257-
Supported versions of TensorFlow: ``1.4.1``, ``1.5.0``, ``1.6.0``, ``1.7.0``, ``1.8.0``.
259+
Supported versions of TensorFlow: ``1.4.1``, ``1.5.0``, ``1.6.0``, ``1.7.0``, ``1.8.0``, ``1.9.0``.
260+
261+
We recommend that you use the latest supported version, because that's where we focus most of our development efforts.
258262

259263
For more information, see `TensorFlow SageMaker Estimators and Models`_.
260264

@@ -268,6 +272,8 @@ By using Chainer SageMaker ``Estimators``, you can train and host Chainer models
268272

269273
Supported versions of Chainer: ``4.0.0``, ``4.1.0``.
270274

275+
We recommend that you use the latest supported version, because that's where we focus most of our development efforts.
276+
271277
For more information about Chainer, see https://github.com/chainer/chainer.
272278

273279
For more information about Chainer SageMaker ``Estimators``, see `Chainer SageMaker Estimators and Models`_.
@@ -280,7 +286,9 @@ PyTorch SageMaker Estimators
280286

281287
With PyTorch SageMaker ``Estimators``, you can train and host PyTorch models on Amazon SageMaker.
282288

283-
Supported versions of PyTorch: ``0.4.0``
289+
Supported versions of PyTorch: ``0.4.0``.
290+
291+
We recommend that you use the latest supported version, because that's where we focus most of our development efforts.
284292

285293
For more information about PyTorch, see https://github.com/pytorch/pytorch.
286294

@@ -326,7 +334,7 @@ SageMaker Automatic Model Tuning
326334
All of the estimators can be used with SageMaker Automatic Model Tuning, which performs hyperparameter tuning jobs.
327335
A hyperparameter tuning job finds the best version of a model by running many training jobs on your dataset using the algorithm with different values of hyperparameters within ranges
328336
that you specify. It then chooses the hyperparameter values that result in a model that performs the best, as measured by a metric that you choose.
329-
If you're not using an Amazon SageMaker built-in algorithm, then the metric is defined by a regular expression (regex) you provide.
337+
If you're not using an Amazon SageMaker built-in algorithm, then the metric is defined by a regular expression (regex) you provide.
330338
The hyperparameter tuning job parses the training job's logs to find metrics that match the regex you defined.
331339
For more information about SageMaker Automatic Model Tuning, see `AWS documentation <https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html>`__.
332340

@@ -377,7 +385,7 @@ In addition, the ``fit()`` call uses a list of ``RecordSet`` objects instead of
377385
# Start hyperparameter tuning job
378386
my_tuner.fit([train_records, test_records])
379387
380-
To help attach a previously-started hyperparameter tuning job to a ``HyperparameterTuner`` instance,
388+
To help attach a previously-started hyperparameter tuning job to a ``HyperparameterTuner`` instance,
381389
``fit()`` adds the module path of the class used to create the tuner to the list of static hyperparameters by default.
382390
If the algorithm you are using cannot handle unknown hyperparameters
383391
(for example, an Amazon SageMaker built-in algorithm that does not have a custom estimator in the Python SDK),
@@ -521,4 +529,4 @@ After that, invoke the ``deploy()`` method on the ``Model``:
521529
522530
This returns a predictor the same way an ``Estimator`` does when ``deploy()`` is called. You can now get inferences just like with any other model deployed on Amazon SageMaker.
523531

524-
A full example is available in the `Amazon SageMaker examples repository <https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/mxnet_mnist_byom>`__.
532+
A full example is available in the `Amazon SageMaker examples repository <https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/mxnet_mnist_byom>`__.

setup.py

Lines changed: 1 addition & 1 deletion
@@ -23,7 +23,7 @@ def read(fname):
 
 
 setup(name="sagemaker",
-      version="1.9.0",
+      version="1.9.2",
       description="Open source library for training and deploying models on Amazon SageMaker.",
       packages=find_packages('src'),
       package_dir={'': 'src'},

src/sagemaker/estimator.py

Lines changed: 19 additions & 6 deletions
@@ -15,6 +15,7 @@
1515
import json
1616
import logging
1717
import os
18+
import warnings
1819
from abc import ABCMeta
1920
from abc import abstractmethod
2021
from six import with_metaclass
@@ -46,7 +47,8 @@ class EstimatorBase(with_metaclass(ABCMeta, object)):
4647

4748
def __init__(self, role, train_instance_count, train_instance_type,
4849
train_volume_size=30, train_max_run=24 * 60 * 60, input_mode='File',
49-
output_path=None, output_kms_key=None, base_job_name=None, sagemaker_session=None, tags=None):
50+
output_path=None, output_kms_key=None, base_job_name=None, sagemaker_session=None, tags=None,
51+
subnets=None, security_group_ids=None):
5052
"""Initialize an ``EstimatorBase`` instance.
5153
5254
Args:
@@ -77,6 +79,9 @@ def __init__(self, role, train_instance_count, train_instance_type,
7779
using the default AWS configuration chain.
7880
tags (list[dict]): List of tags for labeling a training job. For more, see
7981
https://docs.aws.amazon.com/sagemaker/latest/dg/API_Tag.html.
82+
subnets (list[str]): List of subnet ids. If not specified training job will be created without VPC config.
83+
security_group_ids (list[str]): List of security group ids. If not specified training job will be created
84+
without VPC config.
8085
"""
8186
self.role = role
8287
self.train_instance_count = train_instance_count
@@ -99,6 +104,10 @@ def __init__(self, role, train_instance_count, train_instance_type,
99104
self.output_kms_key = output_kms_key
100105
self.latest_training_job = None
101106

107+
# VPC configurations
108+
self.subnets = subnets
109+
self.security_group_ids = security_group_ids
110+
102111
@abstractmethod
103112
def train_image(self):
104113
"""Return the Docker image to use for training.
@@ -398,8 +407,9 @@ def start_new(cls, estimator, inputs):
398407
estimator.sagemaker_session.train(image=estimator.train_image(), input_mode=estimator.input_mode,
399408
input_config=config['input_config'], role=config['role'],
400409
job_name=estimator._current_job_name, output_config=config['output_config'],
401-
resource_config=config['resource_config'], hyperparameters=hyperparameters,
402-
stop_condition=config['stop_condition'], tags=estimator.tags)
410+
resource_config=config['resource_config'], vpc_config=config['vpc_config'],
411+
hyperparameters=hyperparameters, stop_condition=config['stop_condition'],
412+
tags=estimator.tags)
403413

404414
return cls(estimator.sagemaker_session, estimator._current_job_name)
405415

@@ -550,8 +560,8 @@ def __init__(self, entry_point, source_dir=None, hyperparameters=None, enable_cl
550560
The hyperparameters are made accessible as a dict[str, str] to the training code on SageMaker.
551561
For convenience, this accepts other types for keys and values, but ``str()`` will be called
552562
to convert them before training.
553-
enable_cloudwatch_metrics (bool): Whether training and hosting containers will
554-
generate CloudWatch metrics under the AWS/SageMakerContainer namespace (default: False).
563+
enable_cloudwatch_metrics (bool): [DEPRECATED] Now there are cloudwatch metrics emitted by all SageMaker
564+
training jobs. This will be ignored for now and removed in a further release.
555565
container_log_level (int): Log level to use within the container (default: logging.INFO).
556566
Valid values are defined in the Python logging module.
557567
code_location (str): Name of the S3 bucket where custom code is uploaded (default: None).
@@ -564,7 +574,10 @@ def __init__(self, entry_point, source_dir=None, hyperparameters=None, enable_cl
564574
super(Framework, self).__init__(**kwargs)
565575
self.source_dir = source_dir
566576
self.entry_point = entry_point
567-
self.enable_cloudwatch_metrics = enable_cloudwatch_metrics
577+
if enable_cloudwatch_metrics:
578+
warnings.warn('enable_cloudwatch_metrics is now deprecated and will be removed in the future.',
579+
DeprecationWarning)
580+
self.enable_cloudwatch_metrics = False
568581
self.container_log_level = container_log_level
569582
self._hyperparameters = hyperparameters or {}
570583
self.code_location = code_location
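
The deprecation handling in the hunk above can be exercised in isolation. The following is a minimal sketch, not the SDK class: `FrameworkSketch` and `build_and_capture` are hypothetical names introduced only for illustration, and the constructor is reduced to the two arguments that matter here. It shows that passing a truthy `enable_cloudwatch_metrics` emits a `DeprecationWarning` and that the stored attribute is `False` either way.

```python
import warnings


class FrameworkSketch(object):
    """Hypothetical stand-in for the Framework estimator; not the SDK class."""

    def __init__(self, entry_point, enable_cloudwatch_metrics=False):
        self.entry_point = entry_point
        if enable_cloudwatch_metrics:
            # Same message and category as in the diff above.
            warnings.warn('enable_cloudwatch_metrics is now deprecated and will be removed in the future.',
                          DeprecationWarning)
        # The flag is ignored: the attribute is always False after construction.
        self.enable_cloudwatch_metrics = False


def build_and_capture(flag):
    """Construct the sketch and return (instance, DeprecationWarning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        # Python hides DeprecationWarning in many contexts; force it on here.
        warnings.simplefilter('always')
        est = FrameworkSketch('train.py', enable_cloudwatch_metrics=flag)
    messages = [str(w.message) for w in caught
                if issubclass(w.category, DeprecationWarning)]
    return est, messages
```

Because `DeprecationWarning` is filtered out by default in many contexts, callers may not see this message unless they adjust their warning filters, which is why the sketch sets `simplefilter('always')` before constructing the object.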

src/sagemaker/job.py

Lines changed: 10 additions & 1 deletion
@@ -59,12 +59,14 @@ def _load_config(inputs, estimator):
5959
estimator.train_instance_type,
6060
estimator.train_volume_size)
6161
stop_condition = _Job._prepare_stop_condition(estimator.train_max_run)
62+
vpc_config = _Job._prepare_vpc_config(estimator.subnets, estimator.security_group_ids)
6263

6364
return {'input_config': input_config,
6465
'role': role,
6566
'output_config': output_config,
6667
'resource_config': resource_config,
67-
'stop_condition': stop_condition}
68+
'stop_condition': stop_condition,
69+
'vpc_config': vpc_config}
6870

6971
@staticmethod
7072
def _format_inputs_to_input_config(inputs):
@@ -143,6 +145,13 @@ def _prepare_resource_config(instance_count, instance_type, volume_size):
143145
'InstanceType': instance_type,
144146
'VolumeSizeInGB': volume_size}
145147

148+
@staticmethod
149+
def _prepare_vpc_config(subnets, security_group_ids):
150+
if subnets is None or security_group_ids is None:
151+
return None
152+
return {'Subnets': subnets,
153+
'SecurityGroupIds': security_group_ids}
154+
146155
@staticmethod
147156
def _prepare_stop_condition(max_run):
148157
return {'MaxRuntimeInSeconds': max_run}
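
As a rough illustration of the new helper's behavior, here is the same logic as a free function. The name `prepare_vpc_config` and the example IDs are assumptions made for this sketch; in the commit the logic lives in the static method `_Job._prepare_vpc_config`.

```python
def prepare_vpc_config(subnets, security_group_ids):
    """Return a VpcConfig-style dict, or None unless both lists are given."""
    if subnets is None or security_group_ids is None:
        return None
    return {'Subnets': subnets,
            'SecurityGroupIds': security_group_ids}


# Placeholder IDs for illustration only.
full = prepare_vpc_config(['subnet-0example'], ['sg-0example'])
only_subnets = prepare_vpc_config(['subnet-0example'], None)
neither = prepare_vpc_config(None, None)
```

One consequence of this design is that supplying only one of the two arguments yields `None`, so the training job is created without a VPC config and no error is raised.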

src/sagemaker/mxnet/README.rst

Lines changed: 0 additions & 3 deletions
@@ -543,9 +543,6 @@ The MXNetModel constructor takes the following arguments:
543543
directory with any other training source code dependencies including
544544
tne entry point file. Structure within this directory will be
545545
preserved when training on SageMaker.
546-
- ``enable_cloudwatch_metrics (boolean):`` Optional. If true, training
547-
and hosting containers will generate Cloudwatch metrics under the
548-
AWS/SageMakerContainer namespace.
549546
- ``container_log_level (int):`` Log level to use within the container.
550547
Valid values are defined in the Python logging module.
551548
- ``code_location (str):`` Optional. Name of the S3 bucket where your

src/sagemaker/pytorch/README.rst

Lines changed: 8 additions & 12 deletions
@@ -1,4 +1,3 @@
1-
21
=======================================
32
SageMaker PyTorch Estimators and Models
43
=======================================
@@ -39,24 +38,21 @@ using to submit the script to SageMaker via a ``PyTorch`` Estimator. This will b
3938
Suppose that you already have a PyTorch training script called `pytorch-train.py`.
4039
You can then setup a ``PyTorch`` Estimator with keyword arguments to point to this script and define how SageMaker runs it:
4140

42-
```python
41+
.. code:: python
4342
4443
from sagemaker.pytorch import PyTorch
4544
46-
pytorch_estimator = PyTorch(entry_point="pytorch-train.py",
47-
role="SageMakerRole",
48-
train_instance_type="ml.p3.2xlarge",
45+
pytorch_estimator = PyTorch(entry_point='pytorch-train.py',
46+
role='SageMakerRole',
47+
train_instance_type='ml.p3.2xlarge',
4948
train_instance_count=1)
50-
```
5149
5250
After that, you simply tell the estimator to start a training job and provide an S3 URL
5351
that is the path to your training data within Amazon S3:
5452

55-
```python
56-
57-
pytorch_estimator.fit("s3://bucket/path/to/training/data")
53+
.. code:: python
5854
59-
```
55+
pytorch_estimator.fit('s3://bucket/path/to/training/data')
6056
6157
In the following sections, we'll discuss how to prepare a training script for execution on SageMaker,
6258
then how to run that script on SageMaker using a ``PyTorch`` Estimator.
@@ -443,7 +439,7 @@ the model server receives two pieces of information:
443439
- The request data body, a byte array which is at most 5 MB (5 \* 1024
444440
\* 1024 bytes) in size.
445441

446-
The SageMaker PyTorch model server will invoke an "input_fn" function in your hosting script,
442+
The SageMaker PyTorch model server will invoke an ``input_fn`` function in your hosting script,
447443
passing in this information. If you define an ``input_fn`` function definition,
448444
it should return an object that can be passed to ``predict_fn`` and have the following signature:
449445

@@ -647,7 +643,7 @@ Please refer to:
647643

648644
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk
649645

650-
These are also available in SageMaker Notebook Instance hosted Jupyter notebooks under the "sample notebooks" folder.
646+
These are also available in SageMaker Notebook Instance hosted Jupyter notebooks under the sample notebooks folder.
651647

652648

653649
SageMaker PyTorch Docker Containers

src/sagemaker/session.py

Lines changed: 17 additions & 4 deletions
@@ -29,7 +29,6 @@
2929
from sagemaker.utils import name_from_image, secondary_training_status_message, secondary_training_status_changed
3030
import sagemaker.logs
3131

32-
3332
logging.basicConfig()
3433
LOGGER = logging.getLogger('sagemaker')
3534
LOGGER.setLevel(logging.INFO)
@@ -93,7 +92,12 @@ def _initialize(self, boto_session, sagemaker_client, sagemaker_runtime_client):
9392
self.sagemaker_client = sagemaker_client or self.boto_session.client('sagemaker')
9493
prepend_user_agent(self.sagemaker_client)
9594

96-
self.sagemaker_runtime_client = sagemaker_runtime_client or self.boto_session.client('runtime.sagemaker')
95+
if sagemaker_runtime_client is not None:
96+
self.sagemaker_runtime_client = sagemaker_runtime_client
97+
else:
98+
config = botocore.config.Config(read_timeout=80)
99+
self.sagemaker_runtime_client = self.boto_session.client('runtime.sagemaker', config=config)
100+
97101
prepend_user_agent(self.sagemaker_runtime_client)
98102

99103
self.local_mode = False
@@ -202,7 +206,7 @@ def default_bucket(self):
202206
return self._default_bucket
203207

204208
def train(self, image, input_mode, input_config, role, job_name, output_config,
205-
resource_config, hyperparameters, stop_condition, tags):
209+
resource_config, vpc_config, hyperparameters, stop_condition, tags):
206210
"""Create an Amazon SageMaker training job.
207211
208212
Args:
@@ -228,6 +232,13 @@ def train(self, image, input_mode, input_config, role, job_name, output_config,
228232
* instance_type (str): Type of EC2 instance to use for training, for example, 'ml.c4.xlarge'.
229233
The key in resource_config is 'InstanceType'.
230234
235+
vpc_config (dict): Contains values for VpcConfig:
236+
237+
* subnets (list[str]): List of subnet ids.
238+
The key in vpc_config is 'Subnets'.
239+
* security_group_ids (list[str]): List of security group ids.
240+
The key in vpc_config is 'SecurityGroupIds'.
241+
231242
hyperparameters (dict): Hyperparameters for model training. The hyperparameters are made accessible as
232243
a dict[str, str] to the training code on SageMaker. For convenience, this accepts other types for
233244
keys and values, but ``str()`` will be called to convert them before training.
@@ -259,6 +270,9 @@ def train(self, image, input_mode, input_config, role, job_name, output_config,
259270
if tags is not None:
260271
train_request['Tags'] = tags
261272

273+
if vpc_config is not None:
274+
train_request['VpcConfig'] = vpc_config
275+
262276
LOGGER.info('Creating training-job with name: {}'.format(job_name))
263277
LOGGER.debug('train request: {}'.format(json.dumps(train_request, indent=4)))
264278
self.sagemaker_client.create_training_job(**train_request)
@@ -1018,7 +1032,6 @@ def _deployment_entity_exists(describe_fn):
10181032

10191033

10201034
def _train_done(sagemaker_client, job_name, last_desc):
1021-
10221035
in_progress_statuses = ['InProgress', 'Created']
10231036

10241037
desc = sagemaker_client.describe_training_job(TrainingJobName=job_name)
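
The `train()` change above adds `VpcConfig` to the request only when a config is present, the same pattern already used for `Tags`. The sketch below reduces the request to a single required key to show that pattern; `build_train_request` is a hypothetical helper, and the real method assembles many more fields before calling `create_training_job`.

```python
def build_train_request(job_name, vpc_config=None, tags=None):
    """Build a reduced training request; optional keys are added only when set."""
    train_request = {'TrainingJobName': job_name}

    if tags is not None:
        train_request['Tags'] = tags

    if vpc_config is not None:
        train_request['VpcConfig'] = vpc_config

    return train_request


# Placeholder values for illustration only.
without_vpc = build_train_request('example-job')
with_vpc = build_train_request(
    'example-job',
    vpc_config={'Subnets': ['subnet-0example'],
                'SecurityGroupIds': ['sg-0example']})
```

Leaving the key out entirely, rather than sending `None`, keeps the request identical to the pre-change request for callers who do not use a VPC.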

src/sagemaker/tensorflow/README.rst

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@ TensorFlow SageMaker Estimators allow you to run your own TensorFlow
66
training algorithms on SageMaker Learner, and to host your own TensorFlow
77
models on SageMaker Hosting.
88

9-
Supported versions of TensorFlow: ``1.4.1``, ``1.5.0``, ``1.6.0``, ``1.7.0``, ``1.8.0``.
9+
Supported versions of TensorFlow: ``1.4.1``, ``1.5.0``, ``1.6.0``, ``1.7.0``, ``1.8.0``, ``1.9.0``.
1010

1111
Training with TensorFlow
1212
~~~~~~~~~~~~~~~~~~~~~~~~
@@ -833,7 +833,7 @@ SageMaker TensorFlow CPU images use TensorFlow built with Intel® MKL-DNN optimi
833833
In certain cases you might be able to get a better performance by disabling this optimization
834834
(`for example when using small models <https://github.com/awslabs/amazon-sagemaker-examples/blob/d88d1c19861fb7733941969f5a68821d9da2982e/sagemaker-python-sdk/tensorflow_iris_dnn_classifier_using_estimators/iris_dnn_classifier.py#L7-L9>`_)
835835

836-
You can disable MKL-DNN optimization for TensorFlow ``1.8.0`` by setting two following environment variables:
836+
You can disable MKL-DNN optimization for TensorFlow ``1.8.0`` and above by setting two following environment variables:
837837

838838
.. code:: python
839839

src/sagemaker/tensorflow/defaults.py

Lines changed: 1 addition & 1 deletion
@@ -12,4 +12,4 @@
 # language governing permissions and limitations under the License.
 from __future__ import absolute_import
 
-TF_VERSION = '1.8'
+TF_VERSION = '1.9'
