Add support for additional libraries in the Estimator #498


Merged: 7 commits, Nov 19, 2018
Changes from 3 commits
5 changes: 3 additions & 2 deletions CHANGELOG.rst
@@ -2,9 +2,10 @@
CHANGELOG
=========

-1.15.1.dev
-==========
+1.15.1dev
+=========

+* feature: Estimators: lib_dirs attribute allows export of additional libraries into the container
* feature: Add APIs to export Airflow transform and deploy config

1.15.0
17 changes: 17 additions & 0 deletions src/sagemaker/chainer/README.rst
@@ -149,6 +149,23 @@ The following are optional arguments. When you create a ``Chainer`` object, you
other training source code dependencies including the entry point
file. Structure within this directory will be preserved when training
on SageMaker.
- ``lib_dirs (list[str])`` A list of paths to directories (absolute or relative) with
Contributor: We may want to call this ``libs`` as the entries do not need to be directories.

Author: sure

Author: Renaming it to dependencies.

any additional libraries that will be exported to the container (default: []).
The library folders will be copied to SageMaker in the same folder where the entry point is copied.
If ``source_dir`` points to S3, no code will be uploaded and the S3 location will be used
instead. Example:

The following call
>>> Estimator(entry_point='train.py', lib_dirs=['my/libs/common', 'virtual-env'])
Contributor: This should be ``Chainer()``, not ``Estimator()``. Applies to the other ones as well.

results in the following inside the container:

>>> $ ls

>>> opt/ml/code
>>> ├── train.py
>>> ├── common
>>> └── virtual-env
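The copy behavior described above can be sketched as a small path computation (illustrative only, not SDK code; `container_layout` is a hypothetical helper showing which names land next to the entry point):

```python
import os

def container_layout(entry_point, lib_dirs):
    """Sketch: names that end up under /opt/ml/code are the entry point's
    basename plus the basename of each lib_dirs entry, since each listed
    directory is copied whole, next to the entry point."""
    names = [os.path.basename(entry_point)]
    names += [os.path.basename(d.rstrip('/')) for d in lib_dirs]
    return names

print(container_layout('train.py', ['my/libs/common', 'virtual-env']))
# prints ['train.py', 'common', 'virtual-env']
```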

- ``hyperparameters`` Hyperparameters that will be used for training.
Will be made accessible as a dict[str, str] to the training code on
SageMaker. For convenience, accepts other types besides str, but
2 changes: 1 addition & 1 deletion src/sagemaker/chainer/estimator.py
@@ -133,7 +133,7 @@ def create_model(self, model_server_workers=None, role=None, vpc_config_override
py_version=self.py_version, framework_version=self.framework_version,
model_server_workers=model_server_workers, image=self.image_name,
sagemaker_session=self.sagemaker_session,
-vpc_config=self.get_vpc_config(vpc_config_override))
+vpc_config=self.get_vpc_config(vpc_config_override), lib_dirs=self.lib_dirs)

@classmethod
def _prepare_init_params_from_job_description(cls, job_details, model_channel_name=None):
22 changes: 20 additions & 2 deletions src/sagemaker/estimator.py
@@ -637,7 +637,7 @@ class Framework(EstimatorBase):
LAUNCH_PS_ENV_NAME = 'sagemaker_parameter_server_enabled'

def __init__(self, entry_point, source_dir=None, hyperparameters=None, enable_cloudwatch_metrics=False,
-container_log_level=logging.INFO, code_location=None, image_name=None, **kwargs):
+container_log_level=logging.INFO, code_location=None, image_name=None, lib_dirs=None, **kwargs):
"""Base class initializer. Subclasses which override ``__init__`` should invoke ``super()``

Args:
@@ -646,6 +646,22 @@ def __init__(self, entry_point, source_dir=None, hyperparameters=None, enable_cl
source_dir (str): Path (absolute or relative) to a directory with any other training
source code dependencies aside from the entry point file (default: None). Structure within this
directory is preserved when training on Amazon SageMaker.
lib_dirs (list[str]): A list of paths to directories (absolute or relative) with
any additional libraries that will be exported to the container (default: []).
The library folders will be copied to SageMaker in the same folder where the entry point is copied.
Example:

The following call
>>> Estimator(entry_point='train.py', lib_dirs=['my/libs/common', 'virtual-env'])
results in the following inside the container:

>>> $ ls

>>> opt/ml/code
>>> ├── train.py
>>> ├── common
>>> └── virtual-env

hyperparameters (dict): Hyperparameters that will be used for training (default: None).
The hyperparameters are made accessible as a dict[str, str] to the training code on SageMaker.
For convenience, this accepts other types for keys and values, but ``str()`` will be called
@@ -663,6 +679,7 @@ def __init__(self, entry_point, source_dir=None, hyperparameters=None, enable_cl
"""
super(Framework, self).__init__(**kwargs)
self.source_dir = source_dir
self.lib_dirs = lib_dirs or []
self.entry_point = entry_point
if enable_cloudwatch_metrics:
warnings.warn('enable_cloudwatch_metrics is now deprecated and will be removed in the future.',
@@ -729,7 +746,8 @@ def _stage_user_code_in_s3(self):
bucket=code_bucket,
s3_key_prefix=code_s3_prefix,
script=self.entry_point,
-directory=self.source_dir)
+directory=self.source_dir,
+lib_dirs=self.lib_dirs)

def _model_source_dir(self):
"""Get the appropriate value to pass as source_dir to model constructor on deploying
47 changes: 29 additions & 18 deletions src/sagemaker/fw_utils.py
@@ -14,11 +14,15 @@

import os
import re
import shutil
import tempfile
from collections import namedtuple
from six.moves.urllib.parse import urlparse

import sagemaker.utils

_TAR_SOURCE_FILENAME = 'source.tar.gz'

UploadedCode = namedtuple('UserCode', ['s3_prefix', 'script_name'])
"""sagemaker.fw_utils.UserCode: An object containing the S3 prefix and script name.

@@ -107,7 +111,7 @@ def validate_source_dir(script, directory):
return True


-def tar_and_upload_dir(session, bucket, s3_key_prefix, script, directory):
+def tar_and_upload_dir(session, bucket, s3_key_prefix, script, directory, lib_dirs=None):
"""Pack and upload source files to S3 only if directory is empty or local.

Note:
@@ -118,31 +122,38 @@ def tar_and_upload_dir(session, bucket, s3_key_prefix, script, directory):
bucket (str): S3 bucket to which the compressed file is uploaded.
s3_key_prefix (str): Prefix for the S3 key.
script (str): Script filename.
-directory (str): Directory containing the source file. If it starts with "s3://", no action is taken.
+directory (str or None): Directory containing the source file. If it starts with
+    "s3://", no action is taken.
+lib_dirs (List[str]): A list of paths to directories (absolute or relative)
+    containing additional libraries that will be copied into /opt/ml/lib

Returns:
sagemaker.fw_utils.UserCode: An object with the S3 bucket and key (S3 prefix) and script name.
"""
-    if directory:
-        if directory.lower().startswith("s3://"):
-            return UploadedCode(s3_prefix=directory, script_name=os.path.basename(script))
-        else:
-            script_name = script
-            source_files = [os.path.join(directory, name) for name in os.listdir(directory)]
-    else:
-        # If no directory is specified, the script parameter needs to be a valid relative path.
-        os.path.exists(script)
-        script_name = os.path.basename(script)
-        source_files = [script]
-
-    s3 = session.resource('s3')
-    key = '{}/{}'.format(s3_key_prefix, 'sourcedir.tar.gz')
-
-    tar_file = sagemaker.utils.create_tar_file(source_files)
-    s3.Object(bucket, key).upload_file(tar_file)
-    os.remove(tar_file)
-
-    return UploadedCode(s3_prefix='s3://{}/{}'.format(bucket, key), script_name=script_name)
+    lib_dirs = lib_dirs or []
+    key = '%s/sourcedir.tar.gz' % s3_key_prefix
+
+    if directory and directory.lower().startswith('s3://'):
+        return UploadedCode(s3_prefix=directory, script_name=os.path.basename(script))
+
+    tmp = tempfile.mkdtemp()
+
+    try:
+        source_files = _list_files_to_compress(script, directory) + lib_dirs
+        tar_file = sagemaker.utils.create_tar_file(source_files, os.path.join(tmp, _TAR_SOURCE_FILENAME))
+
+        session.resource('s3').Object(bucket, key).upload_file(tar_file)
+    finally:
+        shutil.rmtree(tmp)
+
+    script_name = script if directory else os.path.basename(script)
+
+    return UploadedCode(s3_prefix='s3://%s/%s' % (bucket, key), script_name=script_name)
+
+
+def _list_files_to_compress(script, directory):
+    basedir = directory if directory else os.path.dirname(script)
+    return [os.path.join(basedir, name) for name in os.listdir(basedir)]
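The bundling step can be sketched end to end as follows (illustrative reimplementation, not the SDK source verbatim; `make_source_bundle` is a hypothetical helper and the S3 upload is omitted — the real function also streams the tarball to S3 before deleting the temp directory):

```python
import os
import tarfile

def make_source_bundle(script, directory, lib_dirs, out_dir):
    """Gather the source files (everything in `directory`, or the script's
    own directory when none is given) plus each lib_dirs entry, and pack
    them into sourcedir.tar.gz under out_dir."""
    basedir = directory if directory else os.path.dirname(script) or '.'
    sources = [os.path.join(basedir, name) for name in os.listdir(basedir)]
    sources += lib_dirs
    tar_path = os.path.join(out_dir, 'sourcedir.tar.gz')
    with tarfile.open(tar_path, mode='w:gz') as t:
        for path in sources:
            # arcname keeps only the basename, matching the flat layout
            # the README describes inside /opt/ml/code.
            t.add(path, arcname=os.path.basename(path.rstrip('/')))
    return tar_path
```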


def framework_name_from_image(image_name):
23 changes: 21 additions & 2 deletions src/sagemaker/model.py
@@ -127,7 +127,7 @@ class FrameworkModel(Model):

def __init__(self, model_data, image, role, entry_point, source_dir=None, predictor_cls=None, env=None, name=None,
enable_cloudwatch_metrics=False, container_log_level=logging.INFO, code_location=None,
-sagemaker_session=None, **kwargs):
+sagemaker_session=None, lib_dirs=None, **kwargs):
"""Initialize a ``FrameworkModel``.

Args:
@@ -140,6 +140,23 @@ def __init__(self, model_data, image, role, entry_point, source_dir=None, predic
source code dependencies aside from the entry point file (default: None). Structure within this
directory will be preserved when training on SageMaker.
If the directory points to S3, no code will be uploaded and the S3 location will be used instead.
lib_dirs (list[str]): A list of paths to directories (absolute or relative) with
any additional libraries that will be exported to the container (default: []).
The library folders will be copied to SageMaker in the same folder where the entry point is copied.
If ``source_dir`` points to S3, no code will be uploaded and the S3 location will be used
instead. Example:

The following call
>>> Estimator(entry_point='train.py', lib_dirs=['my/libs/common', 'virtual-env'])
results in the following inside the container:

>>> $ ls

>>> opt/ml/code
>>> ├── train.py
>>> ├── common
>>> └── virtual-env

predictor_cls (callable[string, sagemaker.session.Session]): A function to call to create
a predictor (default: None). If not None, ``deploy`` will return the result of invoking
this function on the created endpoint name.
@@ -160,6 +177,7 @@ def __init__(self, model_data, image, role, entry_point, source_dir=None, predic
sagemaker_session=sagemaker_session, **kwargs)
self.entry_point = entry_point
self.source_dir = source_dir
self.lib_dirs = lib_dirs or []
self.enable_cloudwatch_metrics = enable_cloudwatch_metrics
self.container_log_level = container_log_level
if code_location:
@@ -194,7 +212,8 @@ def _upload_code(self, key_prefix):
bucket=self.bucket or self.sagemaker_session.default_bucket(),
s3_key_prefix=key_prefix,
script=self.entry_point,
-directory=self.source_dir)
+directory=self.source_dir,
+lib_dirs=self.lib_dirs)

def _framework_env_vars(self):
if self.uploaded_code:
17 changes: 17 additions & 0 deletions src/sagemaker/mxnet/README.rst
@@ -271,6 +271,23 @@ The following are optional arguments. When you create an ``MXNet`` object, you c
other training source code dependencies including the entry point
file. Structure within this directory will be preserved when training
on SageMaker.
- ``lib_dirs (list[str])`` A list of paths to directories (absolute or relative) with
any additional libraries that will be exported to the container (default: []).
The library folders will be copied to SageMaker in the same folder where the entry point is copied.
If ``source_dir`` points to S3, no code will be uploaded and the S3 location will be used
instead. Example:

The following call
>>> Estimator(entry_point='train.py', lib_dirs=['my/libs/common', 'virtual-env'])
results in the following inside the container:

>>> $ ls

>>> opt/ml/code
>>> ├── train.py
>>> ├── common
>>> └── virtual-env

- ``hyperparameters`` Hyperparameters that will be used for training.
Will be made accessible as a dict[str, str] to the training code on
SageMaker. For convenience, accepts other types besides str, but
2 changes: 1 addition & 1 deletion src/sagemaker/mxnet/estimator.py
@@ -115,7 +115,7 @@ def create_model(self, model_server_workers=None, role=None, vpc_config_override
container_log_level=self.container_log_level, code_location=self.code_location,
py_version=self.py_version, framework_version=self.framework_version, image=self.image_name,
model_server_workers=model_server_workers, sagemaker_session=self.sagemaker_session,
-vpc_config=self.get_vpc_config(vpc_config_override))
+vpc_config=self.get_vpc_config(vpc_config_override), lib_dirs=self.lib_dirs)

@classmethod
def _prepare_init_params_from_job_description(cls, job_details, model_channel_name=None):
17 changes: 17 additions & 0 deletions src/sagemaker/pytorch/README.rst
@@ -175,6 +175,23 @@ The following are optional arguments. When you create a ``PyTorch`` object, you
other training source code dependencies including the entry point
file. Structure within this directory will be preserved when training
on SageMaker.
- ``lib_dirs (list[str])`` A list of paths to directories (absolute or relative) with
any additional libraries that will be exported to the container (default: []).
The library folders will be copied to SageMaker in the same folder where the entry point is copied.
If ``source_dir`` points to S3, no code will be uploaded and the S3 location will be used
instead. Example:

The following call
>>> Estimator(entry_point='train.py', lib_dirs=['my/libs/common', 'virtual-env'])
results in the following inside the container:

>>> $ ls

>>> opt/ml/code
>>> ├── train.py
>>> ├── common
>>> └── virtual-env

- ``hyperparameters`` Hyperparameters that will be used for training.
Will be made accessible as a dict[str, str] to the training code on
SageMaker. For convenience, accepts other types besides strings, but
2 changes: 1 addition & 1 deletion src/sagemaker/pytorch/estimator.py
@@ -96,7 +96,7 @@ def create_model(self, model_server_workers=None, role=None, vpc_config_override
container_log_level=self.container_log_level, code_location=self.code_location,
py_version=self.py_version, framework_version=self.framework_version, image=self.image_name,
model_server_workers=model_server_workers, sagemaker_session=self.sagemaker_session,
-vpc_config=self.get_vpc_config(vpc_config_override))
+vpc_config=self.get_vpc_config(vpc_config_override), lib_dirs=self.lib_dirs)

@classmethod
def _prepare_init_params_from_job_description(cls, job_details, model_channel_name=None):
17 changes: 17 additions & 0 deletions src/sagemaker/tensorflow/README.rst
@@ -409,6 +409,23 @@ you can specify these as keyword arguments.
other training source code dependencies including the entry point
file. Structure within this directory will be preserved when training
on SageMaker.
- ``lib_dirs (list[str])`` A list of paths to directories (absolute or relative) with
any additional libraries that will be exported to the container (default: []).
The library folders will be copied to SageMaker in the same folder where the entry point is copied.
If ``source_dir`` points to S3, no code will be uploaded and the S3 location will be used
instead. Example:

The following call
>>> Estimator(entry_point='train.py', lib_dirs=['my/libs/common', 'virtual-env'])
results in the following inside the container:

>>> $ ls

>>> opt/ml/code
>>> ├── train.py
>>> ├── common
>>> └── virtual-env

- ``requirements_file (str)`` Path to a ``requirements.txt`` file. The path should
be within and relative to ``source_dir``. This is a file containing a list of items to be
installed using pip install. Details on the format can be found in the
3 changes: 2 additions & 1 deletion src/sagemaker/tensorflow/estimator.py
@@ -411,7 +411,8 @@ def _create_default_model(self, model_server_workers, role, vpc_config_override)
framework_version=self.framework_version,
model_server_workers=model_server_workers,
sagemaker_session=self.sagemaker_session,
vpc_config=self.get_vpc_config(vpc_config_override))
vpc_config=self.get_vpc_config(vpc_config_override),
lib_dirs=self.lib_dirs)

def hyperparameters(self):
"""Return hyperparameters used by your custom TensorFlow code during model training."""
40 changes: 40 additions & 0 deletions tests/data/pytorch_source_dirs/train.py
@@ -0,0 +1,40 @@
# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.
# import os
# import sys
# import tarfile
#
# lib_dir = '/opt/ml/lib'
#
# if not os.path.exists(lib_dir):
# os.makedirs(lib_dir)
#
# with tarfile.open(name=os.path.join(os.path.dirname(__file__), 'opt_ml_lib.tar.gz'), mode='r:gz') as t:
# t.extractall(path=lib_dir)
#
# sys.path.insert(0, lib_dir)

import alexa


def model_fn(anything):
return alexa


def predict_fn(input_object, model):
return input_object


if __name__ == '__main__':
with open('/opt/ml/model/answer', 'w') as model:
model.write(str(alexa.question('How many roads must a man walk down?')))
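The commented-out block at the top of this test file sketches a bootstrap that extracts a bundled library tarball onto the import path. A runnable version of that pattern (hypothetical helper name, not part of the PR; the tarball path and lib dir are whatever the caller passes in):

```python
import os
import sys
import tarfile

def load_bundled_libs(tar_path, lib_dir):
    """Extract a bundled library tarball into lib_dir and prepend
    lib_dir to sys.path so its modules become importable."""
    if not os.path.exists(lib_dir):
        os.makedirs(lib_dir)
    with tarfile.open(name=tar_path, mode='r:gz') as t:
        t.extractall(path=lib_dir)
    sys.path.insert(0, lib_dir)
```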