feature: git support for hosting models #878


Merged
merged 62 commits into from
Jul 8, 2019
10d27c5
add git_config and validate method
Jun 6, 2019
db8652c
Merge branch 'master' of github.com:aws/sagemaker-python-sdk into clo…
Jun 6, 2019
6b78ed4
modify the order of git_config, add tests
Jun 6, 2019
e59bb79
move validate_git_config, add integ test
Jun 8, 2019
7808faa
modify location _git_clone_code called
Jun 10, 2019
2783c4a
add documentation
Jun 11, 2019
db3b69f
Merge branch 'master' of github.com:aws/sagemaker-python-sdk into clo…
Jun 11, 2019
685750a
create git_utils, write test for dependencies
Jun 12, 2019
52e5625
create git_utils, write test for dependencies
Jun 12, 2019
f397850
Update doc/overview.rst
GaryTu1020 Jun 12, 2019
5b8d684
Update doc/overview.rst
GaryTu1020 Jun 12, 2019
a9e2932
add more integ tests
Jun 13, 2019
241ac92
write unit tests for git_utils
Jun 15, 2019
a81859a
fix conflict on overview.rst
Jun 15, 2019
c39c344
delete a line
Jun 15, 2019
068a7b1
modify an assertion in test_with_mxnet
Jun 15, 2019
a8f8731
delete unnecessary comments
Jun 17, 2019
2b1622b
add assertion to some test functions
Jun 17, 2019
28a5c58
remove deploy part in test_git
Jun 17, 2019
0797060
change testing git repo
Jun 17, 2019
55f109e
manually merge with clone_from_github
Jun 17, 2019
e2e5c20
change the testing repo
Jun 17, 2019
3477477
write some test cases for serving code
Jun 18, 2019
7b53f20
write a test
Jun 18, 2019
c6daa5d
correct an error message
Jun 18, 2019
e8bede0
pull master
Jun 18, 2019
e5bd806
stop patching private methods
Jun 18, 2019
4b00749
add tests in test_model.py
Jun 19, 2019
c1bae10
modified overview.rst, add lock for tests
Jun 19, 2019
57e9903
merge with clone_from_github
Jun 19, 2019
2af9b24
slight change to overview.rst
Jun 19, 2019
e15a22d
Merge branch 'master' into clone_from_github
chuyang-deng Jun 19, 2019
b102563
add a comment for lock
Jun 19, 2019
9ae910e
merge with remote branch
Jun 19, 2019
3383bfc
Merge branch 'master' into clone_from_github
GaryTu1020 Jun 20, 2019
9a7f4e1
Merge branch 'master' into clone_from_github
chuyang-deng Jun 20, 2019
0b69c1b
add a integ test with sklearn
Jun 20, 2019
d4bb0bb
merge with master
Jun 21, 2019
e6a01f0
merge with master
Jun 21, 2019
fb6138a
merge with clone_from_github
Jun 21, 2019
0c5e32b
merge aws master
Jun 21, 2019
b6e75d0
merge with master
Jun 21, 2019
6f57c2b
merge with clone_from_github
Jun 21, 2019
3621bd4
merge with master
Jun 21, 2019
f6a2082
merge with clone_from_github
Jun 21, 2019
76fcac8
merge with aws master
Jun 24, 2019
affa29e
add documentation for serving
Jun 24, 2019
61ebe87
merge with aws master
Jun 24, 2019
70ba433
mock git_clone_repo for clone fail tests
Jun 24, 2019
6453c91
delete unnecessary files
Jun 25, 2019
a74d91e
Merge branch 'master' into serving_code
chuyang-deng Jun 25, 2019
33045e8
change infence script for mxnet
Jun 25, 2019
9df9582
address pr commments
Jun 26, 2019
3ba4c1e
Merge branch 'master' into serving_code
GaryTu1020 Jul 2, 2019
e499b3e
merge with aws master and modified _validate_git_config
Jul 2, 2019
e526206
mark test_git_support_with_sklearn as local_mode
Jul 2, 2019
b7e3236
skip the sklearn test for py2 since it is not supported
Jul 3, 2019
7bf31ff
rm .python-version
Jul 3, 2019
54b44bb
Update doc/overview.rst
GaryTu1020 Jul 3, 2019
32a46b1
merge with aws master
Jul 8, 2019
64dc5ac
merge with origin serving_code
Jul 8, 2019
8ba5c96
Merge branch 'master' into serving_code
chuyang-deng Jul 8, 2019
17 changes: 17 additions & 0 deletions doc/overview.rst
@@ -799,6 +799,23 @@ After that, invoke the ``deploy()`` method on the ``Model``:

This returns a predictor the same way an ``Estimator`` does when ``deploy()`` is called. You can now get inferences just like with any other model deployed on Amazon SageMaker.

Git support is also available when you bring your own model, so you can use inference scripts stored in your
Git repositories. The process is similar to using Git support for training jobs: provide ``git_config``
when you create the ``Model`` object, and make ``entry_point``, ``source_dir`` and ``dependencies`` (if needed) relative
paths inside the Git repository:

.. code:: python

git_config = {'repo': 'https://github.com/username/repo-with-training-scripts.git',
'branch': 'branch1',
'commit': '4893e528afa4a790331e1b5286954f073b0f14a2'}

sagemaker_model = MXNetModel(model_data='s3://path/to/model.tar.gz',
role='arn:aws:iam::accid:sagemaker-role',
entry_point='inference.py',
source_dir='mxnet',
git_config=git_config)

A full example is available in the `Amazon SageMaker examples repository <https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/mxnet_mnist_byom>`__.

You can also find this notebook in the **Advanced Functionality** section of the **SageMaker Examples** section in a notebook instance.
14 changes: 8 additions & 6 deletions src/sagemaker/estimator.py
@@ -955,8 +955,8 @@ def __init__(
code_location=None,
image_name=None,
dependencies=None,
git_config=None,
enable_network_isolation=False,
git_config=None,
**kwargs
):
"""Base class initializer. Subclasses which override ``__init__`` should invoke ``super()``
@@ -993,7 +993,7 @@ def __init__(
source_dir (str): Path (absolute or relative) to a directory with any other training
source code dependencies aside from the entry point file (default: None). Structure within this
directory are preserved when training on Amazon SageMaker. If 'git_config' is provided,
source_dir should be a relative location to a directory in the Git repo.
'source_dir' should be a relative location to a directory in the Git repo.
Example:

With the following GitHub repo directory structure:
@@ -1023,6 +1023,8 @@ def __init__(
dependencies (list[str]): A list of paths to directories (absolute or relative) with
any additional libraries that will be exported to the container (default: []).
The library folders will be copied to SageMaker in the same folder where the entrypoint is copied.
If 'git_config' is provided, 'dependencies' should be a list of relative locations to directories
with any additional libraries needed in the Git repo.
Example:

The following call
@@ -1085,12 +1087,12 @@ def _prepare_for_training(self, job_name=None):
super(Framework, self)._prepare_for_training(job_name=job_name)

if self.git_config:
updates = git_utils.git_clone_repo(
updated_paths = git_utils.git_clone_repo(
self.git_config, self.entry_point, self.source_dir, self.dependencies
)
self.entry_point = updates["entry_point"]
self.source_dir = updates["source_dir"]
self.dependencies = updates["dependencies"]
self.entry_point = updated_paths["entry_point"]
self.source_dir = updated_paths["source_dir"]
self.dependencies = updated_paths["dependencies"]

# validate source dir will raise a ValueError if there is something wrong with the
# source directory. We are intentionally not handling it because this is a critical error.
39 changes: 29 additions & 10 deletions src/sagemaker/git_utils.py
@@ -13,6 +13,7 @@
from __future__ import absolute_import

import os
import six
import subprocess
import tempfile

@@ -39,35 +40,46 @@ def git_clone_repo(git_config, entry_point, source_dir=None, dependencies=None):
3. failed to checkout the required commit
ValueError: If 1. entry point specified does not exist in the repo
2. source dir specified does not exist in the repo
3. dependencies specified do not exist in the repo
4. git_config is in bad format

Returns:
dict: A dict that contains the updated values of entry_point, source_dir and dependencies
"""
if entry_point is None:
raise ValueError("Please provide an entry point.")
_validate_git_config(git_config)
repo_dir = tempfile.mkdtemp()
subprocess.check_call(["git", "clone", git_config["repo"], repo_dir])

_checkout_branch_and_commit(git_config, repo_dir)

ret = {"entry_point": entry_point, "source_dir": source_dir, "dependencies": dependencies}
updated_paths = {
"entry_point": entry_point,
"source_dir": source_dir,
"dependencies": dependencies,
}

# check if the cloned repo contains entry point, source directory and dependencies
if source_dir:
if not os.path.isdir(os.path.join(repo_dir, source_dir)):
raise ValueError("Source directory does not exist in the repo.")
if not os.path.isfile(os.path.join(repo_dir, source_dir, entry_point)):
raise ValueError("Entry point does not exist in the repo.")
ret["source_dir"] = os.path.join(repo_dir, source_dir)
updated_paths["source_dir"] = os.path.join(repo_dir, source_dir)
else:
if not os.path.isfile(os.path.join(repo_dir, entry_point)):
if os.path.isfile(os.path.join(repo_dir, entry_point)):
updated_paths["entry_point"] = os.path.join(repo_dir, entry_point)
else:
raise ValueError("Entry point does not exist in the repo.")
ret["entry_point"] = os.path.join(repo_dir, entry_point)

ret["dependencies"] = []
updated_paths["dependencies"] = []
for path in dependencies:
if not os.path.exists(os.path.join(repo_dir, path)):
if os.path.exists(os.path.join(repo_dir, path)):
updated_paths["dependencies"].append(os.path.join(repo_dir, path))
else:
raise ValueError("Dependency {} does not exist in the repo.".format(path))
ret["dependencies"].append(os.path.join(repo_dir, path))
return ret
return updated_paths


def _validate_git_config(git_config):
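The path handling in the hunk above can be tried without cloning anything. The following is a minimal sketch, not the SDK's code: `resolve_repo_paths` is a hypothetical helper that mirrors the checks shown in the diff, and a temporary directory stands in for the cloned repository. One deliberate difference is labeled in the code: a missing `dependencies` list is treated as empty.

```python
import os
import tempfile


def resolve_repo_paths(repo_dir, entry_point, source_dir=None, dependencies=None):
    """Map repo-relative paths onto a local clone directory (hypothetical helper)."""
    resolved = {"entry_point": entry_point, "source_dir": source_dir, "dependencies": []}

    if source_dir:
        abs_source_dir = os.path.join(repo_dir, source_dir)
        if not os.path.isdir(abs_source_dir):
            raise ValueError("Source directory does not exist in the repo.")
        if not os.path.isfile(os.path.join(abs_source_dir, entry_point)):
            raise ValueError("Entry point does not exist in the repo.")
        # entry_point stays relative to source_dir; only source_dir becomes absolute.
        resolved["source_dir"] = abs_source_dir
    else:
        abs_entry_point = os.path.join(repo_dir, entry_point)
        if not os.path.isfile(abs_entry_point):
            raise ValueError("Entry point does not exist in the repo.")
        resolved["entry_point"] = abs_entry_point

    # Assumption of this sketch: a missing dependency list is treated as empty.
    for path in dependencies or []:
        abs_path = os.path.join(repo_dir, path)
        if not os.path.exists(abs_path):
            raise ValueError("Dependency {} does not exist in the repo.".format(path))
        resolved["dependencies"].append(abs_path)
    return resolved


# Fake "clone": a temporary directory laid out like the docstring example.
repo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(repo_dir, "src"))
with open(os.path.join(repo_dir, "src", "inference.py"), "w") as handle:
    handle.write("# placeholder inference script\n")

paths = resolve_repo_paths(repo_dir, "inference.py", source_dir="src")
```

When `source_dir` is given, only `source_dir` is rewritten to an absolute path and `entry_point` remains relative to it. Without `source_dir`, the entry point itself is rewritten.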
@@ -84,6 +96,13 @@ def _validate_git_config(git_config):
"""
if "repo" not in git_config:
raise ValueError("Please provide a repo for git_config.")
allowed_keys = ["repo", "branch", "commit"]
for key in allowed_keys:
if key in git_config and not isinstance(git_config[key], six.string_types):
raise ValueError("'{}' should be a string".format(key))
for key in git_config:
if key not in allowed_keys:
raise ValueError("Unexpected argument(s) provided for git_config!")


def _checkout_branch_and_commit(git_config, repo_dir):
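The validation rules added in this hunk are: `repo` is required, only `repo`, `branch` and `commit` are accepted, and every value must be a string. They can be exercised with a standalone sketch. `validate_git_config` and `is_valid` below are illustrative names, not the SDK's private function. The sketch assumes Python 3 and checks against `str` instead of `six.string_types`. It also checks each key in a single loop, so which error message appears first may differ from the diff.

```python
ALLOWED_KEYS = ("repo", "branch", "commit")


def validate_git_config(git_config):
    """Raise ValueError if git_config breaks any of the three rules (sketch)."""
    if "repo" not in git_config:
        raise ValueError("Please provide a repo for git_config.")
    for key, value in git_config.items():
        if key not in ALLOWED_KEYS:
            raise ValueError("Unexpected argument(s) provided for git_config!")
        # Assumption: Python 3 only, so str replaces six.string_types.
        if not isinstance(value, str):
            raise ValueError("'{}' should be a string".format(key))


def is_valid(git_config):
    """Convenience wrapper that turns the exception into a boolean."""
    try:
        validate_git_config(git_config)
        return True
    except ValueError:
        return False


full_config = {
    "repo": "https://github.com/aws/sagemaker-python-sdk.git",
    "branch": "test-branch-git-config",
    "commit": "ae15c9d7d5b97ea95ea451e4662ee43da3401d73",
}
repo_only = {"repo": "https://github.com/aws/sagemaker-python-sdk.git"}
missing_repo = {"branch": "test-branch-git-config"}
unknown_key = {"repo": "https://github.com/aws/sagemaker-python-sdk.git", "tag": "v1"}
non_string = {"repo": "https://github.com/aws/sagemaker-python-sdk.git", "commit": 123}
```

Here `full_config` and `repo_only` pass, while the other three configurations are rejected with a `ValueError`.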
@@ -95,8 +114,8 @@ def _checkout_branch_and_commit(git_config, repo_dir):
repo_dir (str): the directory where the repo is cloned

Raises:
ValueError: If 1. entry point specified does not exist in the repo
2. source dir specified does not exist in the repo
CalledProcessError: If 1. failed to checkout the required branch
2. failed to checkout the required commit
"""
if "branch" in git_config:
subprocess.check_call(args=["git", "checkout", git_config["branch"]], cwd=str(repo_dir))
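To make the order of Git operations explicit, here is a small sketch that only builds the command lists and never runs `git`. `planned_git_commands` is a hypothetical helper. The clone and branch checkout follow the calls visible in the diff. The commit checkout is an assumption based on the docstring, because that part of the function is not shown in the hunk.

```python
def planned_git_commands(git_config, repo_dir):
    """Return (argv, cwd) pairs in the order they would be executed (sketch)."""
    commands = [(["git", "clone", git_config["repo"], repo_dir], None)]
    if "branch" in git_config:
        commands.append((["git", "checkout", git_config["branch"]], repo_dir))
    if "commit" in git_config:
        # Assumed from the docstring: the commit is checked out after the branch.
        commands.append((["git", "checkout", git_config["commit"]], repo_dir))
    return commands


commands = planned_git_commands(
    {
        "repo": "https://github.com/aws/sagemaker-python-sdk.git",
        "branch": "test-branch-git-config",
        "commit": "ae15c9d7d5b97ea95ea451e4662ee43da3401d73",
    },
    "/tmp/repo_dir",
)
clone_only = planned_git_commands(
    {"repo": "https://github.com/aws/sagemaker-python-sdk.git"}, "/tmp/repo_dir"
)
```

With only `repo` set, the plan contains the clone alone, which leaves the working tree on the repository's default branch.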
58 changes: 53 additions & 5 deletions src/sagemaker/model.py
@@ -17,7 +17,7 @@
import os

import sagemaker
from sagemaker import fw_utils, local, session, utils
from sagemaker import fw_utils, local, session, utils, git_utils
from sagemaker.fw_utils import UploadedCode
from sagemaker.transformer import Transformer

@@ -494,6 +494,7 @@ def __init__(
code_location=None,
sagemaker_session=None,
dependencies=None,
git_config=None,
**kwargs
):
"""Initialize a ``FrameworkModel``.
@@ -504,15 +505,54 @@ def __init__(
role (str): An IAM role name or ARN for SageMaker to access AWS resources on your behalf.
entry_point (str): Path (absolute or relative) to the Python source file which should be executed
as the entry point to model hosting. This should be compatible with either Python 2.7 or Python 3.5.
If 'git_config' is provided, 'entry_point' should be a relative location to the Python source file in
the Git repo.
Example:

With the following GitHub repo directory structure:

>>> |----- README.md
>>> |----- src
>>> |----- inference.py
>>> |----- test.py

You can assign entry_point='src/inference.py'.
git_config (dict[str, str]): Git configurations used for cloning files, including 'repo', 'branch'
and 'commit' (default: None).
'branch' and 'commit' are optional. If 'branch' is not specified, 'master' branch will be used. If
'commit' is not specified, the latest commit in the required branch will be used.
Example:

The following config:

>>> git_config = {'repo': 'https://github.com/aws/sagemaker-python-sdk.git',
>>> 'branch': 'test-branch-git-config',
>>> 'commit': '329bfcf884482002c05ff7f44f62599ebc9f445a'}

results in cloning the repo specified in 'repo', then checking out the 'test-branch-git-config' branch,
and then checking out the specified commit.
source_dir (str): Path (absolute or relative) to a directory with any other training
source code dependencies aside from the entry point file (default: None). Structure within this
directory will be preserved when training on SageMaker.
If the directory points to S3, no code will be uploaded and the S3 location will be used instead.
directory will be preserved when training on SageMaker. If 'git_config' is provided,
'source_dir' should be a relative location to a directory in the Git repo. If the directory points
to S3, no code will be uploaded and the S3 location will be used instead.
Example:

With the following GitHub repo directory structure:

>>> |----- README.md
>>> |----- src
>>> |----- inference.py
>>> |----- test.py

You can assign entry_point='inference.py', source_dir='src'.
dependencies (list[str]): A list of paths to directories (absolute or relative) with
any additional libraries that will be exported to the container (default: []).
The library folders will be copied to SageMaker in the same folder where the entrypoint is copied.
If the ```source_dir``` points to S3, code will be uploaded and the S3 location will be used
instead. Example:
If 'git_config' is provided, 'dependencies' should be a list of relative locations to directories
with any additional libraries needed in the Git repo. If the ```source_dir``` points to S3, code
will be uploaded and the S3 location will be used instead.
Example:

The following call
>>> Estimator(entry_point='train.py', dependencies=['my/libs/common', 'virtual-env'])
@@ -554,12 +594,20 @@ def __init__(
self.entry_point = entry_point
self.source_dir = source_dir
self.dependencies = dependencies or []
self.git_config = git_config
self.enable_cloudwatch_metrics = enable_cloudwatch_metrics
self.container_log_level = container_log_level
if code_location:
self.bucket, self.key_prefix = fw_utils.parse_s3_url(code_location)
else:
self.bucket, self.key_prefix = None, None
if self.git_config:
updates = git_utils.git_clone_repo(
self.git_config, self.entry_point, self.source_dir, self.dependencies
)
self.entry_point = updates["entry_point"]
self.source_dir = updates["source_dir"]
self.dependencies = updates["dependencies"]
self.uploaded_code = None
self.repacked_model_data = None

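In this diff the model clones the repository inside `__init__`, whereas the estimator clones in `_prepare_for_training`. Constructing a model with `git_config` therefore triggers the clone immediately. The sketch below illustrates that behavior with a hypothetical stand-in class and an injected fake clone function. Neither `SketchFrameworkModel`, `clone_fn` nor `fake_clone` exists in the SDK, and nothing touches the network or disk.

```python
import os


class SketchFrameworkModel(object):
    """Hypothetical stand-in for a framework model; not the SDK class."""

    def __init__(self, entry_point, source_dir=None, dependencies=None,
                 git_config=None, clone_fn=None):
        self.entry_point = entry_point
        self.source_dir = source_dir
        self.dependencies = dependencies or []
        self.git_config = git_config
        if self.git_config:
            # The clone runs at construction time, and its result replaces the paths.
            updates = clone_fn(
                self.git_config, self.entry_point, self.source_dir, self.dependencies
            )
            self.entry_point = updates["entry_point"]
            self.source_dir = updates["source_dir"]
            self.dependencies = updates["dependencies"]


FAKE_CLONE_DIR = os.path.join(os.sep, "tmp", "fake-clone")


def fake_clone(git_config, entry_point, source_dir=None, dependencies=None):
    """Pretend the repo was cloned to FAKE_CLONE_DIR (illustrative only)."""
    return {
        "entry_point": entry_point,
        "source_dir": os.path.join(FAKE_CLONE_DIR, source_dir),
        "dependencies": [os.path.join(FAKE_CLONE_DIR, path) for path in dependencies],
    }


model = SketchFrameworkModel(
    entry_point="inference.py",
    source_dir="mxnet",
    dependencies=["foo"],
    git_config={"repo": "https://github.com/username/repo-with-training-scripts.git"},
    clone_fn=fake_clone,
)
plain = SketchFrameworkModel(entry_point="inference.py", source_dir="mxnet")
```

Without `git_config`, the paths are stored exactly as passed in, which is the pre-existing behavior.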
78 changes: 74 additions & 4 deletions tests/integ/test_git.py
@@ -21,11 +21,14 @@
from tests.integ import lock as lock
from sagemaker.mxnet.estimator import MXNet
from sagemaker.pytorch.estimator import PyTorch
from sagemaker.sklearn.estimator import SKLearn
from sagemaker.mxnet.model import MXNetModel
from sagemaker.sklearn.model import SKLearnModel
from tests.integ import DATA_DIR, PYTHON_VERSION

GIT_REPO = "https://github.com/aws/sagemaker-python-sdk.git"
BRANCH = "test-branch-git-config"
COMMIT = "329bfcf884482002c05ff7f44f62599ebc9f445a"
COMMIT = "ae15c9d7d5b97ea95ea451e4662ee43da3401d73"

# endpoint tests all use the same port, so we use this lock to prevent concurrent execution
LOCK_PATH = os.path.join(tempfile.gettempdir(), "sagemaker_test_git_lock")
@@ -62,15 +65,16 @@ def test_git_support_with_pytorch(sagemaker_local_session):


@pytest.mark.local_mode
def test_git_support_with_mxnet(sagemaker_local_session, mxnet_full_version):
def test_git_support_with_mxnet(sagemaker_local_session):
script_path = "mnist.py"
data_path = os.path.join(DATA_DIR, "mxnet_mnist")
git_config = {"repo": GIT_REPO, "branch": BRANCH, "commit": COMMIT}
source_dir = "mxnet"
dependencies = ["foo/bar.py"]
mx = MXNet(
entry_point=script_path,
role="SageMakerRole",
source_dir="mxnet",
source_dir=source_dir,
dependencies=dependencies,
framework_version=MXNet.LATEST_VERSION,
py_version=PYTHON_VERSION,
@@ -94,10 +98,76 @@ def test_git_support_with_mxnet(sagemaker_local_session, mxnet_full_version):

with lock.lock(LOCK_PATH):
try:
predictor = mx.deploy(initial_instance_count=1, instance_type="local")
serving_script_path = "mnist_hosting_with_custom_handlers.py"
client = sagemaker_local_session.sagemaker_client
desc = client.describe_training_job(TrainingJobName=mx.latest_training_job.name)
model_data = desc["ModelArtifacts"]["S3ModelArtifacts"]
model = MXNetModel(
model_data,
"SageMakerRole",
entry_point=serving_script_path,
source_dir=source_dir,
dependencies=dependencies,
py_version=PYTHON_VERSION,
sagemaker_session=sagemaker_local_session,
framework_version=MXNet.LATEST_VERSION,
git_config=git_config,
)
predictor = model.deploy(initial_instance_count=1, instance_type="local")

data = numpy.zeros(shape=(1, 1, 28, 28))
result = predictor.predict(data)
assert result is not None
finally:
predictor.delete_endpoint()


@pytest.mark.skipif(PYTHON_VERSION != "py3", reason="Scikit-learn image supports only python 3.")
@pytest.mark.local_mode
def test_git_support_with_sklearn(sagemaker_local_session, sklearn_full_version):
script_path = "mnist.py"
data_path = os.path.join(DATA_DIR, "sklearn_mnist")
git_config = {
"repo": "https://github.com/GaryTu1020/python-sdk-testing.git",
"branch": "branch1",
"commit": "aafa4e96237dd78a015d5df22bfcfef46845c3c5",
}
source_dir = "sklearn"
sklearn = SKLearn(
entry_point=script_path,
role="SageMakerRole",
source_dir=source_dir,
py_version=PYTHON_VERSION,
train_instance_count=1,
train_instance_type="local",
sagemaker_session=sagemaker_local_session,
framework_version=sklearn_full_version,
hyperparameters={"epochs": 1},
git_config=git_config,
)
train_input = "file://" + os.path.join(data_path, "train")
test_input = "file://" + os.path.join(data_path, "test")
sklearn.fit({"train": train_input, "test": test_input})

assert os.path.isdir(sklearn.source_dir)

with lock.lock(LOCK_PATH):
try:
client = sagemaker_local_session.sagemaker_client
desc = client.describe_training_job(TrainingJobName=sklearn.latest_training_job.name)
model_data = desc["ModelArtifacts"]["S3ModelArtifacts"]
model = SKLearnModel(
model_data,
"SageMakerRole",
entry_point=script_path,
source_dir=source_dir,
sagemaker_session=sagemaker_local_session,
git_config=git_config,
)
predictor = model.deploy(1, "local")

data = numpy.zeros((100, 784), dtype="float32")
result = predictor.predict(data)
assert result is not None
finally:
predictor.delete_endpoint()
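The integration tests wrap the deploy step in `lock.lock(LOCK_PATH)` because local endpoints share a port. The implementation of `tests.integ.lock` is not part of this diff, so the following is only a sketch of one way such a lock could work. It assumes a POSIX system, since it uses `fcntl`. `file_lock` and the lock file name are hypothetical.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def file_lock(path):
    """Hold an exclusive advisory lock on `path` for the duration of the block."""
    with open(path, "w") as handle:
        # Blocks until no other process holds the lock on this file.
        fcntl.lockf(handle, fcntl.LOCK_EX)
        try:
            yield handle
        finally:
            fcntl.lockf(handle, fcntl.LOCK_UN)


lock_path = os.path.join(tempfile.gettempdir(), "sketch_git_test_lock")
events = []
with file_lock(lock_path):
    # In the real tests, deploy(), predict() and delete_endpoint() run here.
    events.append("critical section")
```

The lock is advisory: it serializes only processes that take the same lock on the same file.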
8 changes: 4 additions & 4 deletions tests/unit/test_estimator.py
@@ -50,7 +50,7 @@
OUTPUT_PATH = "s3://bucket/prefix"
GIT_REPO = "https://github.com/aws/sagemaker-python-sdk.git"
BRANCH = "test-branch-git-config"
COMMIT = "329bfcf884482002c05ff7f44f62599ebc9f445a"
COMMIT = "ae15c9d7d5b97ea95ea451e4662ee43da3401d73"

DESCRIBE_TRAINING_JOB_RESULT = {"ModelArtifacts": {"S3ModelArtifacts": MODEL_DATA}}
INSTANCE_TYPE = "c4.4xlarge"
@@ -898,12 +898,12 @@ def test_git_support_bad_repo_url_format(sagemaker_session):


@patch(
"subprocess.check_call",
"sagemaker.git_utils.git_clone_repo",
side_effect=subprocess.CalledProcessError(
returncode=1, cmd="git clone https://github.com/aws/no-such-repo.git"
returncode=1, cmd="git clone https://github.com/aws/no-such-repo.git /tmp/repo_dir"
),
)
def test_git_support_git_clone_fail(check_call, sagemaker_session):
def test_git_support_git_clone_fail(sagemaker_session):
git_config = {"repo": "https://github.com/aws/no-such-repo.git", "branch": BRANCH}
fw = DummyFramework(
entry_point="entry_point",