-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Add Local Mode Batch Inference support. #414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
basic stuff works
It is magic! No, it actually checks how to assemble the already split data to feed into the container.
Fixed all broken tests and code complete. Need to add a few more tests
Codecov Report
@@ Coverage Diff @@
## master #414 +/- ##
==========================================
+ Coverage 93.44% 93.71% +0.27%
==========================================
Files 53 55 +2
Lines 3737 4010 +273
==========================================
+ Hits 3492 3758 +266
- Misses 245 252 +7
Continue to review full report at Codecov.
|
CHANGELOG.rst
Outdated
@@ -6,6 +6,7 @@ CHANGELOG | |||
========= | |||
|
|||
* enhancement: Local Mode: add training environment variables for AWS region and job name | |||
* feature: Add support for Local Mode Batch Inference |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feature: Local Mode: add support for Batch Transform
|
||
BUCKET_NAME = 'some-nice-bucket' | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no tests for copy_directory_structure
or recursive_copy
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some docstrings are missing.
One unit test broke when merging from master. Fixing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might want to just do a pass through all of the docstrings
src/sagemaker/local/data.py
Outdated
else: | ||
files = [self.root_path] | ||
|
||
return files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bump (especially because you do it in get_root_dir
)
environment['SAGEMAKER_BATCH_STRATEGY'] = strategy_env_value | ||
|
||
# we only do 1 max concurrent transform in Local Mode | ||
environment['SAGEMAKER_MAX_CONCURRENT_TRANSFORMS'] = '1' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bump
src/sagemaker/local/utils.py
Outdated
path = parsed_uri.path.strip('/') | ||
sagemaker_session.upload_data(source, bucket, path) | ||
else: | ||
raise ValueError('Invalid destination URI, must be s3:// or file://') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
include the URI in the error message
|
||
def describe_transform_job(self, TransformJobName): | ||
if TransformJobName not in LocalSagemakerClient._transform_jobs: | ||
error_response = {'Error': {'Code': 'ValidationException', 'Message': 'Could not find local transform job'}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd include TransformJobName
in the error message
|
||
|
||
@pytest.mark.continuous_testing | ||
def test_local_transform_mxnet(sagemaker_local_session, tmpdir): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: s/transform/batch_transform (just in case anyone thinks of transform as real-time inference since the input-predict-output cycle is "transform")
@pytest.mark.continuous_testing | ||
def test_local_transform_mxnet(sagemaker_local_session, tmpdir): | ||
data_path = os.path.join(DATA_DIR, 'mxnet_mnist') | ||
script_path = os.path.join(data_path, 'mnist.py') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a lot of these tests seem to do the same boilerplate MXNet/MNIST setup - perhaps something can be refactored?
tests/unit/test_local_data.py
Outdated
|
||
|
||
def test_recordio_splitter(tmpdir): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: remove the extra newline
def test_get_batch_strategy_instance_with_invalid_type(): | ||
with pytest.raises(ValueError): | ||
# something invalid | ||
sagemaker.local.data.get_batch_strategy_instance('NiceRecord', None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
personal preference - I'd just make the string 'SomethingInvalid' instead of writing a comment to explain it
src/sagemaker/local/data.py
Outdated
else: | ||
files = [self.root_path] | ||
|
||
return files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can remove the return files
line now :)
Creating local batch transform jobs will perform all the steps locally. The serving container will be created the same way as it is for serving but then the data will be fed to it and written to the corresponding output location. Data splits and batch strategies are implemented as closely as possible to the behavior in SageMaker so that the end results are consistent.
Add support for Local Mode Batch Inference (Local Transform Jobs).
Creating local transform jobs will perform all the steps locally. The serving container will be created the same way as it is for serving but then the data will be fed to it and written to the corresponding output location. Data splits and batch strategies are implemented as closely as possible to the behavior in SageMaker so that the end results are consistent.
Description of changes:
Merge Checklist
Put an
x
in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.