local mode: support output_path #449

Merged
merged 4 commits into from
Nov 2, 2018

Conversation

iquintero
Contributor

Adds support for output_path in local mode. The path can be either a
file:// or an s3:// URI. This also changes the default behavior of
local mode: if no output_path is passed, it uses the SDK-provided
default S3 bucket. This makes it easier for customers to create models
in SageMaker as well, since their model artifacts will already be a
tarfile in S3.

Issue #, if available:

Description of changes:

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

  • I have read the CONTRIBUTING doc
  • I have added tests that prove my fix is effective or that my feature works (if appropriate)
  • I have updated the changelog with a description of my changes (if appropriate)
  • I have updated any necessary documentation (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@iquintero iquintero requested a review from nadiaya October 29, 2018 22:01
@iquintero iquintero force-pushed the lm_training_output branch 2 times, most recently from 605d48d to 8f8cfc6 Compare October 29, 2018 22:12
@codecov-io

Codecov Report

Merging #449 into master will decrease coverage by 0.07%.
The diff coverage is 91.35%.

@@            Coverage Diff             @@
##           master     #449      +/-   ##
==========================================
- Coverage   93.75%   93.68%   -0.08%     
==========================================
  Files          55       55              
  Lines        4034     4051      +17     
==========================================
+ Hits         3782     3795      +13     
- Misses        252      256       +4
Impacted Files                          Coverage Δ
src/sagemaker/fw_utils.py               100% <100%> (ø) ⬆️
src/sagemaker/local/utils.py            96.96% <100%> (+0.3%) ⬆️
src/sagemaker/local/entities.py         95.23% <100%> (+0.06%) ⬆️
src/sagemaker/estimator.py              89.59% <100%> (+0.07%) ⬆️
src/sagemaker/utils.py                  91.15% <100%> (+0.85%) ⬆️
src/sagemaker/session.py                89.51% <100%> (+0.05%) ⬆️
src/sagemaker/local/local_session.py    88.88% <100%> (ø) ⬆️
src/sagemaker/local/data.py             93.91% <66.66%> (-0.73%) ⬇️
src/sagemaker/local/image.py            89.8% <86.95%> (-1.02%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a5595af...95ad797.

@@ -104,6 +104,9 @@ def __init__(self, role, train_instance_count, train_instance_type,

        self.base_job_name = base_job_name
        self._current_job_name = None
        if (not self.sagemaker_session.local_mode
                and output_path and output_path.startswith('file://')):
            raise RuntimeError('file:// output paths are only supported in Local Mode')

👌
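
For readers skimming the thread, the guard in the hunk above can be exercised in isolation. This is a minimal sketch, not the SDK's code: `validate_output_path` is a hypothetical standalone function, and `local_mode` is passed in as a plain boolean instead of being read from `self.sagemaker_session.local_mode`.

```python
def validate_output_path(local_mode, output_path):
    """Hypothetical helper mirroring the check added in this PR.

    Rejects file:// output paths unless the session is in local mode.
    Returns the path unchanged when it is acceptable (including None).
    """
    if not local_mode and output_path and output_path.startswith('file://'):
        raise RuntimeError('file:// output paths are only supported in Local Mode')
    return output_path


# Local mode accepts both schemes; non-local mode accepts s3:// or no path at all.
validate_output_path(True, 'file:///tmp/output')
validate_output_path(False, 's3://example-bucket/prefix')
validate_output_path(False, None)
```

The bucket name and the local directory in the usage lines are placeholders chosen for illustration.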

model_files = [os.path.join(model_artifacts, name) for name in os.listdir(model_artifacts)]
output_files = [os.path.join(output_artifacts, name) for name in os.listdir(output_artifacts)]
sagemaker.utils.create_tar_file(model_files, os.path.join(compressed_artifacts, 'model.tar.gz'))
sagemaker.utils.create_tar_file(output_files, os.path.join(compressed_artifacts, 'output.tar.gz'))

Should this be a function (all 5 lines)?
It seems a bit repetitive.
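
One way the repeated list-then-tar steps might be factored out is sketched below. The name `compress_directory` is hypothetical, and the sketch uses the standard-library `tarfile` module in place of `sagemaker.utils.create_tar_file` so that it runs on its own; whether the SDK helper lays out archive members the same way is an assumption, not something this thread confirms.

```python
import os
import tarfile


def compress_directory(source_dir, target_dir, archive_name):
    """Hypothetical helper: gzip-tar every entry of source_dir into target_dir.

    Each top-level entry is stored under its own name (no leading directory).
    Returns the path of the archive that was written.
    """
    archive_path = os.path.join(target_dir, archive_name)
    with tarfile.open(archive_path, mode='w:gz') as tar:
        for name in sorted(os.listdir(source_dir)):
            tar.add(os.path.join(source_dir, name), arcname=name)
    return archive_path
```

With such a helper, the two archives in the diff would reduce to two calls, for example `compress_directory(model_artifacts, compressed_artifacts, 'model.tar.gz')` and `compress_directory(output_artifacts, compressed_artifacts, 'output.tar.gz')`.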


return s3_model_artifacts
return os.path.join(output_data, 'model.tar.gz')

Since we moved both files, should we return both (model and output) too?


Not really. SageMaker just returns the S3ModelArtifacts; if you want to look at output.tar.gz, you basically have to do a replace in the string. This value gets sent directly to the local client's describeTrainingJob().
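
The string replacement mentioned in this reply could look roughly like the sketch below. `output_artifacts_path` is a hypothetical helper, not part of the SDK, and it assumes the artifacts location ends in `model.tar.gz` and uses `/` as the separator.

```python
def output_artifacts_path(model_artifacts_path):
    """Hypothetical helper: derive the output.tar.gz location from the
    model.tar.gz location by swapping only the final path component."""
    head, sep, tail = model_artifacts_path.rpartition('/')
    if tail != 'model.tar.gz':
        raise ValueError('expected a path ending in model.tar.gz, got: %s'
                         % model_artifacts_path)
    return head + sep + 'output.tar.gz'


# Placeholder location; a real value would come from the training job description.
sibling = output_artifacts_path('s3://example-bucket/example-job/model.tar.gz')
```

Replacing only the last component, instead of calling `str.replace` on the whole string, avoids touching a bucket or prefix that happens to contain the same text.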

nadiaya previously approved these changes Nov 2, 2018
@iquintero iquintero merged commit 3d091b4 into aws:master Nov 2, 2018