Skip to content

fix: repack_model script used in pipelines to support source_dir and dependencies #2645

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Oct 12, 2021
Merged

fix: repack_model script used in pipelines to support source_dir and dependencies #2645

merged 15 commits into from
Oct 12, 2021

Conversation

staubhp
Copy link
Contributor

@staubhp staubhp commented Sep 17, 2021

Issue #, if available:
N/A

Description of changes:
When creating a model, users can optionally specify a custom entry point as well as a source directory and a list of dependencies. These custom scripts are "repacked" into the model tarball by the SDK.

Because Pipelines execute outside of the context of the SDK, they add a "RepackModel" step when users are registering a model that uses script-mode. This repack step starts a training job that runs the _repack_model.py script which puts the custom scripts into the tarball.

Previously, this script only handled custom entry_points, allowing users to customize inference code but not allowing them to add custom dependencies. This change adds support for repacking the dependencies and source_dir args the same way that the SDK does.

Testing done:
Unit, manual.

Note that the code being tested is a training job script that works with absolute root filepaths. For that reason, the unit tests are not enabled because they will fail in most cases due to permissions to the /opt/ directory.

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

  • I have read the CONTRIBUTING doc
  • I used the commit message format described in CONTRIBUTING
  • I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
  • I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

  • I have added tests that prove my fix is effective or that my feature works (if appropriate)
  • I have checked that my tests are not configured for a specific region or account (if appropriate)
  • I have used unique_name_from_base to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Payton Staub added 2 commits September 17, 2021 14:22
…encies or source_dir arguments, which are used for script-mode in models
@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: db68957
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: ecbc5e5
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: bf0899b
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: bf0899b
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 1326181
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 70e22f5
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 0d38b3a
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@staubhp staubhp requested a review from danabens September 20, 2021 17:07
Args:
inference_script (str): The path to the custom entry point.
model_archive (str): The name of the model TAR archive.
dependencies (str): A space-delimited string of paths to custom dependencies.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit this description makes more sense on a cli argument than a method parameter

if os.path.isdir(actual_dependency_path):
shutil.copytree(
actual_dependency_path,
os.path.join(lib_dir, os.path.basename(actual_dependency_path)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is os.path.join needed here but not on line 77?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.path.join is being used for both calls, but on line 77 it's being done in the assignment of the s and d variables

Comment on lines +103 to +106
parser.add_argument("--inference_script", type=str, default="inference.py")
parser.add_argument("--dependencies", type=str, default=None)
parser.add_argument("--source_dir", type=str, default=None)
parser.add_argument("--model_archive", type=str, default="model.tar.gz")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this also has a help param if that would benefit users

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. I don't think it would be much use since this script is intended for, and will only work in, a training job

)

# repack
_repack_model.repack(inference_script="inference.py", model_archive=model_tar_name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, would be nice to also cover __main__

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the reasons I moved the logic to a function was because I couldn't figure out how to invoke the __main__ of a module from within python :) As far as I understand,__name__ is only set to __main__ if it's called from a command line. Suggestions welcome

danabens
danabens previously approved these changes Sep 20, 2021
@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 7ffed8c
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: c6c1f0a
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@shreyapandit shreyapandit marked this pull request as draft September 21, 2021 03:24
@staubhp staubhp marked this pull request as ready for review September 24, 2021 18:07
danabens
danabens previously approved these changes Oct 8, 2021
@sagemaker-bot
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: e472fbe
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@shreyapandit shreyapandit changed the title fix: fix _repack_model script used in pipelines not supporting depend… fix: repack_model script used in pipelines to support source_dir and dependencies Oct 12, 2021
@ahsan-z-khan ahsan-z-khan merged commit 50840db into aws:master Oct 12, 2021
EthanShouhanCheng pushed a commit to SissiChenxy/sagemaker-python-sdk that referenced this pull request Jan 11, 2022
…dependencies (aws#2645)

Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Shreya Pandit <[email protected]>
Co-authored-by: Ahsan Khan <[email protected]>
Co-authored-by: Jeniya Tabassum <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants