-
Notifications
You must be signed in to change notification settings - Fork 1.2k
fix: repack_model script used in pipelines to support source_dir and dependencies #2645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…encies or source_dir arguments, which are used for script-mode in models
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
Args: | ||
inference_script (str): The path to the custom entry point. | ||
model_archive (str): The name of the model TAR archive. | ||
dependencies (str): A space-delimited string of paths to custom dependencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit this description makes more sense on a cli argument than a method parameter
if os.path.isdir(actual_dependency_path): | ||
shutil.copytree( | ||
actual_dependency_path, | ||
os.path.join(lib_dir, os.path.basename(actual_dependency_path)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is os.path.join
needed here but not on line 77?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
os.path.join
is being used for both calls, but on line 77 it's being done in the assignment of the s
and d
variables
parser.add_argument("--inference_script", type=str, default="inference.py") | ||
parser.add_argument("--dependencies", type=str, default=None) | ||
parser.add_argument("--source_dir", type=str, default=None) | ||
parser.add_argument("--model_archive", type=str, default="model.tar.gz") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this also has a help
param if that would benefit users
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. I don't think it would be much use since this script is intended for, and will only work in, a training job
) | ||
|
||
# repack | ||
_repack_model.repack(inference_script="inference.py", model_archive=model_tar_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, would be nice to also cover __main__
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the reasons I moved the logic to a function was because I couldn't figure out how to invoke the __main__
of a module from within python :) As far as I understand,__name__
is only set to __main__
if it's called from a command line. Suggestions welcome
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
…dependencies (aws#2645) Co-authored-by: Payton Staub <[email protected]> Co-authored-by: Shreya Pandit <[email protected]> Co-authored-by: Ahsan Khan <[email protected]> Co-authored-by: Jeniya Tabassum <[email protected]>
Issue #, if available:
N/A
Description of changes:
When creating a model, users can optionally specify a custom entry point as well as a source directory and a list of dependencies. These custom scripts are "repacked" into the model tarball by the SDK.
Because Pipelines execute outside of the context of the SDK, they add a "RepackModel" step when users are registering a model that uses script-mode. This repack step starts a training job that runs the
_repack_model.py
script which puts the custom scripts into the tarball.Previously, this script only handled custom entry_points, allowing users to customize inference code but not allowing them to add custom dependencies. This change adds support for repacking the
dependencies
andsource_dir
args the same way that the SDK does.Testing done:
Unit, manual.
Note that the code being tested is a training job script that works with absolute root filepaths. For that reason, the unit tests are not enabled because they will fail in most cases due to permissions to the
/opt/
directory.Merge Checklist
Put an
x
in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.General
Tests
unique_name_from_base
to create resource names in integ tests (if appropriate)By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.