-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feature: enhance-bucket-override-support #3235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature: enhance-bucket-override-support #3235
Conversation
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
Codecov Report
@@ Coverage Diff @@
## master #3235 +/- ##
==========================================
- Coverage 89.75% 88.94% -0.82%
==========================================
Files 645 203 -442
Lines 54915 18108 -36807
==========================================
- Hits 49289 16106 -33183
+ Misses 5626 2002 -3624
Continue to review full report at Codecov.
|
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just want to double check that this will work with our test engine.
Right now, our test engine saves the SDK metadata with this directory structure:
|_manifests
|__models_manifest.json
|-specs
|--community_models
|---<model-id>
|----specs_<version>.json
|---- ...
However, the buckets have this pseudo dir structure:
|_models_manifest.json
|-community_models
|--<model-id>
|---specs_<version>.json
|--- ...
Will the SDK be able to (1) retrieve the manifests (2) retrieve the specs based with only the link to the root dir?
bucket = get_jumpstart_content_bucket(region) | ||
bucket = os.environ.get( | ||
ENV_VARIABLE_JUMPSTART_MODEL_ARTIFACT_BUCKET_OVERRIDE | ||
) or get_jumpstart_content_bucket(region) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please update the docstring to explain that this function may use an environment variable to override the source bucket?
src/sagemaker/jumpstart/cache.py
Outdated
def _get_json_file(self, key: str) -> Tuple[Union[dict, list], Optional[str]]: | ||
"""Returns json file either from s3 or local file system. | ||
|
||
Returns etag along with json object for s3, otherwise just returns json object and None. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returns etag along with json object for s3, or just the json object and None when reading from the local file system.
src/sagemaker/jumpstart/constants.py
Outdated
"AWS_JUMPSTART_MODEL_ARTIFACT_BUCKET_OVERRIDE" | ||
) | ||
ENV_VARIABLE_JUMPSTART_SCRIPT_ARTIFACT_BUCKET_OVERRIDE = ( | ||
"AWS_JUMPSTART_SCRIPT_ARTIFACT_BUCKET_OVERRIDE" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non-blocking: how about "AWS_JUMPSTART_SCRIPTS_BUCKET_OVERRIDE"
src/sagemaker/jumpstart/constants.py
Outdated
@@ -124,5 +124,12 @@ | |||
SUPPORTED_JUMPSTART_SCOPES = set(scope.value for scope in JumpStartScriptScope) | |||
|
|||
ENV_VARIABLE_JUMPSTART_CONTENT_BUCKET_OVERRIDE = "AWS_JUMPSTART_CONTENT_BUCKET_OVERRIDE" | |||
ENV_VARIABLE_JUMPSTART_MODEL_ARTIFACT_BUCKET_OVERRIDE = ( | |||
"AWS_JUMPSTART_MODEL_ARTIFACT_BUCKET_OVERRIDE" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non-blocking: how about: "AWS_JUMPSTART_MODEL_ARTIFACTS_BUCKET_OVERRIDE"
src/sagemaker/jumpstart/constants.py
Outdated
ENV_VARIABLE_JUMPSTART_SCRIPT_ARTIFACT_BUCKET_OVERRIDE = ( | ||
"AWS_JUMPSTART_SCRIPT_ARTIFACT_BUCKET_OVERRIDE" | ||
) | ||
ENV_VARIABLE_JUMPSTART_METADATA_LOCAL_ROOT_OVERRIDE = "AWS_JUMPSTART_METADATA_LOCAL_ROOT_OVERRIDE" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non-blocking: how about "AWS_JUMPSTART_METADATA_LOCAL_DIR_OVERRIDE"
?
"r", | ||
) | ||
assert mocked_open.call_count == 2 | ||
mocked_get_json_file_and_etag_from_s3.assert_not_called() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please add two cases:
- where the root dir doesn't exist
- where the root dir isn't a dir
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
@@ -65,7 +65,7 @@ def __str__(self) -> str: | |||
{'content_bucket': 'bucket', 'region_name': 'us-west-2'}" | |||
""" | |||
|
|||
att_dict = {att: getattr(self, att) for att in self.__slots__} | |||
att_dict = {att: getattr(self, att) for att in self.__slots__ if hasattr(self, att)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious: why do you need this now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this bug got exposed after i fixed a unit test
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
…manifest private (add underscore prefix)
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
Issue #, if available:
Description of changes:
This PR adds support for overriding the bucket which hosts model and script artifacts (the bucket from which metadata is retrieved is either set by the region, or a separate, pre-existing environment variable override).
In addition, this PR provides support for reading metadata files from the local filesystem using environment variable overrides, as oppose to the default of S3.
Testing done:
Merge Checklist
Put an
x
in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.General
Tests
unique_name_from_base
to create resource names in integ tests (if appropriate)By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.