Feat/jsch jumpstart estimator support #4439
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master-jumpstart-curated-hub #4439 +/- ##
===============================================================
Coverage ? 86.83%
===============================================================
Files ? 389
Lines ? 35909
Branches ? 0
===============================================================
Hits ? 31183
Misses ? 4726
Partials ? 0
☔ View full report in Codecov by Sentry.
Also, are some tests (like sagemaker-python-sdk-notebook-tests) expected to fail?
id_info
)
hub = CuratedHub(hub_name=hub_name, region=region)
hub_content = hub.describe_model(model_name=model_name, model_version=model_version)
hub = CuratedHub(hub_name=info.hub_name, region=info.region)
We really need to get rid of this class instantiation in this module. There should be a common utility/middleware to do this.
@jinyoung-lim is going to implement this, but I don't want to step on her toes yet.
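A rough sketch of what such a shared utility could look like, for discussion only. It is not the planned implementation: `HubModelInfo`, `describe_hub_model`, and the stand-in `FakeCuratedHub` are hypothetical names introduced here. Only the call shapes `CuratedHub(hub_name=..., region=...)` and `describe_model(model_name=..., model_version=...)` are taken from the diff.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class HubModelInfo:
    """Hypothetical container mirroring the `info` fields used in the diff."""

    hub_name: str
    region: str
    hub_content_name: str
    hub_content_version: str


def describe_hub_model(info: HubModelInfo, hub_factory: Callable[..., Any]) -> Any:
    """Create the hub object in one place and describe the requested model.

    `hub_factory` is assumed to accept `hub_name` and `region` keyword
    arguments, like the CuratedHub constructor shown in the diff.
    """
    hub = hub_factory(hub_name=info.hub_name, region=info.region)
    return hub.describe_model(
        model_name=info.hub_content_name,
        model_version=info.hub_content_version,
    )


class FakeCuratedHub:
    """Stand-in for CuratedHub so the sketch runs without AWS access."""

    def __init__(self, hub_name: str, region: str) -> None:
        self.hub_name = hub_name
        self.region = region

    def describe_model(self, model_name: str, model_version: str) -> dict:
        # Echo the arguments back instead of calling a real service.
        return {
            "hub_name": self.hub_name,
            "region": self.region,
            "model_name": model_name,
            "model_version": model_version,
        }


info = HubModelInfo("my-hub", "us-west-2", "my-model", "1.0.0")
result = describe_hub_model(info, FakeCuratedHub)
```

Passing the factory as an argument keeps the instantiation out of the calling modules and makes the helper easy to test with a fake.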
hub = CuratedHub(hub_name=info.hub_name, region=info.region)
hub_content = hub.describe_model(
model_name=info.hub_content_name, model_version=info.hub_content_version
)
utils.emit_logs_based_on_model_specs(
hub_content.content_document,
This assumes a parsed HubContent. I suggest either commenting on that explicitly or waiting for that implementation.
It's a feature branch, so I feel comfortable merging without any of the implementation done, but yes, this will be parsed. The HubContentDocument should be parsed at this point too.
@@ -355,7 +355,7 @@ def _retrieval_function(
return JumpStartCachedContentValue(
formatted_content=model_specs
)
if data_type == HubDataType.HUB:
if data_type == HubContentType.HUB:
I missed this in the previous PR, but I am a bit confused about having a HUB type here. The other three types are related to models (i.e. a model ID or the like maps to model specs).
- Are we expecting to extract model information from the Hub description?
- Or are we just getting the hub information? If so, what maps to what? In that case we also need to modify the typing of JumpStartCachedContentValue's `formatted_content`, as it currently only has `Union[Dict[JumpStartVersionedModelId, JumpStartModelHeader], JumpStartModelSpecs]`.
The previous implementation had two pathways: `JumpStartS3FileType.MANIFEST` (which fetched and cached the manifest) and `JumpStartS3FileType.SPECS` (which fetched and cached model-version-specific specs). We'll follow the same process here, where we cache the hub details (dependent on the hub ARN) and the hub content details (dependent on the hub content ARN).
To answer your other questions:
- No, this is only to extract Hub information.
`HubContentType.HUB` will store Hub information for a given `hubArn`. `HubContentType.MODEL` will store HubContent information for a given `hubContentArn`. You're correct, we will have to implement other functions to retrieve the Hub information correctly, but this PR does not contain that logic.
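To make the mapping concrete, here is a minimal sketch of the caching scheme described in this reply: one fetcher per content type, with entries keyed by content type and ARN. It is an illustration under assumptions, not the SDK's cache. `HubDetailsCache`, the two fake fetchers, and the enum values `"Hub"` and `"Model"` are hypothetical, and a plain dict stands in for whatever eviction and expiry the real cache applies.

```python
from enum import Enum
from typing import Any, Callable, Dict, List, Tuple


class HubContentType(str, Enum):
    """Only the two members discussed in this thread; the values are assumed."""

    HUB = "Hub"
    MODEL = "Model"


class HubDetailsCache:
    """Hypothetical cache: HUB entries keyed by hub ARN, MODEL by hub content ARN."""

    def __init__(
        self,
        fetch_hub: Callable[[str], Any],
        fetch_hub_content: Callable[[str], Any],
    ) -> None:
        self._fetchers: Dict[HubContentType, Callable[[str], Any]] = {
            HubContentType.HUB: fetch_hub,
            HubContentType.MODEL: fetch_hub_content,
        }
        self._store: Dict[Tuple[HubContentType, str], Any] = {}

    def get(self, content_type: HubContentType, arn: str) -> Any:
        key = (content_type, arn)
        if key not in self._store:
            # Cache miss: call the fetcher registered for this content type.
            self._store[key] = self._fetchers[content_type](arn)
        return self._store[key]


calls: List[Tuple[str, str]] = []


def fake_fetch_hub(hub_arn: str) -> dict:
    calls.append(("hub", hub_arn))
    return {"HubArn": hub_arn}


def fake_fetch_hub_content(hub_content_arn: str) -> dict:
    calls.append(("model", hub_content_arn))
    return {"HubContentArn": hub_content_arn}


cache = HubDetailsCache(fake_fetch_hub, fake_fetch_hub_content)
hub_arn = "arn:aws:sagemaker:us-west-2:123456789012:hub/my-hub"
first = cache.get(HubContentType.HUB, hub_arn)
second = cache.get(HubContentType.HUB, hub_arn)  # served from the cache
```

With this shape, the value type stored for `HubContentType.HUB` is hub information rather than model specs, which is why the `formatted_content` typing would need to be widened.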
@@ -48,7 +48,7 @@ def retrieve_default(
model_version (str): Optional. The version of the model for which to retrieve the
default environment variables. (Default: None).
hub_arn (str): The arn of the SageMaker Hub for which to retrieve
model details from (default: None).
model details from. (default: None).
nit: capitalize Default
* prepare release v2.210.0 * update development version to v2.210.1.dev0 * feat: Add new Triton DLC URIs (#4432) * Add new Triton DLC URIs * Update according to black and pylint * feat: Support selective pipeline execution between function step and regular step (#4392) * feat: Add AutoMLV2 support (#4461) * Add AutoMLV2 support * Improvements of the integration tests --------- Co-authored-by: Anton Repushko <[email protected]> * feature: Add TensorFlow 2.14 image configs (#4446) * fix: remove enable_network_isolation from the python doc (#4465) Co-authored-by: Rohan Gujarathi <[email protected]> * doc: Add doc for new feature processor APIs and classes (#4250) * fix: properly close sagemaker config file after loading config (#4457) Closes #4456 * feat: instance specific jumpstart host requirements (#4397) * feat: instance specific jumpstart host requirements * chore: add js support for copies resource requirement, enforce coupling with ResourceRequirements class * fix: typing * fix: pylint * change: Bump Apache Airflow version to 2.8.2 (#4470) * Update tox.ini * Update test_requirements.txt * fix: make sure gpus are found in local_gpu run (#4384) * fix: make sure gpus are found in local_gpu run * fix: black formatting * fix: adjust unit test * feat: pin dll version to support python3.11 to the sdk (#4472) Co-authored-by: Ashwin Krishna <[email protected]> * fix: Skip No Canvas regions for test_deploy_best_candidate (#4477) * prepare release v2.211.0 * update development version to v2.211.1.dev0 * change: Enhance model builder selection logic to include model size (#4429) * change: Enhance model builder selection logic to include model size * Fix conflicts * Address PR comments * fix formatting * fix formatting of test * Fix token in tasks.json * Increase coverage for tests * fix formatting * Fix requirements * Import code instead of importing accelerate * Fix formatting * Setup dependencies * change: Upgrade smp to version 2.2 (#4479) * upgrading smp to version 2.2 
* fixing linting issue * fixing syntax error with multiline if statement * upgrading smp to version 2.2 * fixing linting issue * fixing syntax error with multiline if statement * fixing formatting --------- Co-authored-by: Andrew Tian <[email protected]> * feat: Update SM Python SDK for PT 2.2.0 SM DLC (#4481) * update pt2.2 sm training dlc pysdk * update pt2.2 sm inference dlc pysdk and region list * fix: Create custom tarfile extractall util to fix backward compatibility issue (#4476) * fix: Create custom tarfile extractall util to fix backward compatibility issue * Address review comments * fix logger.error statements * prepare release v2.212.0 * update development version to v2.212.1.dev0 * change: Update tblib constraint (#4452) * fix: make unit tests compatible with pytest-xdist (#4486) * fix: make unit tests compatible with pytest-xdist * fix failing test * feature: Add overriding logic in ModelBuilder when task is provided (#4460) * feat: Add Optional task to Model * Revert "feat: Add Optional task to Model" This reverts commit fd3e86b. 
* Add override logic in ModelBuilder with task provided * Adjusted formatting * Add extra unit tests for invalid inputs * Address PR comments * Add more test inputs to integration test * Add model_metadata field to ModelBuilder * Update doc * Update doc * Adjust formatting --------- Co-authored-by: Samrudhi Sharma <[email protected]> Co-authored-by: Xiong Zeng <[email protected]> * feature: Accept user-defined env variables for the entry-point (#4175) * fix: Move sagemaker pysdk version check after bootstrap in remote job (#4487) * change: enable github actions for PRs (#4489) * change: enable github actions for PRs * Update codebuild-ci.yml * trigger on pull_request_target * add source-version-override * fix permission * feature: Add ModelDataSource and SourceUri support for model package and while registering (#4492) Co-authored-by: Erick Benitez-Ramos <[email protected]> * feat: support JumpStart proprietary models (#4467) * feat: add proprietary manifest/specs parsing add unittests for test_cache small refactoring address comments and more unittests fix linting and fix more tests fix: pylint feat: JumpStartModel class for prop models * remove unused imports and fix docstyle * fix: remove unused args * fix: remove unused args * fix: more unused vars * fix: slow tests * fix: unittests * added more tests to cover some lines * remove estimator warn check * chore: address comments re performance * fix: address comments * complete list experience and other fixes * fix: pylint * add doc utils and fix pylint * fix: docstyle * fix: doc * fix: default payloads * fix: doc and tags and enums * fix: jumpstart doc * rename to open_weights and fix filtering * update filter name * doc update * fix: black * rename to proprietary model and fix unittests * address comments * fix: docstyle and flake8 * address more comments and fix doc * put back doc utils for future refactoring * add prop model title in doc * doc update --------- Co-authored-by: liujiaor <[email protected]> * 
chore: emit warning when no instance specific gated training env var is available, and raise exception when accept_eula flag is not supplied (#4485) * fix: raise exception when no instance specific gated training env var available * chore: raise client exception if accept_eula flag is not set for gated models * chore: address flake8 errors * chore: emit warning when instance type is chosen with no gated training artifacts * change: bump jinja2 to 3.1.3 in doc/requirments.txt (#4421) (#4423) * change: bump jinja2 to 3.1.3 in doc/requirments.txt (#4421) * change: bump jinja2 to 3.1.3 in doc/requirments.txt * Update requirements.txt * feature: TGI 1.4.0 (#4424) * documentation: fix the ClarifyCheckStep documentation to mention PDP (#4259) * documentation: fix the ClarifyCheckStep documentation to mention PDP support * fix: break the lines to meet pylint requirement --------- Co-authored-by: Shing Lyu <[email protected]> * documentation: Explain the ClarifyCheckStep and QualityCheckStep parameters (#4261) * documentation: explain the ClarifyCheckStep and QualityCheckStep parameters * fix: remove trailing space --------- Co-authored-by: Shing Lyu <[email protected]> * feat: Telemetry metrics (#4414) * Emit additional telemetry metrics * Fix unit tests * Emit endpoint failure to telemetry * Address PR Comments * Emit latency in telemetry * Address PR Comments * Addressed PR Comments * Address PR Comments * Fix tests * Fix integ tests --------- Co-authored-by: Jonathan Makunga <[email protected]> Co-authored-by: Erick Benitez-Ramos <[email protected]> * documentation: change order of pipelines topics (#4427) * prepare release v2.208.0 * update development version to v2.208.1.dev0 * feature: AutoGluon 1.0.0 image_uris update (#4426) --------- Co-authored-by: Erick Benitez-Ramos <[email protected]> Co-authored-by: Jinyoung Lim <[email protected]> Co-authored-by: Shing Lyu <[email protected]> Co-authored-by: Shing Lyu <[email protected]> Co-authored-by: Jonathan Makunga 
<[email protected]> Co-authored-by: Jonathan Makunga <[email protected]> Co-authored-by: stacicho <[email protected]> Co-authored-by: ci <ci> Co-authored-by: tonyhu <[email protected]> * feat: add hub and hubcontent support in retrieval function for jumpstart model cache (#4438) * feat: jsch jumpstart estimator support (#4439) * Master jumpstart curated hub (#4464) * add hub_arn support for accept_types, content_types, serializers, deserializers, and predictor (#4463) * feature: JumpStart CuratedHub class creation and function definitions (#4448) * MultiPartCopy with Sync Algorithm (#4475) * first pass at sync function with util classes * adding tests and update clases * linting * file generator class inheritance * lint * multipart copy and algorithm updates * modularize sync * reformatting folders * testing for sync * do not tolerate vulnerable * remove prints * handle multithreading progress bar * update tests * optimize function and add hub bucket prefix * docstrings and linting * rebase with master * bad rebase * trying to fix codecov * uncomment codebuild-ci --------- Co-authored-by: ci <ci> Co-authored-by: Nikhil Kulkarni <[email protected]> Co-authored-by: qidewenwhen <[email protected]> Co-authored-by: Anton Repushko <[email protected]> Co-authored-by: Anton Repushko <[email protected]> Co-authored-by: Sai Parthasarathy Miduthuri <[email protected]> Co-authored-by: Rohan Gujarathi <[email protected]> Co-authored-by: Rohan Gujarathi <[email protected]> Co-authored-by: cansun <[email protected]> Co-authored-by: Justin <[email protected]> Co-authored-by: evakravi <[email protected]> Co-authored-by: Kalyani Nikure <[email protected]> Co-authored-by: gv <[email protected]> Co-authored-by: akrishna1995 <[email protected]> Co-authored-by: Ashwin Krishna <[email protected]> Co-authored-by: Samrudhi Sharma <[email protected]> Co-authored-by: adtian2 <[email protected]> Co-authored-by: Andrew Tian <[email protected]> Co-authored-by: Sirut Buasai <[email protected]> 
Co-authored-by: Danny Bushkanets <[email protected]> Co-authored-by: Erick Benitez-Ramos <[email protected]> Co-authored-by: xiongz945 <[email protected]> Co-authored-by: Samrudhi Sharma <[email protected]> Co-authored-by: Xiong Zeng <[email protected]> Co-authored-by: martinRenou <[email protected]> Co-authored-by: mrudulmn <[email protected]> Co-authored-by: Haotian An <[email protected]> Co-authored-by: liujiaor <[email protected]> Co-authored-by: Jinyoung Lim <[email protected]> Co-authored-by: Shing Lyu <[email protected]> Co-authored-by: Shing Lyu <[email protected]> Co-authored-by: Jonathan Makunga <[email protected]> Co-authored-by: Jonathan Makunga <[email protected]> Co-authored-by: stacicho <[email protected]> Co-authored-by: tonyhu <[email protected]>
* fix: Move sagemaker pysdk version check after bootstrap in remote job (#4487) * feat: support JumpStart proprietary models (#4467) * feat: add proprietary manifest/specs parsing add unittests for test_cache small refactoring address comments and more unittests fix linting and fix more tests fix: pylint feat: JumpStartModel class for prop models * remove unused imports and fix docstyle * fix: remove unused args * fix: remove unused args * fix: more unused vars * fix: slow tests * fix: unittests * added more tests to cover some lines * remove estimator warn check * chore: address comments re performance * fix: address comments * complete list experience and other fixes * fix: pylint * add doc utils and fix pylint * fix: docstyle * fix: doc * fix: default payloads * fix: doc and tags and enums * fix: jumpstart doc * rename to open_weights and fix filtering * update filter name * doc update * fix: black * rename to proprietary model and fix unittests * address comments * fix: docstyle and flake8 * address more comments and fix doc * put back doc utils for future refactoring * add prop model title in doc * doc update --------- Co-authored-by: liujiaor <[email protected]> * feat: add hub and hubcontent support in retrieval function for jumpstart model cache (#4438) * feat: jsch jumpstart estimator support (#4439) * Master jumpstart curated hub (#4464) * add hub_arn support for accept_types, content_types, serializers, deserializers, and predictor (#4463) * feature: JumpStart CuratedHub class creation and function definitions (#4448) * MultiPartCopy with Sync Algorithm (#4475) * first pass at sync function with util classes * adding tests and update clases * linting * file generator class inheritance * lint * multipart copy and algorithm updates * modularize sync * reformatting folders * testing for sync * do not tolerate vulnerable * remove prints * handle multithreading progress bar * update tests * optimize function and add hub bucket prefix * docstrings and linting 
* rebase with master * bad rebase * support for gated and training unsupported * merge with master-curated-jumpstart * linting * update types * update * update bootstrap * fix codecov --------- Co-authored-by: qidewenwhen <[email protected]> Co-authored-by: Haotian An <[email protected]> Co-authored-by: liujiaor <[email protected]> Co-authored-by: Jinyoung Lim <[email protected]>
Issue #, if available:
Description of changes:
Adds support for the `hub_name` field in the `JumpStartEstimator` class. Users can pass a Hub name or ARN to this class in order to pull model details from a SageMaker Hub rather than JumpStart's S3 bucket.
Testing done:
Updated tests for all files that were touched during development.
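Since the description says the class accepts either a Hub name or an ARN, the following sketch shows one way such an input could be normalized to an ARN. It is an assumption-laden illustration, not code from this PR: `to_hub_arn` and `_HUB_ARN_PATTERN` are hypothetical names, and the ARN shape `arn:<partition>:sagemaker:<region>:<account-id>:hub/<hub-name>` and the allowed hub-name characters are assumed here.

```python
import re

# Assumed ARN shape: arn:<partition>:sagemaker:<region>:<account-id>:hub/<hub-name>
_HUB_ARN_PATTERN = re.compile(
    r"^arn:[a-z0-9-]+:sagemaker:[a-z0-9-]+:\d{12}:hub/[A-Za-z0-9-]+$"
)


def to_hub_arn(
    hub_name_or_arn: str,
    region: str,
    account_id: str,
    partition: str = "aws",
) -> str:
    """Return the input unchanged if it already looks like a hub ARN.

    Otherwise treat it as a hub name and build an ARN from the given
    region and account ID (hypothetical helper for illustration).
    """
    if _HUB_ARN_PATTERN.match(hub_name_or_arn):
        return hub_name_or_arn
    return f"arn:{partition}:sagemaker:{region}:{account_id}:hub/{hub_name_or_arn}"


from_name = to_hub_arn("my-hub", "us-west-2", "123456789012")
from_arn = to_hub_arn(from_name, "eu-west-1", "000000000000")
```

In the second call the region and account arguments are ignored because the input is already an ARN.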
Merge Checklist
Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.
General
Tests
`unique_name_from_base` to create resource names in integ tests (if appropriate)
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.