feat: Support custom repack model settings #4328
8f1acb0 to dc6a9b9
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository
Codecov Report
Additional details and impacted files

@@             Coverage Diff              @@
##           master    #4328      +/-   ##
==========================================
- Coverage   86.91%   86.80%   -0.11%
==========================================
  Files        1197      380     -817
  Lines      106707    35185   -71522
==========================================
- Hits        92749    30544   -62205
+ Misses      13958     4641    -9317

☔ View full report in Codecov by Sentry.
/bot run unit-tests
"""Pop out non-configurable args from _repack_model_step_settings"""
if not self._repack_model_step_settings:
    return
for ignored_param in _IGNORED_REPACK_PARAM_LIST:
If this is directly configurable by the user, why would they include params in their step settings that they know will be ignored?
The user-facing input is a dict, so users can pass in anything they like. However, there are some keys/fields they should not touch; otherwise the repack model stack won't work as expected. For these fields, we add this check as an early validation.
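A minimal sketch of the early validation described above, written as a standalone function rather than the method in the diff. The contents of `_IGNORED_REPACK_PARAM_LIST` here are hypothetical placeholders (the real list is not shown in this thread), and `pop_ignored_repack_params` and the warning message are illustrative names, not the SDK's actual code.

```python
import logging
from typing import Any, Dict, Optional

logger = logging.getLogger(__name__)

# Hypothetical placeholder values: the real list is not shown in this thread.
_IGNORED_REPACK_PARAM_LIST = ["entry_point", "source_dir", "hyperparameters"]


def pop_ignored_repack_params(settings: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of the user's settings without the non-configurable keys."""
    if not settings:
        return {}
    cleaned = dict(settings)  # copy so the caller's dict is left untouched
    for ignored_param in _IGNORED_REPACK_PARAM_LIST:
        if ignored_param in cleaned:
            # Tell the user early that this key has no effect.
            logger.warning("Ignoring non-configurable repack setting: %s", ignored_param)
            cleaned.pop(ignored_param)
    return cleaned
```

Copying the dict instead of popping in place is a design choice of this sketch; the snippet under review appears to modify `self._repack_model_step_settings` directly.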
/bot run pr
…ngestion. (#4413) * change: update image_uri_configs 12-13-2023 12:23:06 PST * change: update image_uri_configs 12-13-2023 14:04:54 PST * prepare release v2.200.1 * update development version to v2.200.2.dev0 * fix: Move func and args serialization of function step to step level (#4312) * fix: Add write permission to job output dirs for remote and step decorator running on non-root job user (#4325) * feat: Added update for model package (#4309) Co-authored-by: Keshav Chandak <[email protected]> * documentation: fix ModelBuilder sample notebook links (#4319) * feat: Use specific images for SMP v2 jobs (#4333) * Add check for smp lib * update syntax * Remove unused images * Update repo name and regions * Update account number * Update framework name and check for None distribution * Add unit tests for smp v2 uri * Check enabled * Remove logging * Add cuda version in uri * Update cu121 * Update syntax * Fix black check * Fix black --------- Co-authored-by: huilgolr <[email protected]> * Fix: Updated js mb compression logic - ModelBuilder (#4294) Co-authored-by: EC2 Default User <[email protected]> * documentation: SMP v2 doc updates (#1423) (#4336) * doc update for estimator distribution art * add note to the SMP doc and minor fixes * remove subnodes * rm all v1 content as documenting everything in aws docs * fix build errors * fix white spaces * rm smdistributed from TF estimator distribution * rm white spaces * add notes to TF estimator distribution * fix links * incorporate feedback * update example values * fix version numbers in the notes Co-authored-by: Miyoung <[email protected]> * prepare release v2.201.0 * update development version to v2.201.1.dev0 * Fix: Add additional model builder telemetry (#4334) * move telemetry code to public * add additional test --------- Co-authored-by: EC2 Default User <[email protected]> * feature: support remote debug for sagemaker training job (#4315) * feature: support remote debug for sagemaker training job * change: Replace 
update_remote_config with 2 helper methods for enable and disable respectively * change: add new argument enable_remote_debug to skip set of test_jumpstart_estimator_kwargs_match_parent_class * chore: add jumpstart support for remote debug --------- Co-authored-by: Xinyu Xie <[email protected]> Co-authored-by: Evan Kravitz <[email protected]> * Update tblib constraint (#4317) * Fix: Fix job_objective type (#4303) * change: update image_uri_configs 12-21-2023 08:32:41 PST * prepare release v2.202.0 * update development version to v2.202.1.dev0 * Using logging instead of prints (#4133) * documentation: update issue template. (#4337) * change: update model path in local mode (#4296) * Update model path in local mode * Add test * change: update image_uri_configs 12-22-2023 06:17:35 PST * prepare release v2.202.1 * update development version to v2.202.2.dev0 * change: create role if needed in `get_execution_role` (#4323) * Create role if needed in get_execution_role * Add tests * Change: More pythonic tags (#4327) * Change: More pythonic tags * Fix broken tags * More tags formatting and add a test * Fix tests * Raise Exception for debug (#4344) Co-authored-by: Ruilian Gao <[email protected]> * Change: Allow extra_args to be passed to uploader (#4338) * Change: Allow extra_args to be passed to uploader * Fix tests * Black * Fix test * Change: Drop py2 tag from the wheel as we don't support Python 2 (#4343) * Disable failed test in IR (#4345) * Disable failed test in IR * Fix format --------- Co-authored-by: Ruilian Gao <[email protected]> * change: update image_uri_configs 12-25-2023 06:17:33 PST * feat: Supporting tbac in load_run (#4039) * feature: support local mode in SageMaker Studio (#1300) (#4347) * feature: support local mode in SageMaker Studio * chore: fix typo * chore: fix formatting * chore: revert changes for docker compose logs * chore: black-format * change: Use predtermined dns-allow-listed-hostname for Studio Local Support * add support for CodeEditor 
and JupyterLabs --------- Co-authored-by: Erick Benitez-Ramos <[email protected]> Co-authored-by: Mufaddal Rohawala <[email protected]> * prepare release v2.203.0 * update development version to v2.203.1.dev0 * change: update image_uri_configs 12-29-2023 06:17:34 PST * query hf api for model md (#4346) Co-authored-by: EC2 Default User <[email protected]> * fix: skip failing integs (#4348) Co-authored-by: Mufaddal Rohawala <[email protected]> * change: TGI 1.3.3 (#4335) * prepare release v2.203.1 * update development version to v2.203.2.dev0 * feat: parallelize notebook search utils, add new operators (#4342) * feat: parallelize notebook search utils * chore: raise exception in notebook utils if thread has error * chore: improve variable name * fix: not passing region to get jumpstart bucket * chore: add sagemaker session to notebook utils * chore: address PR comments * feat: add support for includes, begins with, ends with * fix: pylint * feat: private util for model eula key * fix: unit tests, use verify_model_region_and_return_specs in notebook utils * Revert "feat: private util for model eula key" This reverts commit e2daefc. 
* chore: add search keywords to header * fix: change ConditionNot incorrect property Expression to Condition (#4351) * fix: Huggingface glue failing tests (#4367) * fix: Huggingface glue failing tests * fix: Sphinx doc build failure * fix: Huggingface glue failing tests * fix: failing sphinx tests * fix: failing sphinx tests * fix: failing black check * fix: sphinx doc errors * fix: sphinx doc errors * sphinx * black-format * sphinx * sphinx * sphinx --------- Co-authored-by: Mufaddal Rohawala <[email protected]> Co-authored-by: Erick Benitez-Ramos <[email protected]> * fix: Add PyTorch 2.1.0 SM Training DLC to UNSUPPORTED_DLC_IMAGE_FOR_SM_PARALLELISM list (#4356) * add 2.1 unsupported smddp * formatting * feat: Support custom repack model settings (#4328) * change: update sphinx version (#4377) * change: update sphinx version * Update sphinx * change: Updates for DJL 0.26.0 release (#4366) * change: TGI NeuronX (#4375) * TGI NeuronX * Update * Update * fix: add warning message for job-prefixed pipeline steps when no job name is provided (#4371) Co-authored-by: svia3 <[email protected]> * change: JumpStart - TLV region launch (#4379) * feat: add throughput management support for feature group (#4359) * feat: add throughput management support for feature group * documentation: add doc for feature group throughput config --------- Co-authored-by: Nilesh PS <[email protected]> * change: Enable galactus integ tests (#4376) * feat: Enable galactus integ tests * fix flake8 * fix doc8 * trying to see if it works with slow tests * small fixes in import error * fix missing import * try to remove some dependencies from requirement to see if pr test can be fixed * fix flake8 * Enable more tests * Add rerun annotation and further remove dependencies * comment out 2 integ tests * Remove local mode test for now * fix flake8 * prepare release v2.204.0 * update development version to v2.204.1.dev0 * fix: Add validation for empty ParameterString value in start local pipeline 
(#4354) * feat: Support selective pipeline execution for function step (#4372) * change: update image_uri_configs 01-24-2024 06:17:33 PST * fix: update get_execution_role_arn from metadata file if present (#4388) * fix: Support using PipelineDefinitionConfig in local mode (#4352) * fix: remove fastapi and uvicorn dependencies (#4365) They are not used in the codebase. Closes #4361 #4295 * prepare release v2.205.0 * update development version to v2.205.1.dev0 * change: TGI NeuronX 0.0.17 (#4390) * fix: Support PipelineVariable for ModelQualityCheckConfig attributes (#4353) * feat: Logic to detect hardware GPU count and aggregate GPU memory size in MiB (#4389) * Add logic to detect hardware GPU count and aggregate GPU memory size in MiB * Fix all formatting * Addressed PR review comments * Addressed PR Review messages * Addressed PR Review Messages * Addressed PR Review comments * Addressed PR Review Comments * Add integration tests * Add config * Fix integration tests * Include Instance Types GPU infor Config files * Addressed PR review comments * Fix unit tests * Fix unit test: 'Mock' object is not subscriptable --------- Co-authored-by: Jonathan Makunga <[email protected]> * fix: fixed create monitoring schedule failing after validation error (#4385) Co-authored-by: Keshav Chandak <[email protected]> * Add collection type support for Feaure Group Ingestion. Add TargetStores support for PutRecord and Ingestion. * Remove merge conflicts. * Update the feature definition type * Black formatting * Fix Flake8 formatting * Fix Pylint * Fix Formatting. 
--------- Co-authored-by: sagemaker-bot <[email protected]> Co-authored-by: ci <ci> Co-authored-by: qidewenwhen <[email protected]> Co-authored-by: Keshav Chandak <[email protected]> Co-authored-by: Keshav Chandak <[email protected]> Co-authored-by: stacicho <[email protected]> Co-authored-by: Teng-xu <[email protected]> Co-authored-by: huilgolr <[email protected]> Co-authored-by: Gary Wang <[email protected]> Co-authored-by: EC2 Default User <[email protected]> Co-authored-by: akrishna1995 <[email protected]> Co-authored-by: Miyoung <[email protected]> Co-authored-by: Xinyu Xie <[email protected]> Co-authored-by: Xinyu Xie <[email protected]> Co-authored-by: Evan Kravitz <[email protected]> Co-authored-by: martinRenou <[email protected]> Co-authored-by: Duc Trung Le <[email protected]> Co-authored-by: ruiliann666 <[email protected]> Co-authored-by: Ruilian Gao <[email protected]> Co-authored-by: ananth102 <[email protected]> Co-authored-by: Mufaddal Rohawala <[email protected]> Co-authored-by: Erick Benitez-Ramos <[email protected]> Co-authored-by: Mufaddal Rohawala <[email protected]> Co-authored-by: amzn-choeric <[email protected]> Co-authored-by: evakravi <[email protected]> Co-authored-by: Erick Benitez-Ramos <[email protected]> Co-authored-by: Sirut Buasai <[email protected]> Co-authored-by: Sindhu Somasundaram <[email protected]> Co-authored-by: Stephen Via <[email protected]> Co-authored-by: svia3 <[email protected]> Co-authored-by: Haixin Wang <[email protected]> Co-authored-by: Nilesh PS <[email protected]> Co-authored-by: Nilesh PS <[email protected]> Co-authored-by: jiapinw <[email protected]> Co-authored-by: Jay Goyani <[email protected]> Co-authored-by: Justin <[email protected]> Co-authored-by: Jonathan Makunga <[email protected]> Co-authored-by: Jonathan Makunga <[email protected]>
…ngestion. (aws#4413) * change: update image_uri_configs 12-13-2023 12:23:06 PST * change: update image_uri_configs 12-13-2023 14:04:54 PST * prepare release v2.200.1 * update development version to v2.200.2.dev0 * fix: Move func and args serialization of function step to step level (aws#4312) * fix: Add write permission to job output dirs for remote and step decorator running on non-root job user (aws#4325) * feat: Added update for model package (aws#4309) Co-authored-by: Keshav Chandak <[email protected]> * documentation: fix ModelBuilder sample notebook links (aws#4319) * feat: Use specific images for SMP v2 jobs (aws#4333) * Add check for smp lib * update syntax * Remove unused images * Update repo name and regions * Update account number * Update framework name and check for None distribution * Add unit tests for smp v2 uri * Check enabled * Remove logging * Add cuda version in uri * Update cu121 * Update syntax * Fix black check * Fix black --------- Co-authored-by: huilgolr <[email protected]> * Fix: Updated js mb compression logic - ModelBuilder (aws#4294) Co-authored-by: EC2 Default User <[email protected]> * documentation: SMP v2 doc updates (aws#1423) (aws#4336) * doc update for estimator distribution art * add note to the SMP doc and minor fixes * remove subnodes * rm all v1 content as documenting everything in aws docs * fix build errors * fix white spaces * rm smdistributed from TF estimator distribution * rm white spaces * add notes to TF estimator distribution * fix links * incorporate feedback * update example values * fix version numbers in the notes Co-authored-by: Miyoung <[email protected]> * prepare release v2.201.0 * update development version to v2.201.1.dev0 * Fix: Add additional model builder telemetry (aws#4334) * move telemetry code to public * add additional test --------- Co-authored-by: EC2 Default User <[email protected]> * feature: support remote debug for sagemaker training job (aws#4315) * feature: support remote debug for 
sagemaker training job * change: Replace update_remote_config with 2 helper methods for enable and disable respectively * change: add new argument enable_remote_debug to skip set of test_jumpstart_estimator_kwargs_match_parent_class * chore: add jumpstart support for remote debug --------- Co-authored-by: Xinyu Xie <[email protected]> Co-authored-by: Evan Kravitz <[email protected]> * Update tblib constraint (aws#4317) * Fix: Fix job_objective type (aws#4303) * change: update image_uri_configs 12-21-2023 08:32:41 PST * prepare release v2.202.0 * update development version to v2.202.1.dev0 * Using logging instead of prints (aws#4133) * documentation: update issue template. (aws#4337) * change: update model path in local mode (aws#4296) * Update model path in local mode * Add test * change: update image_uri_configs 12-22-2023 06:17:35 PST * prepare release v2.202.1 * update development version to v2.202.2.dev0 * change: create role if needed in `get_execution_role` (aws#4323) * Create role if needed in get_execution_role * Add tests * Change: More pythonic tags (aws#4327) * Change: More pythonic tags * Fix broken tags * More tags formatting and add a test * Fix tests * Raise Exception for debug (aws#4344) Co-authored-by: Ruilian Gao <[email protected]> * Change: Allow extra_args to be passed to uploader (aws#4338) * Change: Allow extra_args to be passed to uploader * Fix tests * Black * Fix test * Change: Drop py2 tag from the wheel as we don't support Python 2 (aws#4343) * Disable failed test in IR (aws#4345) * Disable failed test in IR * Fix format --------- Co-authored-by: Ruilian Gao <[email protected]> * change: update image_uri_configs 12-25-2023 06:17:33 PST * feat: Supporting tbac in load_run (aws#4039) * feature: support local mode in SageMaker Studio (aws#1300) (aws#4347) * feature: support local mode in SageMaker Studio * chore: fix typo * chore: fix formatting * chore: revert changes for docker compose logs * chore: black-format * change: Use predtermined 
dns-allow-listed-hostname for Studio Local Support * add support for CodeEditor and JupyterLabs --------- Co-authored-by: Erick Benitez-Ramos <[email protected]> Co-authored-by: Mufaddal Rohawala <[email protected]> * prepare release v2.203.0 * update development version to v2.203.1.dev0 * change: update image_uri_configs 12-29-2023 06:17:34 PST * query hf api for model md (aws#4346) Co-authored-by: EC2 Default User <[email protected]> * fix: skip failing integs (aws#4348) Co-authored-by: Mufaddal Rohawala <[email protected]> * change: TGI 1.3.3 (aws#4335) * prepare release v2.203.1 * update development version to v2.203.2.dev0 * feat: parallelize notebook search utils, add new operators (aws#4342) * feat: parallelize notebook search utils * chore: raise exception in notebook utils if thread has error * chore: improve variable name * fix: not passing region to get jumpstart bucket * chore: add sagemaker session to notebook utils * chore: address PR comments * feat: add support for includes, begins with, ends with * fix: pylint * feat: private util for model eula key * fix: unit tests, use verify_model_region_and_return_specs in notebook utils * Revert "feat: private util for model eula key" This reverts commit e2daefc. 
* chore: add search keywords to header * fix: change ConditionNot incorrect property Expression to Condition (aws#4351) * fix: Huggingface glue failing tests (aws#4367) * fix: Huggingface glue failing tests * fix: Sphinx doc build failure * fix: Huggingface glue failing tests * fix: failing sphinx tests * fix: failing sphinx tests * fix: failing black check * fix: sphinx doc errors * fix: sphinx doc errors * sphinx * black-format * sphinx * sphinx * sphinx --------- Co-authored-by: Mufaddal Rohawala <[email protected]> Co-authored-by: Erick Benitez-Ramos <[email protected]> * fix: Add PyTorch 2.1.0 SM Training DLC to UNSUPPORTED_DLC_IMAGE_FOR_SM_PARALLELISM list (aws#4356) * add 2.1 unsupported smddp * formatting * feat: Support custom repack model settings (aws#4328) * change: update sphinx version (aws#4377) * change: update sphinx version * Update sphinx * change: Updates for DJL 0.26.0 release (aws#4366) * change: TGI NeuronX (aws#4375) * TGI NeuronX * Update * Update * fix: add warning message for job-prefixed pipeline steps when no job name is provided (aws#4371) Co-authored-by: svia3 <[email protected]> * change: JumpStart - TLV region launch (aws#4379) * feat: add throughput management support for feature group (aws#4359) * feat: add throughput management support for feature group * documentation: add doc for feature group throughput config --------- Co-authored-by: Nilesh PS <[email protected]> * change: Enable galactus integ tests (aws#4376) * feat: Enable galactus integ tests * fix flake8 * fix doc8 * trying to see if it works with slow tests * small fixes in import error * fix missing import * try to remove some dependencies from requirement to see if pr test can be fixed * fix flake8 * Enable more tests * Add rerun annotation and further remove dependencies * comment out 2 integ tests * Remove local mode test for now * fix flake8 * prepare release v2.204.0 * update development version to v2.204.1.dev0 * fix: Add validation for empty ParameterString 
value in start local pipeline (aws#4354) * feat: Support selective pipeline execution for function step (aws#4372) * change: update image_uri_configs 01-24-2024 06:17:33 PST * fix: update get_execution_role_arn from metadata file if present (aws#4388) * fix: Support using PipelineDefinitionConfig in local mode (aws#4352) * fix: remove fastapi and uvicorn dependencies (aws#4365) They are not used in the codebase. Closes aws#4361 aws#4295 * prepare release v2.205.0 * update development version to v2.205.1.dev0 * change: TGI NeuronX 0.0.17 (aws#4390) * fix: Support PipelineVariable for ModelQualityCheckConfig attributes (aws#4353) * feat: Logic to detect hardware GPU count and aggregate GPU memory size in MiB (aws#4389) * Add logic to detect hardware GPU count and aggregate GPU memory size in MiB * Fix all formatting * Addressed PR review comments * Addressed PR Review messages * Addressed PR Review Messages * Addressed PR Review comments * Addressed PR Review Comments * Add integration tests * Add config * Fix integration tests * Include Instance Types GPU infor Config files * Addressed PR review comments * Fix unit tests * Fix unit test: 'Mock' object is not subscriptable --------- Co-authored-by: Jonathan Makunga <[email protected]> * fix: fixed create monitoring schedule failing after validation error (aws#4385) Co-authored-by: Keshav Chandak <[email protected]> * Add collection type support for Feaure Group Ingestion. Add TargetStores support for PutRecord and Ingestion. * Remove merge conflicts. * Update the feature definition type * Black formatting * Fix Flake8 formatting * Fix Pylint * Fix Formatting. 
--------- Co-authored-by: sagemaker-bot <[email protected]> Co-authored-by: ci <ci> Co-authored-by: qidewenwhen <[email protected]> Co-authored-by: Keshav Chandak <[email protected]> Co-authored-by: Keshav Chandak <[email protected]> Co-authored-by: stacicho <[email protected]> Co-authored-by: Teng-xu <[email protected]> Co-authored-by: huilgolr <[email protected]> Co-authored-by: Gary Wang <[email protected]> Co-authored-by: EC2 Default User <[email protected]> Co-authored-by: akrishna1995 <[email protected]> Co-authored-by: Miyoung <[email protected]> Co-authored-by: Xinyu Xie <[email protected]> Co-authored-by: Xinyu Xie <[email protected]> Co-authored-by: Evan Kravitz <[email protected]> Co-authored-by: martinRenou <[email protected]> Co-authored-by: Duc Trung Le <[email protected]> Co-authored-by: ruiliann666 <[email protected]> Co-authored-by: Ruilian Gao <[email protected]> Co-authored-by: ananth102 <[email protected]> Co-authored-by: Mufaddal Rohawala <[email protected]> Co-authored-by: Erick Benitez-Ramos <[email protected]> Co-authored-by: Mufaddal Rohawala <[email protected]> Co-authored-by: amzn-choeric <[email protected]> Co-authored-by: evakravi <[email protected]> Co-authored-by: Erick Benitez-Ramos <[email protected]> Co-authored-by: Sirut Buasai <[email protected]> Co-authored-by: Sindhu Somasundaram <[email protected]> Co-authored-by: Stephen Via <[email protected]> Co-authored-by: svia3 <[email protected]> Co-authored-by: Haixin Wang <[email protected]> Co-authored-by: Nilesh PS <[email protected]> Co-authored-by: Nilesh PS <[email protected]> Co-authored-by: jiapinw <[email protected]> Co-authored-by: Jay Goyani <[email protected]> Co-authored-by: Justin <[email protected]> Co-authored-by: Jonathan Makunga <[email protected]> Co-authored-by: Jonathan Makunga <[email protected]>
…ngestion. (aws#4413) * change: update image_uri_configs 12-13-2023 12:23:06 PST * change: update image_uri_configs 12-13-2023 14:04:54 PST * prepare release v2.200.1 * update development version to v2.200.2.dev0 * fix: Move func and args serialization of function step to step level (aws#4312) * fix: Add write permission to job output dirs for remote and step decorator running on non-root job user (aws#4325) * feat: Added update for model package (aws#4309) Co-authored-by: Keshav Chandak <[email protected]> * documentation: fix ModelBuilder sample notebook links (aws#4319) * feat: Use specific images for SMP v2 jobs (aws#4333) * Add check for smp lib * update syntax * Remove unused images * Update repo name and regions * Update account number * Update framework name and check for None distribution * Add unit tests for smp v2 uri * Check enabled * Remove logging * Add cuda version in uri * Update cu121 * Update syntax * Fix black check * Fix black --------- Co-authored-by: huilgolr <[email protected]> * Fix: Updated js mb compression logic - ModelBuilder (aws#4294) Co-authored-by: EC2 Default User <[email protected]> * documentation: SMP v2 doc updates (aws#1423) (aws#4336) * doc update for estimator distribution art * add note to the SMP doc and minor fixes * remove subnodes * rm all v1 content as documenting everything in aws docs * fix build errors * fix white spaces * rm smdistributed from TF estimator distribution * rm white spaces * add notes to TF estimator distribution * fix links * incorporate feedback * update example values * fix version numbers in the notes Co-authored-by: Miyoung <[email protected]> * prepare release v2.201.0 * update development version to v2.201.1.dev0 * Fix: Add additional model builder telemetry (aws#4334) * move telemetry code to public * add additional test --------- Co-authored-by: EC2 Default User <[email protected]> * feature: support remote debug for sagemaker training job (aws#4315) * feature: support remote debug for 
sagemaker training job * change: Replace update_remote_config with 2 helper methods for enable and disable respectively * change: add new argument enable_remote_debug to skip set of test_jumpstart_estimator_kwargs_match_parent_class * chore: add jumpstart support for remote debug --------- Co-authored-by: Xinyu Xie <[email protected]> Co-authored-by: Evan Kravitz <[email protected]> * Update tblib constraint (aws#4317) * Fix: Fix job_objective type (aws#4303) * change: update image_uri_configs 12-21-2023 08:32:41 PST * prepare release v2.202.0 * update development version to v2.202.1.dev0 * Using logging instead of prints (aws#4133) * documentation: update issue template. (aws#4337) * change: update model path in local mode (aws#4296) * Update model path in local mode * Add test * change: update image_uri_configs 12-22-2023 06:17:35 PST * prepare release v2.202.1 * update development version to v2.202.2.dev0 * change: create role if needed in `get_execution_role` (aws#4323) * Create role if needed in get_execution_role * Add tests * Change: More pythonic tags (aws#4327) * Change: More pythonic tags * Fix broken tags * More tags formatting and add a test * Fix tests * Raise Exception for debug (aws#4344) Co-authored-by: Ruilian Gao <[email protected]> * Change: Allow extra_args to be passed to uploader (aws#4338) * Change: Allow extra_args to be passed to uploader * Fix tests * Black * Fix test * Change: Drop py2 tag from the wheel as we don't support Python 2 (aws#4343) * Disable failed test in IR (aws#4345) * Disable failed test in IR * Fix format --------- Co-authored-by: Ruilian Gao <[email protected]> * change: update image_uri_configs 12-25-2023 06:17:33 PST * feat: Supporting tbac in load_run (aws#4039) * feature: support local mode in SageMaker Studio (aws#1300) (aws#4347) * feature: support local mode in SageMaker Studio * chore: fix typo * chore: fix formatting * chore: revert changes for docker compose logs * chore: black-format * change: Use predtermined 
dns-allow-listed-hostname for Studio Local Support * add support for CodeEditor and JupyterLabs --------- Co-authored-by: Erick Benitez-Ramos <[email protected]> Co-authored-by: Mufaddal Rohawala <[email protected]> * prepare release v2.203.0 * update development version to v2.203.1.dev0 * change: update image_uri_configs 12-29-2023 06:17:34 PST * query hf api for model md (aws#4346) Co-authored-by: EC2 Default User <[email protected]> * fix: skip failing integs (aws#4348) Co-authored-by: Mufaddal Rohawala <[email protected]> * change: TGI 1.3.3 (aws#4335) * prepare release v2.203.1 * update development version to v2.203.2.dev0 * feat: parallelize notebook search utils, add new operators (aws#4342) * feat: parallelize notebook search utils * chore: raise exception in notebook utils if thread has error * chore: improve variable name * fix: not passing region to get jumpstart bucket * chore: add sagemaker session to notebook utils * chore: address PR comments * feat: add support for includes, begins with, ends with * fix: pylint * feat: private util for model eula key * fix: unit tests, use verify_model_region_and_return_specs in notebook utils * Revert "feat: private util for model eula key" This reverts commit e2daefc. 
* chore: add search keywords to header
* fix: change ConditionNot incorrect property Expression to Condition (aws#4351)
* fix: Huggingface glue failing tests (aws#4367)
* fix: Huggingface glue failing tests
* fix: Sphinx doc build failure
* fix: Huggingface glue failing tests
* fix: failing sphinx tests
* fix: failing sphinx tests
* fix: failing black check
* fix: sphinx doc errors
* fix: sphinx doc errors
* sphinx
* black-format
* sphinx
* sphinx
* sphinx
---------
Co-authored-by: Mufaddal Rohawala <[email protected]>
Co-authored-by: Erick Benitez-Ramos <[email protected]>
* fix: Add PyTorch 2.1.0 SM Training DLC to UNSUPPORTED_DLC_IMAGE_FOR_SM_PARALLELISM list (aws#4356)
* add 2.1 unsupported smddp
* formatting
* feat: Support custom repack model settings (aws#4328)
* change: update sphinx version (aws#4377)
* change: update sphinx version
* Update sphinx
* change: Updates for DJL 0.26.0 release (aws#4366)
* change: TGI NeuronX (aws#4375)
* TGI NeuronX
* Update
* Update
* fix: add warning message for job-prefixed pipeline steps when no job name is provided (aws#4371)
Co-authored-by: svia3 <[email protected]>
* change: JumpStart - TLV region launch (aws#4379)
* feat: add throughput management support for feature group (aws#4359)
* feat: add throughput management support for feature group
* documentation: add doc for feature group throughput config
---------
Co-authored-by: Nilesh PS <[email protected]>
* change: Enable galactus integ tests (aws#4376)
* feat: Enable galactus integ tests
* fix flake8
* fix doc8
* trying to see if it works with slow tests
* small fixes in import error
* fix missing import
* try to remove some dependencies from requirement to see if pr test can be fixed
* fix flake8
* Enable more tests
* Add rerun annotation and further remove dependencies
* comment out 2 integ tests
* Remove local mode test for now
* fix flake8
* prepare release v2.204.0
* update development version to v2.204.1.dev0
* fix: Add validation for empty ParameterString value in start local pipeline (aws#4354)
* feat: Support selective pipeline execution for function step (aws#4372)
* change: update image_uri_configs 01-24-2024 06:17:33 PST
* fix: update get_execution_role_arn from metadata file if present (aws#4388)
* fix: Support using PipelineDefinitionConfig in local mode (aws#4352)
* fix: remove fastapi and uvicorn dependencies (aws#4365)
They are not used in the codebase. Closes aws#4361 aws#4295
* prepare release v2.205.0
* update development version to v2.205.1.dev0
* change: TGI NeuronX 0.0.17 (aws#4390)
* fix: Support PipelineVariable for ModelQualityCheckConfig attributes (aws#4353)
* feat: Logic to detect hardware GPU count and aggregate GPU memory size in MiB (aws#4389)
* Add logic to detect hardware GPU count and aggregate GPU memory size in MiB
* Fix all formatting
* Addressed PR review comments
* Addressed PR Review messages
* Addressed PR Review Messages
* Addressed PR Review comments
* Addressed PR Review Comments
* Add integration tests
* Add config
* Fix integration tests
* Include Instance Types GPU infor Config files
* Addressed PR review comments
* Fix unit tests
* Fix unit test: 'Mock' object is not subscriptable
---------
Co-authored-by: Jonathan Makunga <[email protected]>
* fix: fixed create monitoring schedule failing after validation error (aws#4385)
Co-authored-by: Keshav Chandak <[email protected]>
* Add collection type support for Feaure Group Ingestion. Add TargetStores support for PutRecord and Ingestion.
* Remove merge conflicts.
* Update the feature definition type
* Black formatting
* Fix Flake8 formatting
* Fix Pylint
* Fix Formatting.
---------
Co-authored-by: sagemaker-bot <[email protected]>
Co-authored-by: ci <ci>
Co-authored-by: qidewenwhen <[email protected]>
Co-authored-by: Keshav Chandak <[email protected]>
Co-authored-by: Keshav Chandak <[email protected]>
Co-authored-by: stacicho <[email protected]>
Co-authored-by: Teng-xu <[email protected]>
Co-authored-by: huilgolr <[email protected]>
Co-authored-by: Gary Wang <[email protected]>
Co-authored-by: EC2 Default User <[email protected]>
Co-authored-by: akrishna1995 <[email protected]>
Co-authored-by: Miyoung <[email protected]>
Co-authored-by: Xinyu Xie <[email protected]>
Co-authored-by: Xinyu Xie <[email protected]>
Co-authored-by: Evan Kravitz <[email protected]>
Co-authored-by: martinRenou <[email protected]>
Co-authored-by: Duc Trung Le <[email protected]>
Co-authored-by: ruiliann666 <[email protected]>
Co-authored-by: Ruilian Gao <[email protected]>
Co-authored-by: ananth102 <[email protected]>
Co-authored-by: Mufaddal Rohawala <[email protected]>
Co-authored-by: Erick Benitez-Ramos <[email protected]>
Co-authored-by: Mufaddal Rohawala <[email protected]>
Co-authored-by: amzn-choeric <[email protected]>
Co-authored-by: evakravi <[email protected]>
Co-authored-by: Erick Benitez-Ramos <[email protected]>
Co-authored-by: Sirut Buasai <[email protected]>
Co-authored-by: Sindhu Somasundaram <[email protected]>
Co-authored-by: Stephen Via <[email protected]>
Co-authored-by: svia3 <[email protected]>
Co-authored-by: Haixin Wang <[email protected]>
Co-authored-by: Nilesh PS <[email protected]>
Co-authored-by: Nilesh PS <[email protected]>
Co-authored-by: jiapinw <[email protected]>
Co-authored-by: Jay Goyani <[email protected]>
Co-authored-by: Justin <[email protected]>
Co-authored-by: Jonathan Makunga <[email protected]>
Co-authored-by: Jonathan Makunga <[email protected]>
Issue #, if available:
We've received several feature requests for exposing repack model settings:
Description of changes: Support custom repack model settings
Testing done: Unit tests, plus validation of the create job request.
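For illustration only, here is a minimal sketch of the kind of check described above: custom repack settings are merged over defaults, and the resulting create-job request is validated. The description does not list which settings this PR exposes, so `DEFAULT_REPACK_SETTINGS`, `build_repack_job_request`, and the default values are hypothetical and are not the SDK's actual API. Only the `ResourceConfig` field names follow the CreateTrainingJob request shape.

```python
from copy import deepcopy

# Hypothetical defaults for the repack job; NOT the SDK's actual values.
DEFAULT_REPACK_SETTINGS = {
    "InstanceType": "ml.m5.large",
    "InstanceCount": 1,
    "VolumeSizeInGB": 30,
}


def build_repack_job_request(job_name, role_arn, custom_settings=None):
    """Build a CreateTrainingJob-style request dict for a repack job.

    Hypothetical helper: `custom_settings` overrides the defaults key by key.
    """
    settings = deepcopy(DEFAULT_REPACK_SETTINGS)
    settings.update(custom_settings or {})
    return {
        "TrainingJobName": job_name,
        "RoleArn": role_arn,
        "ResourceConfig": settings,
    }


def test_custom_settings_reach_the_request():
    request = build_repack_job_request(
        "repack-demo",
        "arn:aws:iam::123456789012:role/Demo",
        custom_settings={"InstanceType": "ml.m5.xlarge"},
    )
    # The override is applied and untouched defaults are preserved.
    assert request["ResourceConfig"]["InstanceType"] == "ml.m5.xlarge"
    assert request["ResourceConfig"]["InstanceCount"] == 1
```

Copying the defaults before merging keeps one caller's overrides from leaking into later requests.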
Merge Checklist
Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.
General
Tests
unique_name_from_base to create resource names in integ tests (if appropriate)
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.