fix: Set ProcessingStep upload locations deterministically to avoid c… #2790

Merged
merged 6 commits into from
Dec 8, 2021

Conversation

Contributor

@staubhp staubhp commented Dec 8, 2021

…ache misses on pipeline upsert. Add a warning to cache-enabled TrainingSteps with profiling enabled

Issue #, if available:
#2736

Description of changes:
This PR addresses two areas where dynamic elements like timestamps might be inserted into pipeline definitions. These dynamic elements are a problem because they will cause cache misses when the pipeline is upserted, even if the step itself didn't change.

The first, in ProcessingStep, happens because the Processor class will upload local code entry points to an S3 URI containing a timestamp. With this PR, that S3 URI will be overridden with a location made up of the step name + the MD5 hash of the script contents.
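A minimal sketch of the idea, not the SDK's actual implementation: deriving the upload location from the step name and an MD5 digest of the script means an unchanged script always maps to the same S3 URI. The function names, the bucket name, and the exact key layout below are assumptions made for illustration.

```python
import hashlib


def script_md5(script_bytes: bytes) -> str:
    """Return the hex MD5 digest of the script contents."""
    return hashlib.md5(script_bytes).hexdigest()


def deterministic_code_uri(bucket: str, step_name: str, script_bytes: bytes, filename: str) -> str:
    """Build an S3 URI that depends only on the step name and the script contents.

    Hypothetical layout (an assumption): s3://<bucket>/<step_name>-<md5>/<filename>
    """
    return f"s3://{bucket}/{step_name}-{script_md5(script_bytes)}/{filename}"


script = b"print('preprocessing')\n"

# Same step name and same contents: the URI is identical on every call.
uri_first = deterministic_code_uri("example-bucket", "MyProcessingStep", script, "preprocess.py")
uri_second = deterministic_code_uri("example-bucket", "MyProcessingStep", script, "preprocess.py")

# Editing the script changes the digest, and therefore the URI.
uri_edited = deterministic_code_uri(
    "example-bucket", "MyProcessingStep", script + b"# edit\n", "preprocess.py"
)
```

Because `uri_first` and `uri_second` are equal, the pipeline definition stays byte-for-byte the same across upserts when the script is untouched. A real change to the script produces a different URI, so the cache is invalidated only when it should be.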

The second, in TrainingStep, happens because the Estimator class enables profiling by default, and the default profiler rule contains a timestamp. This PR adds a warning when instantiating cache-enabled TrainingSteps that have profiling enabled.
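The check described above could look roughly like the following sketch. `StandInEstimator` and `StandInCacheConfig` are hypothetical stand-ins for the SDK's estimator and cache configuration objects, the helper name is invented, and the warning text is illustrative rather than the message the SDK emits.

```python
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandInCacheConfig:
    """Hypothetical stand-in for a step's cache configuration."""
    enable_caching: bool = False


@dataclass
class StandInEstimator:
    """Hypothetical stand-in for an estimator; profiling is on unless disabled."""
    disable_profiler: bool = False


def warn_if_profiling_breaks_cache(
    estimator: StandInEstimator, cache_config: Optional[StandInCacheConfig]
) -> bool:
    """Warn when caching is enabled while the profiler is still active.

    Returns True if a warning was issued, False otherwise.
    """
    caching_on = cache_config is not None and cache_config.enable_caching
    if caching_on and not estimator.disable_profiler:
        warnings.warn(
            "Profiling is enabled on this estimator. The default profiler rule "
            "includes a timestamp, which will cause cache misses. Disable "
            "profiling if you rely on step caching.",
            UserWarning,
        )
        return True
    return False
```

Under these assumptions, constructing the estimator stand-in with `disable_profiler=True` silences the warning, which mirrors the remedy the PR description implies for cache-enabled training steps.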

Testing done:
Manual, unit test

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

  • I have read the CONTRIBUTING doc
  • I certify that the changes I am introducing will be backward compatible, and I have discussed concerns about this, if any, with the Python SDK team
  • I used the commit message format described in CONTRIBUTING
  • I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
  • I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

  • I have added tests that prove my fix is effective or that my feature works (if appropriate)
  • I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
  • I have checked that my tests are not configured for a specific region or account (if appropriate)
  • I have used unique_name_from_base to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@sagemaker-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: df2143f
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 636f74b
  • Result: FAILED
  • Build Logs (available for 30 days)

@sagemaker-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-local-mode-tests
  • Commit ID: 636f74b
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@sagemaker-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-slow-tests
  • Commit ID: 636f74b
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

aoguo64 previously approved these changes Dec 8, 2021
dpatro previously approved these changes Dec 8, 2021
qidewenwhen previously approved these changes Dec 8, 2021
jeniyat previously approved these changes Dec 8, 2021
Contributor

@jeniyat jeniyat left a comment

LGTM

@ahsan-z-khan ahsan-z-khan changed the base branch from master to dev December 8, 2021 22:29
@ahsan-z-khan ahsan-z-khan dismissed stale reviews from jeniyat, qidewenwhen, dpatro, and aoguo64 December 8, 2021 22:29

The base branch was changed.

@sagemaker-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-unit-tests
  • Commit ID: 13ca522
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

@jeniyat jeniyat merged commit ad24a9a into aws:dev Dec 8, 2021
ahsan-z-khan added a commit that referenced this pull request Dec 30, 2021
* update smdmp change log, archive api doc for 1.4.0 and 1.5.0

* add no-index flags

* finish api doc archive

* fix: Set ProcessingStep upload locations deterministically to avoid c… (#2790)

* fix: Prevent repack_model script from referencing nonexistent directories (#2755)

Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Ahsan Khan <[email protected]>

* fix: S3Input - add support for instance attributes (#2754)

* fix: typos and broken link (#2765)

Co-authored-by: Shreya Pandit <[email protected]>

* add all api docs

* add appendix, fix links

* structural changes, fix links

* incorporate feedback

* prepare release v2.72.1

* update development version to v2.72.2.dev0

Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Ahsan Khan <[email protected]>
Co-authored-by: Mufaddal Rohawala <[email protected]>
Co-authored-by: Mohamed Ali Jamaoui <[email protected]>
Co-authored-by: Shreya Pandit <[email protected]>
Co-authored-by: ci <ci>
Co-authored-by: Jeniya Tabassum <[email protected]>
mufaddal-rohawala added a commit that referenced this pull request Jan 4, 2022
shreyapandit added a commit that referenced this pull request Jan 11, 2022
EthanShouhanCheng pushed a commit to SissiChenxy/sagemaker-python-sdk that referenced this pull request Jan 11, 2022
shreyapandit added a commit that referenced this pull request Feb 3, 2022
* feature: allow conditional parellel builds (#2727)

* fix endpoint bug (#2772)

Co-authored-by: Basil Beirouti <[email protected]>

* fix: local mode - support relative file structure (#2768)

* prepare release v2.72.0

* update development version to v2.72.1.dev0

* fix: Set ProcessingStep upload locations deterministically to avoid c… (#2790)

* fix: Prevent repack_model script from referencing nonexistent directories (#2755)

Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Ahsan Khan <[email protected]>

* fix: S3Input - add support for instance attributes (#2754)

* fix: typos and broken link (#2765)

Co-authored-by: Shreya Pandit <[email protected]>

* prepare release v2.72.1

* update development version to v2.72.2.dev0

* fix: Model Registration with BYO scripts (#2797)

Co-authored-by: Basil Beirouti <[email protected]>
Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Ahsan Khan <[email protected]>
Co-authored-by: Mufaddal Rohawala <[email protected]>
Co-authored-by: Basil Beirouti <[email protected]>
Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Shreya Pandit <[email protected]>

* fix: Add ContentType in test_auto_ml_describe

* fix: Re-deploy static integ test endpoint if it is not found

* documentation :SageMaker model parallel library 1.6.0 API doc (#2814)

* update smdmp change log, archive api doc for 1.4.0 and 1.5.0

* add no-index flags

* finish api doc archive

* fix: Set ProcessingStep upload locations deterministically to avoid c… (#2790)

* fix: Prevent repack_model script from referencing nonexistent directories (#2755)

Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Ahsan Khan <[email protected]>

* fix: S3Input - add support for instance attributes (#2754)

* fix: typos and broken link (#2765)

Co-authored-by: Shreya Pandit <[email protected]>

* add all api docs

* add appendix, fix links

* structural changes, fix links

* incorporate feedback

* prepare release v2.72.1

* update development version to v2.72.2.dev0

Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Ahsan Khan <[email protected]>
Co-authored-by: Mufaddal Rohawala <[email protected]>
Co-authored-by: Mohamed Ali Jamaoui <[email protected]>
Co-authored-by: Shreya Pandit <[email protected]>
Co-authored-by: ci <ci>
Co-authored-by: Jeniya Tabassum <[email protected]>

* fix: fix kmeans test deletion sequence, increment lineage statics (#2815)

* fix: Increment static lineage pipeline (#2817)

* fix: Update CHANGELOG.md (#2832)

* prepare release v2.72.2

* update development version to v2.72.3.dev0

* change: update master from dev (#2836)

Co-authored-by: Basil Beirouti <[email protected]>
Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Ahsan Khan <[email protected]>
Co-authored-by: Mufaddal Rohawala <[email protected]>
Co-authored-by: Basil Beirouti <[email protected]>
Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Shreya Pandit <[email protected]>
Co-authored-by: Mohamed Ali Jamaoui <[email protected]>
Co-authored-by: ci <ci>
Co-authored-by: Jeniya Tabassum <[email protected]>
Co-authored-by: sreedes <[email protected]>
Co-authored-by: Navin Soni <[email protected]>
Co-authored-by: Miyoung <[email protected]>
Co-authored-by: Ameen Khan <[email protected]>
Co-authored-by: Zhankui Lu <[email protected]>
Co-authored-by: Xiaoguang Chen <[email protected]>
Co-authored-by: Jonathan Guinegagne <[email protected]>
Co-authored-by: Zhankui Lu <[email protected]>
Co-authored-by: Yifei Zhu <[email protected]>
Co-authored-by: Qingzi-Lan <[email protected]>

* prepare release v2.72.3

* update development version to v2.72.4.dev0

* fix: fixes unnecessary session call while generating pipeline definition for lambda step (#2824)

* feature: Add models_v2 under lineage context (#2800)

* feature: enable python 3.9 (#2802)

Co-authored-by: Ahsan Khan <[email protected]>

* change: Update CHANGELOG.md (#2842)

* fix: update pricing link (#2805)

Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Ahsan Khan <[email protected]>
Co-authored-by: Shreya Pandit <[email protected]>
Co-authored-by: Basil Beirouti <[email protected]>
Co-authored-by: Mufaddal Rohawala <[email protected]>
Co-authored-by: Basil Beirouti <[email protected]>
Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Mohamed Ali Jamaoui <[email protected]>
Co-authored-by: ci <ci>
Co-authored-by: Jeniya Tabassum <[email protected]>
Co-authored-by: sreedes <[email protected]>
Co-authored-by: Navin Soni <[email protected]>
Co-authored-by: Miyoung <[email protected]>
Co-authored-by: Ameen Khan <[email protected]>
Co-authored-by: Zhankui Lu <[email protected]>
Co-authored-by: Navin Soni <[email protected]>
Co-authored-by: Xiaoguang Chen <[email protected]>
Co-authored-by: Jonathan Guinegagne <[email protected]>
Co-authored-by: Zhankui Lu <[email protected]>
Co-authored-by: Yifei Zhu <[email protected]>
Co-authored-by: Qingzi-Lan <[email protected]>

* doc: Document the available ExecutionVariables (#2807)

* fix: Remove duplicate vertex/edge in query lineage (#2784)

* feature: Support model pipelines in CreateModelStep (#2845)

Co-authored-by: Payton Staub <[email protected]>

* feature: support JsonGet/Join parameterization in tuning step Hyperparameters (#2833)

* doc: Enhance smddp 1.2.2 doc (#2852)

* feature: support checkpoint to be passed from estimator (#2849)

Co-authored-by: marckarp <[email protected]>

* fix: allow kms_key to be passed for processing step (#2779)

* feature: Adds support for Serverless inference (#2831)

* feature: Add support for SageMaker lineage queries in action (#2853)

* feature: Adds Lineage queries in artifact, context and trial components (#2838)

* feature: Add EMRStep support in Sagemaker pipeline (#2848)

Co-authored-by: chenxy <[email protected]>

* prepare release v2.73.0

* update development version to v2.73.1.dev0

* feature: Add support for SageMaker lineage queries context (#2830)

* fix: support specifying a facet by its column index

Currently the Clarify BiasConfig only accepts facet name. Actually
Clarify analysis configuration supports both name and index. This
commit adds the same support to BiasConfig.

* doc: more documentation for serverless inference (#2859)

* prepare release v2.74.0

* update development version to v2.74.1.dev0

* Add deprecation warning in Clarify DataConfig (#2847)

* feature: Update instance types for integ test (#2881)

* feature: Adds support for async inference (#2846)

* fix: update to incorporate black v22, pin tox versions (#2889)

Co-authored-by: Mufaddal Rohawala <[email protected]>

* make black happy

Co-authored-by: Mufaddal Rohawala <[email protected]>
Co-authored-by: Basil Beirouti <[email protected]>
Co-authored-by: Basil Beirouti <[email protected]>
Co-authored-by: ci <ci>
Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Payton Staub <[email protected]>
Co-authored-by: Ahsan Khan <[email protected]>
Co-authored-by: Mohamed Ali Jamaoui <[email protected]>
Co-authored-by: Shreya Pandit <[email protected]>
Co-authored-by: sreedes <[email protected]>
Co-authored-by: Navin Soni <[email protected]>
Co-authored-by: Miyoung <[email protected]>
Co-authored-by: Jeniya Tabassum <[email protected]>
Co-authored-by: Ameen Khan <[email protected]>
Co-authored-by: Zhankui Lu <[email protected]>
Co-authored-by: Xiaoguang Chen <[email protected]>
Co-authored-by: Jonathan Guinegagne <[email protected]>
Co-authored-by: Zhankui Lu <[email protected]>
Co-authored-by: Yifei Zhu <[email protected]>
Co-authored-by: Qingzi-Lan <[email protected]>
Co-authored-by: Xinghan Chen <[email protected]>
Co-authored-by: Navin Soni <[email protected]>
Co-authored-by: Tulio Casagrande <[email protected]>
Co-authored-by: jerrypeng7773 <[email protected]>
Co-authored-by: marckarp <[email protected]>
Co-authored-by: marckarp <[email protected]>
Co-authored-by: jayatalr <[email protected]>
Co-authored-by: bhaoz <[email protected]>
Co-authored-by: Ethan Cheng <[email protected]>
Co-authored-by: chenxy <[email protected]>
Co-authored-by: Xiaoguang Chen <[email protected]>
Co-authored-by: keerthanvasist <[email protected]>
Co-authored-by: Mufaddal Rohawala <[email protected]>
Co-authored-by: Shreya Pandit <[email protected]>

9 participants