TrainingStep changing sagemaker_job_name input parameter with each pipeline update [kills caching] #2940

Closed
mpaf opened this issue Feb 16, 2022 · 1 comment
Labels
component: pipelines Relates to the SageMaker Pipeline Platform type: bug

Comments


mpaf commented Feb 16, 2022

Describe the bug
During pipeline upsert, the SDK generates a training job name and adds it to the input parameters of the SageMaker training step, which defeats step caching even when nothing else has changed. The input parameter in question is the sagemaker_job_name hyperparameter. Its value changes on every upsert, even when the pipeline itself is unchanged.

To reproduce

  1. Create a Pipeline with a single training step.
  2. Run pipeline.steps[0].estimator._hyperparameters['sagemaker_job_name'].
  3. This raises an error because that hyperparameter does not exist yet.
  4. Run pipeline.definition() or pipeline.upsert().
  5. Run step 2 again. The hyperparameter now exists and holds a job name (not the actual job name the pipeline will run with).
  6. Repeating steps 4 and 5 gives a different value every time, even with no changes to the pipeline.
  7. As a result, a no-op update to the training step invalidates the pipeline cache.
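
One way to confirm which field moves between two calls in step 4 is to diff the two definitions. The following is a minimal sketch, not part of the SDK: `changed_paths` is a hypothetical helper, and the two JSON strings are heavily trimmed, made-up stand-ins for what `pipeline.definition()` returns (the real structure and job name values will differ).

```python
import json


def changed_paths(old, new, prefix=""):
    """Return the dotted paths at which two parsed JSON values differ."""
    if isinstance(old, dict) and isinstance(new, dict):
        paths = []
        for key in sorted(set(old) | set(new)):
            child = f"{prefix}.{key}" if prefix else key
            if key not in old or key not in new:
                # Key exists on one side only.
                paths.append(child)
            else:
                paths.extend(changed_paths(old[key], new[key], child))
        return paths
    if isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        paths = []
        for index, (a, b) in enumerate(zip(old, new)):
            paths.extend(changed_paths(a, b, f"{prefix}[{index}]"))
        return paths
    # Scalars, mismatched types, or lists of different length.
    return [] if old == new else [prefix]


def make_definition(job_name):
    # Hypothetical, trimmed stand-in for a pipeline definition (assumption).
    return json.dumps({"Steps": [{
        "Name": "Train",
        "Arguments": {"HyperParameters": {
            "epochs": "10",
            "sagemaker_job_name": job_name,
        }},
    }]})


first = make_definition("example-job-0001")   # made-up value
second = make_definition("example-job-0002")  # made-up value

diff = changed_paths(json.loads(first), json.loads(second))
print(diff)
```

With a real pipeline, the two inputs would be the strings returned by two consecutive `pipeline.definition()` calls. Any path that shows up without a code change is a candidate for breaking the cache.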

Expected behavior

The training job name is a dynamic parameter that should be created at pipeline execution time, not at pipeline creation/update time.

Screenshots or logs
NA

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.75.1
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch Framework Estimator
  • Framework version: 1.9.0
  • Python version: 3.7.1
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N

Additional context
N/A

@mpaf mpaf added the type: bug label Feb 16, 2022
@aoguo64 aoguo64 added the component: pipelines Relates to the SageMaker Pipeline Platform label Feb 17, 2022
Contributor

staubhp commented Feb 22, 2022

Thanks for pointing this out. The above PR should address this.

Also note that frameworks run in script mode insert other dynamic hyperparameters (model_dir, sagemaker_submit_directory), which also break caching. To fix these, users need to set model_dir=False and assign the source_dir attribute on the estimator, respectively.
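
As a quick check before upserting, the hyperparameter names mentioned in this thread can be looked for in an estimator's hyperparameters. This is only a sketch under assumptions: `find_dynamic_hyperparameters` is a hypothetical helper, the key list is just the three names reported here, and the sample dictionary is made up. A hit means the key is worth inspecting, not that it is guaranteed to change.

```python
# Hyperparameter names this thread reports as being generated by the SDK.
DYNAMIC_KEYS = ("sagemaker_job_name", "model_dir", "sagemaker_submit_directory")


def find_dynamic_hyperparameters(hyperparameters):
    """Return the reported dynamic keys that are present, in DYNAMIC_KEYS order."""
    return [key for key in DYNAMIC_KEYS if key in hyperparameters]


# Made-up example standing in for estimator._hyperparameters (assumption).
sample = {
    "epochs": "10",
    "sagemaker_job_name": "example-job-0001",
    "sagemaker_submit_directory": "s3://example-bucket/example-key",
}

suspects = find_dynamic_hyperparameters(sample)
print(suspects)
```

On a real pipeline, the argument would be `pipeline.steps[0].estimator._hyperparameters`, as in the reproduction steps. Note that this is a private attribute and may change between SDK versions.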
