Skip to content

Passing instance type as a ParameterString to PyTorch estimator in 2.163.0 crashes #3917

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pviolette3 opened this issue Jun 7, 2023 · 6 comments
Labels
component: pipelines Relates to the SageMaker Pipeline Platform type: bug

Comments

@pviolette3
Copy link

pviolette3 commented Jun 7, 2023

Describe the bug
Passing instance type to PyTorch estimator that is a ParameterString causes crashes

To reproduce
Pass a ParameterString as instance_type to the estimator in pipeline mode to run the pipeline with different instance types gives:

TypeError: Pipeline variables do not support str operation. Please use .to_string() to convert it to string type in execution timeor use .expr to translate it to Json for display purpose in Python SDK.

Expected behavior
Expected that it is ok to pass instance type as a pipeline parameter

Screenshots or logs

/root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/_pytest/fixtures.py:909:
--
497 | _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
498 | e2e_model_building/test/acceptance/test_e2e_model_building.py:129: in _
499 | pipeline.create(fake_sagemaker_config.execution_role)
500 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/workflow/pipeline.py:147: in create
501 | kwargs = self._create_args(role_arn, description, parallelism_config)
502 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/workflow/pipeline.py:169: in _create_args
503 | pipeline_definition = self.definition()
504 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/workflow/pipeline.py:366: in definition
505 | request_dict = self.to_request()
506 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/workflow/pipeline.py:108: in to_request
507 | "Steps": build_steps(self.steps, self.name),
508 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/workflow/utilities.py:100: in build_steps
509 | request_dicts.append(step.to_request())
510 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/workflow/steps.py:508: in to_request
511 | request_dict = super().to_request()
512 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/workflow/steps.py:352: in to_request
513 | step_dict = super().to_request()
514 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/workflow/steps.py:121: in to_request
515 | "Arguments": self.arguments,
516 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/workflow/steps.py:481: in arguments
517 | execute_job_functions(self.step_args)
518 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/workflow/utilities.py:408: in execute_job_functions
519 | chained_args = step_args.func(*step_args.func_args, **step_args.func_kwargs)
520 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/estimator.py:1243: in fit
521 | self.latest_training_job = _TrainingJob.start_new(self, inputs, experiment_config)
522 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/estimator.py:2179: in start_new
523 | estimator.sagemaker_session.train(**train_args)
524 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/session.py:822: in train
525 | and not instance_supports_kms(inferred_resource_config["InstanceType"])
526 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/utils.py:1409: in instance_supports_kms
527 | return volume_size_supported(instance_type)
528 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/utils.py:1400: in volume_size_supported
529 | raise ValueError(f"Failed to parse instance type '{instance_type}': {str(e)}")
530 | _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
531 |  
532 | self = ParameterString(name='TrainInstanceType', parameter_type=<ParameterTypeEnum.STRING: 'String'>, default_value='ml.p3.8xlarge')
533 |  
534 | def __str__(self):
535 | """Override built-in String function for PipelineVariable"""
536 | >       raise TypeError(
537 | "Pipeline variables do not support __str__ operation. "
538 | "Please use `.to_string()` to convert it to string type in execution time"
539 | "or use `.expr` to translate it to Json for display purpose in Python SDK."
540 | )
541 | E       TypeError: Pipeline variables do not support __str__ operation. Please use `.to_string()` to convert it to string type in execution timeor use `.expr` to translate it to Json for display purpose in Python SDK.
542 |  
543 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/workflow/entities.py:86: TypeError
544 | ----------------------------- Captured stdout call -----------------------------

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: sagemaker-2.163.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch
  • Framework version: 1.13
  • Python version: 3.9
  • CPU or GPU: CPU
  • Custom Docker image (Y/N):

Additional context
Suspect #3870 affected this since it recently changed this code to provide better validation for users trying to set volume size.

This change also recently started failing in the CI on sagemaker-2.163.0

Also using moto 4.1.11 for the unit test - perhaps some issues there too.

@pviolette3
Copy link
Author

pviolette3 commented Jun 7, 2023

I'm also using Moto 4.1.11 in these tests - so it is possible that is breaking something. Still trying to debug

Collecting sagemaker
--
30 | Downloading sagemaker-2.163.0.tar.gz (800 kB)
31 | Preparing metadata (setup.py): started
32 | Preparing metadata (setup.py): finished with status 'done'
33 | Collecting moto
34 | Downloading moto-4.1.11-py2.py3-none-any.whl (3.0 MB)

I'm also seeing:

instance_type = ParameterString(name='TrainInstanceType', parameter_type=<ParameterTypeEnum.STRING: 'String'>, default_value='ml.p3.8xlarge')
--
266 |  
267 | def volume_size_supported(instance_type: str) -> bool:
268 | """Returns True if SageMaker allows volume_size to be used for the instance type.
269 |  
270 | Raises:
271 | ValueError: If the instance type is improperly formatted.
272 | """
273 |  
274 | try:
275 |  
276 | # local mode does not support volume size
277 | >           if instance_type.startswith("local"):
278 | E           AttributeError: 'ParameterString' object has no attribute 'startswith'
279 |  
280 | /root/.pyenv/versions/3.9.16/lib/python3.9/site-packages/sagemaker/utils.py:1384: AttributeError

@pviolette3 pviolette3 changed the title Passing instance type as a ParameterString in 2.163.0 crashes Passing instance type as a ParameterString to PyTorch estimator in 2.163.0 crashes Jun 7, 2023
@pviolette3
Copy link
Author

Sagemaker version 2.132.0 is working for us.

sagemaker==2.132.0

@jerrypeng7773 jerrypeng7773 added the component: pipelines Relates to the SageMaker Pipeline Platform label Jun 19, 2023
@jerrypeng7773
Copy link
Contributor

@pviolette3 your suspicion is correct, this is a regression introduced by #3870. Let me pr the fix. Also, pipeline team are working on some systematic compatibility tests to prevent such issue occur again. However, this requires some significant efforts.

@jerrypeng7773
Copy link
Contributor

We have a pr to fix this issue #3972

@brockwade633
Copy link
Contributor

The fix has been merged, @pviolette3 are you able to verify expected behavior?

@qidewenwhen
Copy link
Member

Closing the issue as no response for a month. Feel free to reopen if you have any other questions. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: pipelines Relates to the SageMaker Pipeline Platform type: bug
Projects
None yet
Development

No branches or pull requests

4 participants