Error when Estimator instance_type is a ParameterString #3968

Closed
fulcus opened this issue Jun 29, 2023 · 1 comment · Fixed by #3972
Labels
component: pipelines Relates to the SageMaker Pipeline Platform type: bug

Comments

@fulcus
fulcus commented Jun 29, 2023

Describe the bug

Passing a ParameterString as instance_type causes an error, although it should be supported: according to the docs, ParameterString is a subclass of PipelineVariable.
Passing a plain string instead works fine.

To reproduce

import sagemaker
from sagemaker.workflow.parameters import ParameterString

# Pipeline parameter that is resolved only at execution time
train_instance_type_param = ParameterString(
    name="TrainingInstanceType",
    default_value="ml.m5.xlarge",
)

estimator = sagemaker.estimator.Estimator(
    image_uri=xgboost_image_uri,
    role=sm_role,
    instance_type=train_instance_type_param,  # CAUSE OF THE BUG
    instance_count=train_instance_count_param,
    output_path=output_s3_url,
    sagemaker_session=session,
    base_job_name=f"{pipeline_name}/train",
)
Calling pipeline.upsert() then throws the following error:
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/opt/conda/lib/python3.7/site-packages/sagemaker/utils.py in volume_size_supported(instance_type)
   1405         # local mode does not support volume size
-> 1406         if instance_type.startswith("local"):
   1407             return False

AttributeError: 'ParameterString' object has no attribute 'startswith'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-39-6027fab2ad80> in <module>
      1 # Create a new or update existing Pipeline
----> 2 pipeline.upsert(role_arn=sm_role)

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/pipeline.py in upsert(self, role_arn, description, tags, parallelism_config)
    276             raise ValueError("An AWS IAM role is required to create or update a Pipeline.")
    277         try:
--> 278             response = self.create(role_arn, description, tags, parallelism_config)
    279         except ClientError as ce:
    280             error_code = ce.response["Error"]["Code"]

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/pipeline.py in create(self, role_arn, description, tags, parallelism_config)
    145         tags = _append_project_tags(tags)
    146         tags = self.sagemaker_session._append_sagemaker_config_tags(tags, PIPELINE_TAGS_PATH)
--> 147         kwargs = self._create_args(role_arn, description, parallelism_config)
    148         update_args(
    149             kwargs,

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/pipeline.py in _create_args(self, role_arn, description, parallelism_config)
    167             A keyword argument dict for calling create_pipeline.
    168         """
--> 169         pipeline_definition = self.definition()
    170         kwargs = dict(
    171             PipelineName=self.name,

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/pipeline.py in definition(self)
    364     def definition(self) -> str:
    365         """Converts a request structure to string representation for workflow service calls."""
--> 366         request_dict = self.to_request()
    367         self._interpolate_step_collection_name_in_depends_on(request_dict["Steps"])
    368         request_dict["PipelineExperimentConfig"] = interpolate(

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/pipeline.py in to_request(self)
    106             if self.pipeline_experiment_config is not None
    107             else None,
--> 108             "Steps": build_steps(self.steps, self.name),
    109         }
    110 

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/utilities.py in build_steps(steps, pipeline_name)
     98                 pipeline_name, step.name, get_code_hash(step), get_config_hash(step)
     99             ):
--> 100                 request_dicts.append(step.to_request())
    101     return request_dicts
    102 

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/steps.py in to_request(self)
    506     def to_request(self) -> RequestType:
    507         """Updates the request dictionary with cache configuration."""
--> 508         request_dict = super().to_request()
    509         if self.cache_config:
    510             request_dict.update(self.cache_config.config)

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/steps.py in to_request(self)
    350     def to_request(self) -> RequestType:
    351         """Gets the request structure for `ConfigurableRetryStep`."""
--> 352         step_dict = super().to_request()
    353         if self.retry_policies:
    354             step_dict["RetryPolicies"] = self._resolve_retry_policy(self.retry_policies)

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/steps.py in to_request(self)
    119             "Name": self.name,
    120             "Type": self.step_type.value,
--> 121             "Arguments": self.arguments,
    122         }
    123         if self.depends_on:

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/steps.py in arguments(self)
    479             # execute fit function with saved parameters,
    480             # and store args in PipelineSession's _context
--> 481             execute_job_functions(self.step_args)
    482 
    483             # populate request dict with args

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/utilities.py in execute_job_functions(step_args)
    406     """
    407 
--> 408     chained_args = step_args.func(*step_args.func_args, **step_args.func_kwargs)
    409     if isinstance(chained_args, _StepArguments):
    410         execute_job_functions(chained_args)

/opt/conda/lib/python3.7/site-packages/sagemaker/estimator.py in fit(self, inputs, wait, logs, job_name, experiment_config)
   1267 
   1268         experiment_config = check_and_get_run_experiment_config(experiment_config)
-> 1269         self.latest_training_job = _TrainingJob.start_new(self, inputs, experiment_config)
   1270         self.jobs.append(self.latest_training_job)
   1271         if wait:

/opt/conda/lib/python3.7/site-packages/sagemaker/estimator.py in start_new(cls, estimator, inputs, experiment_config)
   2213         train_args = cls._get_train_args(estimator, inputs, experiment_config)
   2214 
-> 2215         estimator.sagemaker_session.train(**train_args)
   2216 
   2217         return cls(estimator.sagemaker_session, estimator._current_job_name)

/opt/conda/lib/python3.7/site-packages/sagemaker/session.py in train(self, input_mode, input_config, role, job_name, output_config, resource_config, vpc_config, hyperparameters, stop_condition, tags, metric_definitions, enable_network_isolation, image_uri, training_image_config, container_entry_point, container_arguments, algorithm_arn, encrypt_inter_container_traffic, use_spot_instances, checkpoint_s3_uri, checkpoint_local_path, experiment_config, debugger_rule_configs, debugger_hook_config, tensorboard_output_config, enable_sagemaker_metrics, profiler_rule_configs, profiler_config, environment, retry_strategy)
    841             not customer_supplied_kms_key
    842             and "InstanceType" in inferred_resource_config
--> 843             and not instance_supports_kms(inferred_resource_config["InstanceType"])
    844             and "VolumeKmsKeyId" in inferred_resource_config
    845         ):

/opt/conda/lib/python3.7/site-packages/sagemaker/utils.py in instance_supports_kms(instance_type)
   1429         ValueError: If the instance type is improperly formatted.
   1430     """
-> 1431     return volume_size_supported(instance_type)

/opt/conda/lib/python3.7/site-packages/sagemaker/utils.py in volume_size_supported(instance_type)
   1420         return "d" not in family and not family.startswith("g5")
   1421     except Exception as e:
-> 1422         raise ValueError(f"Failed to parse instance type '{instance_type}': {str(e)}")
   1423 
   1424 

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/entities.py in __str__(self)
     85         """Override built-in String function for PipelineVariable"""
     86         raise TypeError(
---> 87             "Pipeline variables do not support __str__ operation. "
     88             "Please use `.to_string()` to convert it to string type in execution time"
     89             "or use `.expr` to translate it to Json for display purpose in Python SDK."

TypeError: Pipeline variables do not support __str__ operation. Please use `.to_string()` to convert it to string type in execution timeor use `.expr` to translate it to Json for display purpose in Python SDK.
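
The traceback shows two failures chained together: the `startswith` call raises an AttributeError, and the `except` block then formats `instance_type` into an f-string, which calls `__str__` and raises the TypeError that finally surfaces. The sketch below reproduces that chain without the SDK. `FakePipelineVariable` and `describe_failure` are hypothetical names for illustration, and the body of `volume_size_supported` is a simplified guess based only on the lines visible in the traceback, not the actual SDK source.

```python
class FakePipelineVariable:
    """Hypothetical stand-in for a pipeline variable (not the SDK class)."""

    def __str__(self):
        # Mirrors the behaviour shown in entities.py in the traceback.
        raise TypeError("Pipeline variables do not support __str__ operation.")


def volume_size_supported(instance_type):
    """Simplified version of the check in the traceback (assumed body)."""
    try:
        # A non-string object has no startswith, so this raises AttributeError.
        if instance_type.startswith("local"):
            return False
        family = instance_type.split(".")[1]
        return "d" not in family and not family.startswith("g5")
    except Exception as e:
        # Formatting instance_type calls its __str__, which raises TypeError
        # before the ValueError can even be constructed.
        raise ValueError(f"Failed to parse instance type '{instance_type}': {str(e)}")


def describe_failure(value):
    """Return the names of the surfaced exception and the one it hides."""
    try:
        return volume_size_supported(value)
    except Exception as exc:
        return type(exc).__name__, type(exc.__context__).__name__


print(describe_failure(FakePipelineVariable()))
print(volume_size_supported("ml.m5.xlarge"))
```

With a plain string the function returns normally, which matches the observation that passing a string works; with the stand-in object the intended ValueError is never raised because building its message fails first.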

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.168.0
  • Python version: 3.7.10
@jerrypeng7773
Contributor

We have a PR to fix this issue: #3972
