Passing processor image uri as a ParameterString fails #3152

Closed
lkev opened this issue Jun 1, 2022 · 4 comments
Labels
component: pipelines Relates to the SageMaker Pipeline Platform type: bug

Comments

@lkev
lkev commented Jun 1, 2022

Describe the bug

Passing a ParameterString object as the image_uri of a Processor fails when the pipeline definition is generated. The SDK tries to parse the image URI as a plain string, but the value is a ParameterString, so the parse raises an error.

Important note: this fails on the latest version, 2.92.2, but not on 2.91.2.dev0 (the version installed from #3111).

To reproduce

import sagemaker
from sagemaker.workflow.parameters import ParameterString
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep
from sagemaker.processing import Processor

print(sagemaker.__version__)

param_estimator_image_uri = ParameterString(
    name="estimator_image_uri",
)
proc = Processor(
    image_uri=param_estimator_image_uri,
    role="arn:aws:iam::942338063951:role/datascience-sagemaker",
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
)

proc_step = ProcessingStep("proc_step", processor=proc)

pipe = Pipeline(name="TestPipe", steps=[proc_step])
print(pipe.definition())

Expected behavior

I expect the following output:

2.92.2
{"Version": "2020-12-01", "Metadata": {}, "Parameters": [], "PipelineExperimentConfig": {"ExperimentName": {"Get": "Execution.PipelineName"}, "TrialName": {"Get": "Execution.PipelineExecutionId"}}, "Steps": [{"Name": "proc_step", "Type": "Processing", "Arguments": {"ProcessingResources": {"ClusterConfig": {"InstanceType": "ml.g4dn.xlarge", "InstanceCount": 1, "VolumeSizeInGB": 30}}, "AppSpecification": {"ImageUri": {"Get": "Parameters.estimator_image_uri"}}, "RoleArn": "arn:aws:iam::942338063951:role/datascience-sagemaker"}}]}

Screenshots or logs

I get the following error:

2.92.2
Traceback (most recent call last):
  File "/Users/nh105218/repos/fraud_detector/smtest.py", line 22, in <module>
    print(pipe.definition())
  File "/opt/miniconda3/envs/env2/lib/python3.9/site-packages/sagemaker/workflow/pipeline.py", line 292, in definition
    request_dict = self.to_request()
  File "/opt/miniconda3/envs/env2/lib/python3.9/site-packages/sagemaker/workflow/pipeline.py", line 91, in to_request
    "Steps": list_to_request(self.steps),
  File "/opt/miniconda3/envs/env2/lib/python3.9/site-packages/sagemaker/workflow/utilities.py", line 43, in list_to_request
    request_dicts.append(entity.to_request())
  File "/opt/miniconda3/envs/env2/lib/python3.9/site-packages/sagemaker/workflow/steps.py", line 754, in to_request
    request_dict = super(ProcessingStep, self).to_request()
  File "/opt/miniconda3/envs/env2/lib/python3.9/site-packages/sagemaker/workflow/steps.py", line 226, in to_request
    step_dict = super().to_request()
  File "/opt/miniconda3/envs/env2/lib/python3.9/site-packages/sagemaker/workflow/steps.py", line 109, in to_request
    "Arguments": self.arguments,
  File "/opt/miniconda3/envs/env2/lib/python3.9/site-packages/sagemaker/workflow/steps.py", line 730, in arguments
    normalized_inputs, normalized_outputs = self.processor._normalize_args(
  File "/opt/miniconda3/envs/env2/lib/python3.9/site-packages/sagemaker/processing.py", line 240, in _normalize_args
    self._current_job_name = self._generate_current_job_name(job_name=job_name)
  File "/opt/miniconda3/envs/env2/lib/python3.9/site-packages/sagemaker/processing.py", line 283, in _generate_current_job_name
    base_name = base_name_from_image(self.image_uri)
  File "/opt/miniconda3/envs/env2/lib/python3.9/site-packages/sagemaker/utils.py", line 102, in base_name_from_image
    m = re.match("^(.+/)?([^:/]+)(:[^:]+)?$", image)
  File "/opt/miniconda3/envs/env2/lib/python3.9/re.py", line 191, in match
    return _compile(pattern, flags).match(string)
TypeError: expected string or bytes-like object

When debugging, the value of image passed to base_name_from_image(image) is not a string but the parameter object itself: ParameterString(name='estimator_image_uri', parameter_type=<ParameterTypeEnum.STRING: 'String'>, default_value=None)
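
The failure can be reproduced without the SDK. The sketch below is a minimal illustration that assumes only the regular expression shown in the traceback. FakeParameter is a hypothetical stand-in for ParameterString, and base_name_from_image here is a simplified reimplementation, not the SDK's code.

```python
import re

# Pattern copied from the traceback (sagemaker/utils.py, base_name_from_image).
IMAGE_PATTERN = "^(.+/)?([^:/]+)(:[^:]+)?$"


class FakeParameter:
    """Hypothetical stand-in for ParameterString; not the SDK class."""

    def __init__(self, name):
        self.name = name


def base_name_from_image(image):
    """Simplified reimplementation: return the repository part of an image URI."""
    m = re.match(IMAGE_PATTERN, image)
    return m.group(2) if m else image


def try_base_name(image):
    """Return the base name, or the exception type name if matching fails."""
    try:
        return base_name_from_image(image)
    except TypeError as exc:
        return type(exc).__name__


# A plain string is parsed as expected.
plain = try_base_name("123456789012.dkr.ecr.us-east-1.amazonaws.com/my-image:latest")
# A non-string object makes re.match raise TypeError, as in the traceback.
param = try_base_name(FakeParameter("estimator_image_uri"))
print(plain, param)
```

Any object that is neither str nor bytes-like triggers the same TypeError, so the problem is the missing type check before the regex call rather than anything specific to the parameter's name or value.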

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.92.2
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): N/A
  • Framework version: N/A
  • Python version: 3.9.6
  • CPU or GPU: N/A
  • Custom Docker image (Y/N): N

lkev added the type: bug label Jun 1, 2022
qidewenwhen added the component: pipelines (Relates to the SageMaker Pipeline Platform) label Jun 1, 2022

@qidewenwhen
Member

qidewenwhen commented Jun 1, 2022

Hi @lkev, thanks for helping with the debugging and for the detailed description. You're right, the parameterized image_uri is not handled properly.
To work around this issue, could you pass in the job_name using the new ProcessingStep interface? See the sample below (it worked on my side):

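    from sagemaker.processing import Processor
    from sagemaker.workflow.parameters import ParameterString
    from sagemaker.workflow.pipeline import Pipeline
    from sagemaker.workflow.pipeline_context import PipelineSession
    from sagemaker.workflow.steps import ProcessingStep

    # Assumed setup: the original sample used pipeline_session without defining it.
    # With a PipelineSession, proc.run() returns step arguments instead of starting a job.
    pipeline_session = PipelineSession()

    param_estimator_image_uri = ParameterString(
        name="estimator_image_uri",
    )
    proc = Processor(
        image_uri=param_estimator_image_uri,
        role="arn:aws:iam::942338063951:role/datascience-sagemaker",
        instance_count=1,
        instance_type="ml.g4dn.xlarge",
        sagemaker_session=pipeline_session,
    )
    # Passing job_name explicitly is the workaround for the parameterized image_uri.
    step_args = proc.run(
        job_name="job_name",
    )
    proc_step = ProcessingStep("proc_step", step_args=step_args)

    pipe = Pipeline(name="TestPipe", steps=[proc_step])
    print(pipe.definition())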

We will open a PR to fix this issue.
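
Why an explicit job_name sidesteps the error can be sketched as follows. This is a simplified model, not the SDK source: it assumes the name generator returns an explicit job_name before it ever inspects image_uri, which is consistent with the traceback but is not confirmed in this thread. FakeParameter and generate_job_name are hypothetical names.

```python
import re


class FakeParameter:
    """Hypothetical stand-in for ParameterString; not the SDK class."""


def generate_job_name(image_uri, job_name=None):
    """Simplified model (assumption): an explicit job_name short-circuits parsing."""
    if job_name is not None:
        return job_name
    # Only reached without job_name; raises TypeError for a non-string image_uri.
    m = re.match("^(.+/)?([^:/]+)(:[^:]+)?$", image_uri)
    return m.group(2) if m else image_uri


image = FakeParameter()

# With an explicit job_name the image URI is never parsed.
with_name = generate_job_name(image, job_name="job_name")

# Without it, the parameter object reaches re.match and fails.
try:
    generate_job_name(image)
    failed = False
except TypeError:
    failed = True

print(with_name, failed)
```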

@qidewenwhen
Member

The PR to fix this issue: #3158

@qidewenwhen
Member

The PR was merged recently. The fix will be released within a week.

@qidewenwhen
Member

Hi, the fix has been released in v2.100.0: https://github.com/aws/sagemaker-python-sdk/releases/tag/v2.100.0
Can you confirm if it works for you?
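
To check whether an installed SDK is at or after the release named above, a version comparison along these lines may help. It is only a sketch: has_fix_release and numeric_version are hypothetical helpers, and they compare the leading numeric components because a plain string comparison would rank "2.92.2" above "2.100.0".

```python
import re

FIX_RELEASE = (2, 100, 0)  # v2.100.0, the release named in this thread


def numeric_version(version):
    """Parse leading numeric components: '2.91.2.dev0' -> (2, 91, 2)."""
    parts = []
    for piece in version.split("."):
        m = re.match(r"\d+", piece)
        if not m:
            break
        parts.append(int(m.group()))
    return tuple(parts)


def has_fix_release(version):
    """True if the version is at or after the release that carries the fix."""
    return numeric_version(version) >= FIX_RELEASE


print(has_fix_release("2.92.2"))   # False
print(has_fix_release("2.100.0"))  # True
```

In practice the argument would be sagemaker.__version__, as printed by the reproduction script. The check only says whether the release with the fix has been reached; it does not say whether an older build was affected.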
