Passing ParameterString for Estimator output_path fails in Pipeline definition #3104

Closed
lkev opened this issue May 12, 2022 · 5 comments
Labels: component: pipelines, type: bug

Comments


lkev commented May 12, 2022

Describe the bug
Passing a ParameterString in the output_path argument of an Estimator throws an error when the estimator is defined as part of a pipeline.

This is because the estimator attempts to parse the ParameterString as an S3 URL, which fails since the parameter has no concrete value at definition time.

To reproduce

from sagemaker.workflow.parameters import ParameterString
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep
from sagemaker.estimator import Estimator

param_training_output_s3_uri = ParameterString(
    name="training_output_s3_uri"
)
est = Estimator(
    role="test",
    image_uri=(
        "763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-"
        "training:2.8.0-gpu-py39-cu112-ubuntu20.04-sagemaker"
    ),
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    entry_point="training.py",
    source_dir="pipeline/training",
    output_path=param_training_output_s3_uri,
)

est_step = TrainingStep(name="fd_training_step", estimator=est)

pipe = Pipeline(name="TestPipe", steps=[est_step])
print(pipe.definition())

Expected behavior
The pipeline definition is printed out.
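For context, the intent is that the concrete output location is supplied only when an execution is started. A minimal sketch of that usage, assuming the definition succeeded (the role ARN and S3 URI below are placeholders, not from this report):

# Hypothetical follow-up: the ParameterString is only resolved when an execution starts.
pipe.upsert(role_arn="arn:aws:iam::123456789012:role/MySageMakerRole")  # placeholder role
execution = pipe.start(
    parameters={"training_output_s3_uri": "s3://my-bucket/training-output/"}  # placeholder URI
)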

Screenshots or logs

The ParameterString gets passed to parse_s3_url(self.output_path) in the Estimator (estimator.py line 699). However, it can't be parsed as an S3 URL because it is a ParameterString object, not a resolved string.
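For illustration, parse_s3_url only succeeds on a literal s3:// URI string; a ParameterString has no concrete value at definition time, so the scheme check fails (a minimal sketch, not part of the original report):

from sagemaker.s3 import parse_s3_url
from sagemaker.workflow.parameters import ParameterString

# A literal S3 URI parses fine into (bucket, key_prefix).
bucket, key_prefix = parse_s3_url("s3://my-bucket/my-prefix")

# A ParameterString is not a resolved S3 URI, so the parsed scheme is not "s3"
# and parse_s3_url raises ValueError, as shown in the traceback below.
parse_s3_url(ParameterString(name="training_output_s3_uri"))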

See the error log produced by the code above:

Traceback (most recent call last):
  File "full_pipeline.py", line 348, in <module>
    main()
  File "full_pipeline.py", line 344, in main
    print(pipe.definition())
  File "/opt/miniconda3/envs/newenv/lib/python3.9/site-packages/sagemaker/workflow/pipeline.py", line 301, in definition
    request_dict = self.to_request()
  File "/opt/miniconda3/envs/newenv/lib/python3.9/site-packages/sagemaker/workflow/pipeline.py", line 91, in to_request
    "Steps": list_to_request(self.steps),
  File "/opt/miniconda3/envs/newenv/lib/python3.9/site-packages/sagemaker/workflow/utilities.py", line 37, in list_to_request
    request_dicts.append(entity.to_request())
  File "/opt/miniconda3/envs/newenv/lib/python3.9/site-packages/sagemaker/workflow/steps.py", line 314, in to_request
    request_dict = super().to_request()
  File "/opt/miniconda3/envs/newenv/lib/python3.9/site-packages/sagemaker/workflow/steps.py", line 214, in to_request
    step_dict = super().to_request()
  File "/opt/miniconda3/envs/newenv/lib/python3.9/site-packages/sagemaker/workflow/steps.py", line 103, in to_request
    "Arguments": self.arguments,
  File "/opt/miniconda3/envs/newenv/lib/python3.9/site-packages/sagemaker/workflow/steps.py", line 298, in arguments
    self.estimator._prepare_for_training()
  File "/opt/miniconda3/envs/newenv/lib/python3.9/site-packages/sagemaker/estimator.py", line 659, in _prepare_for_training
    self.uploaded_code = self._stage_user_code_in_s3()
  File "/opt/miniconda3/envs/newenv/lib/python3.9/site-packages/sagemaker/estimator.py", line 699, in _stage_user_code_in_s3
    code_bucket, _ = parse_s3_url(self.output_path)
  File "/opt/miniconda3/envs/newenv/lib/python3.9/site-packages/sagemaker/s3.py", line 39, in parse_s3_url
    raise ValueError("Expecting 's3' scheme, got: {} in {}.".format(parsed_url.scheme, url))
ValueError: Expecting 's3' scheme, got:  in .

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.72.3
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): N/A
  • Framework version: N/A
  • Python version: 3.9
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N

Additional context
Potentially related to #3091 or #3078.

lkev added the type: bug label May 12, 2022
jerrypeng7773 added the component: pipelines label May 12, 2022
@jerrypeng7773 (Contributor)

related #3078

@jerrypeng7773 (Contributor)

@lkev PR #3111 should fix this issue. Can you try it out and let us know?


lkev commented Jun 1, 2022

@lkev PR #3111 should fix this issue. Can you try it out and let us know?

So I came across two issues here:

Issue 1

  • I set up a fresh environment (Python 3.9.0) and installed sagemaker from this PR using pip install . (SDK version is 2.91.2.dev0)
  • Ran import sagemaker and got the following error. This is probably a requirements issue unrelated to this specific one.
Traceback (most recent call last):
  File "/smtest.py", line 1, in <module>
    import sagemaker
  File "/opt/miniconda3/envs/smenv2/lib/python3.9/site-packages/sagemaker/__init__.py", line 18, in <module>
    from sagemaker import estimator, parameter, tuner  # noqa: F401
  File "/opt/miniconda3/envs/smenv2/lib/python3.9/site-packages/sagemaker/estimator.py", line 27, in <module>
    from sagemaker import git_utils, image_uris, vpc_utils
  File "/opt/miniconda3/envs/smenv2/lib/python3.9/site-packages/sagemaker/image_uris.py", line 24, in <module>
    from sagemaker.spark import defaults
  File "/opt/miniconda3/envs/smenv2/lib/python3.9/site-packages/sagemaker/spark/__init__.py", line 16, in <module>
    from sagemaker.spark.processing import PySparkProcessor, SparkJarProcessor  # noqa: F401
  File "/opt/miniconda3/envs/smenv2/lib/python3.9/site-packages/sagemaker/spark/processing.py", line 35, in <module>
    from sagemaker.local.image import _ecr_login_if_needed, _pull_image
  File "/opt/miniconda3/envs/smenv2/lib/python3.9/site-packages/sagemaker/local/__init__.py", line 16, in <module>
    from .local_session import (  # noqa: F401
  File "/opt/miniconda3/envs/smenv2/lib/python3.9/site-packages/sagemaker/local/local_session.py", line 23, in <module>
    from sagemaker.local.image import _SageMakerContainer
  File "/opt/miniconda3/envs/smenv2/lib/python3.9/site-packages/sagemaker/local/image.py", line 38, in <module>
    import sagemaker.local.data
  File "/opt/miniconda3/envs/smenv2/lib/python3.9/site-packages/sagemaker/local/data.py", line 26, in <module>
    import sagemaker.amazon.common
  File "/opt/miniconda3/envs/smenv2/lib/python3.9/site-packages/sagemaker/amazon/common.py", line 23, in <module>
    from sagemaker.amazon.record_pb2 import Record
  File "/opt/miniconda3/envs/smenv2/lib/python3.9/site-packages/sagemaker/amazon/record_pb2.py", line 36, in <module>
    _descriptor.FieldDescriptor(
  File "/opt/miniconda3/envs/smenv2/lib/python3.9/site-packages/google/protobuf/descriptor.py", line 560, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates

Issue 2

  • Downgraded protobuf to 3.19.4.
  • The code described above now runs, but it uploads the source_dir contents at definition time instead of when the pipeline is run, i.e. in my example the code above uploads pipeline/training/training.py to <<default-sagemaker-bucket>>/fd_training_step-f9c929e2d15bf5836d4776825737d89b/source/sourcedir.tar.gz before a pipeline execution is ever created.
  • If I change the line
param_training_output_s3_uri = ParameterString(
    name="training_output_s3_uri"
)

to

param_training_output_s3_uri = ParameterString(
    name="training_output_s3_uri",
    default_value="s3://my-bucket/my-path/"
)

it ignores this path and uploads to the regular default location anyway (again, in my case it tries to upload to <<default-sagemaker-bucket>>/fd_training_step-f9c929e2d15bf5836d4776825737d89b/source/sourcedir.tar.gz).


lkev commented Jun 1, 2022

Potentially related: #3149


jerrypeng7773 commented Jun 1, 2022

@lkev if you do have a default value, you can just hard-code it and pass it to the code_location parameter of the estimator; then your sources will be uploaded to that particular S3 location. The reason is that code uploading happens at compile time, but the pipeline parameter won't be evaluated until run time (pipeline execution).
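A minimal sketch of that workaround (assuming an SDK build where ParameterString is accepted for output_path, e.g. the PR branch; the bucket and paths below are placeholders, not from this report):

from sagemaker.estimator import Estimator
from sagemaker.workflow.parameters import ParameterString

# The pipeline parameter still controls output_path, resolved at run time.
param_training_output_s3_uri = ParameterString(
    name="training_output_s3_uri",
    default_value="s3://my-bucket/my-path/",  # placeholder
)

est = Estimator(
    role="test",
    image_uri="<training-image-uri>",  # placeholder
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    entry_point="training.py",
    source_dir="pipeline/training",
    output_path=param_training_output_s3_uri,
    # Hard-coded S3 prefix where sourcedir.tar.gz is staged; this is used at
    # compile time, when the pipeline definition is built.
    code_location="s3://my-bucket/my-path/code",
)

The training output still goes to whatever training_output_s3_uri resolves to at execution time; only the source-code staging location is fixed up front.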
