TensorFlowProcessor tries to run python script using /bin/bash as its entrypoint #4028


Closed

svpino opened this issue Jul 26, 2023 · 2 comments

svpino commented Jul 26, 2023

SageMaker Python SDK version: 2.173.0

I have a SageMaker Pipeline with one Processing Step that evaluates a model. The Python script that does the evaluation requires TensorFlow, so I'm using the TensorFlowProcessor class. Here is how I'm creating an instance of this class:

tensorflow_processor = TensorFlowProcessor(
    base_job_name="evaluation-processor",
    framework_version="2.6",
    py_version="py38",
    instance_type="ml.m5.large",
    instance_count=1,
    role=role
)

Here is the definition of the ProcessingStep that uses the processor above:

step = ProcessingStep(
    name="evaluate-model",
    processor=tensorflow_processor,
    inputs=[
        ...
    ],
    outputs=[
        ...
    ],
    code="evaluation.py"
)

Notice that I'm specifying a Python script (evaluation.py) and telling the processor that the Python version should be 3.8. Despite this, the Processing Job tries to execute the script using /bin/bash evaluation.py instead of python3 evaluation.py.
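For anyone trying to understand the failure mode, here is a minimal reproduction outside SageMaker of why the symptom matters: handing a Python source file to /bin/bash fails, while python3 runs it fine. The file contents and names below are illustrative only, not taken from the SDK.

```python
# Illustrative only: show that bash cannot execute Python source,
# which is why the job fails when the entrypoint is /bin/bash.
import subprocess
import tempfile

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write('metrics = {"accuracy": 0.9}\nprint(metrics["accuracy"])\n')
    path = f.name

bash = subprocess.run(["/bin/bash", path], capture_output=True, text=True)
py = subprocess.run(["python3", path], capture_output=True, text=True)

print(bash.returncode != 0)  # bash chokes on the Python syntax
print(py.stdout.strip())
```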

Workaround

I can set the framework_entrypoint_command attribute directly and that solves the problem:

tensorflow_processor.framework_entrypoint_command = ["python3"]
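To make the effect of the workaround concrete, here is a hypothetical sketch (not the SDK's internal code) of how the container command changes depending on whether framework_entrypoint_command is set. The build_command helper is invented for illustration.

```python
# Hypothetical helper, not part of the SageMaker SDK: shows how the
# final container command differs with and without the workaround.
def build_command(framework_entrypoint_command, script):
    # Behavior observed in this issue when the attribute is unset:
    # the script is handed directly to bash.
    if not framework_entrypoint_command:
        return ["/bin/bash", script]
    return framework_entrypoint_command + [script]

print(build_command(None, "evaluation.py"))         # the buggy command
print(build_command(["python3"], "evaluation.py"))  # after the workaround
```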
athewsey (Collaborator) commented Jul 28, 2023

I haven't dug deep into this issue, but I'm dropping by with an FYI, because a misunderstanding on a previous FrameworkProcessor + Pipelines issue (when FrameworkProcessor first launched) caused some delay and re-work:

The FrameworkProcessor delivers estimator-like source_dir functionality by creating a shell script to extract your source bundle and install requirements.txt if present. This shell script is transparently added as an additional input to your Processing Job, so the actual ProcessingJob entrypoint is shell, but then it invokes your "real" script with Python.
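The two-stage launch described above can be sketched in plain Python, assuming the generated shell script roughly does "extract the source bundle, install requirements.txt if present, then invoke the real script with Python". The function and file names here are illustrative, not the SDK's actual implementation:

```python
# Illustrative sketch of the two-stage entrypoint FrameworkProcessor
# sets up: a wrapper unpacks the bundle, optionally installs
# requirements, then runs the "real" script with Python (not bash).
import os
import subprocess
import tarfile
import tempfile

def run_like_framework_entrypoint(bundle_path, script_name, workdir):
    # Stage 1: what the injected shell entrypoint does first -
    # unpack the source bundle into the working directory.
    with tarfile.open(bundle_path) as tar:
        tar.extractall(workdir)
    req = os.path.join(workdir, "requirements.txt")
    if os.path.exists(req):
        subprocess.run(["pip", "install", "-r", req], check=True)
    # Stage 2: the "real" script is invoked with Python, not bash.
    out = subprocess.run(
        ["python3", os.path.join(workdir, script_name)],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

# Build a tiny source bundle to demonstrate the flow end to end.
src = tempfile.mkdtemp()
with open(os.path.join(src, "evaluation.py"), "w") as f:
    f.write('print("evaluated")\n')
bundle = os.path.join(src, "sourcedir.tar.gz")
with tarfile.open(bundle, "w:gz") as tar:
    tar.add(os.path.join(src, "evaluation.py"), arcname="evaluation.py")

print(run_like_framework_entrypoint(bundle, "evaluation.py", tempfile.mkdtemp()))
```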

I believe FrameworkProcessor behaviour might be working correctly with the new PipelineSession-based syntax but broken with the old processor= syntax, which it looks like you're using.

Could you try restructuring your pipeline definition code to something like:

from sagemaker.workflow.pipeline_context import PipelineSession

pipeline_session = PipelineSession()

tensorflow_processor = TensorFlowProcessor(
    ...,
    sagemaker_session=pipeline_session,
)

step = ProcessingStep(
    name="evaluate-model",
    step_args=tensorflow_processor.run(
        inputs=[...],
        outputs=[...],
        code="evaluation.py",
        ...
    )
)

The new syntax reduces the amount of code change required to toggle between manually creating jobs and building pipelines (it's the same .run(...) call, just used with a pipeline session). I believe that when FrameworkProcessor is used the old way, it might not correctly create and insert the shell script entrypoint.

svpino (Author) commented Jul 28, 2023

Hey @athewsey, yes, that seems to be the problem. Everything works fine when using the new PipelineSession-based syntax.
