SageMaker processing step not finding /opt/ml/processing/input/code/ #2909

Closed
calvin0112 opened this issue Feb 7, 2022 · 2 comments
Labels
component: pipelines Relates to the SageMaker Pipeline Platform type: bug

Comments

@calvin0112

calvin0112 commented Feb 7, 2022

Hi,

I'm using XGBoostProcessor from the SageMaker Python SDK for a ProcessingStep in my SageMaker pipeline. When running the pipeline from a Jupyter notebook in SageMaker Studio, I'm getting the following error:

    /opt/ml/processing/input/entrypoint/runproc.sh: line 3: cd: /opt/ml/processing/input/code/: No such file or directory
    tar (child): sourcedir.tar.gz: Cannot open: No such file or directory

This comes from the script runproc.sh, which is generated by XGBoostProcessor. The script tries to change into the directory "/opt/ml/processing/input/code/" to unpack the code for the processing job, but the directory doesn't exist. Here is the Python code for my pipeline:

    BASE_DIR = os.path.dirname(os.path.realpath(__file__))
    
    ...
    
        train_processor = XGBoostProcessor(
            framework_version="1.3-1",
            command=["python3"],
            instance_type=processing_instance_type,
            instance_count=1,
            base_job_name=f"{base_job_prefix}/script-sc-train",
            sagemaker_session=sagemaker_session,
            role=role
        )
    
        train_something_run_args = train_processor.get_run_args(
            code=os.path.join(BASE_DIR, "train_something.py"),
            source_dir=BASE_DIR,
            arguments=[
                '--input_table', SOMETHING_INPUT_TABLE,
                '--s3_storage_bucket', S3_STORAGE_BUCKET,
                '--model_file_path', S3_MODEL_PREFIX + f"/{SOMETHING_MODEL_NAME}_model.pkl"
            ]
        )
    
        step_train_something = ProcessingStep(
            name="TrainSomethingModel",
            processor=train_processor,
            code=train_something_run_args.code,
            job_arguments=train_something_run_args.arguments
        )

The script "train_something.py" is the code that I need to run for the processing step, and BASE_DIR is the directory with the dependencies.
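For context, the generated runproc.sh essentially changes into /opt/ml/processing/input/code/ and unpacks sourcedir.tar.gz there, so the error means that directory was never staged into the container. Below is a rough, self-contained Python equivalent of that unpack step: the container path /opt/ml/processing/input/code/ and the archive name sourcedir.tar.gz are the real SageMaker conventions, but the temp directories and file contents here are stand-ins for illustration only.

```python
import os
import tarfile
import tempfile

# Stand-ins for the real paths: in the job, code_dir would be
# /opt/ml/processing/input/code/ and src_dir would be the local source_dir.
code_dir = tempfile.mkdtemp()
src_dir = tempfile.mkdtemp()

with open(os.path.join(src_dir, "train_something.py"), "w") as f:
    f.write("print('training')\n")

# The SDK packs source_dir into sourcedir.tar.gz and uploads it; a
# ProcessingInput is supposed to stage it under code_dir in the container.
with tarfile.open(os.path.join(code_dir, "sourcedir.tar.gz"), "w:gz") as tar:
    tar.add(os.path.join(src_dir, "train_something.py"),
            arcname="train_something.py")

# runproc.sh then does, roughly:
#   cd /opt/ml/processing/input/code/ && tar -xzf sourcedir.tar.gz
# If code_dir was never staged, that cd fails with the error in this report.
os.chdir(code_dir)
with tarfile.open("sourcedir.tar.gz") as tar:
    tar.extractall()

print(os.path.exists("train_something.py"))  # the entrypoint is now unpacked
```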

I tried adding a ProcessingInput with "/opt/ml/processing/input/code" as the destination for the RunArgs, but it didn't help:

    train_something_run_args = train_processor.get_run_args(
        code=os.path.join(BASE_DIR, "train_something.py"),
        source_dir=BASE_DIR,
        inputs=[ProcessingInput(source=BASE_DIR, destination="/opt/ml/processing/input/code")],
        arguments=[
            '--input_table', SOMETHING_INPUT_TABLE,
            '--s3_storage_bucket', S3_STORAGE_BUCKET,
            '--model_file_path', S3_MODEL_PREFIX + f"/{SOMETHING_MODEL_NAME}_model.pkl"
        ]
    )
    
    step_train_something = ProcessingStep(
        name="TrainSomethingModel",
        processor=train_processor,
        code=train_something_run_args.code,
        inputs=train_something_run_args.inputs,
        job_arguments=train_something_run_args.arguments
    )

With the ProcessingInput, I'm still getting the same error. I've confirmed that the script runproc.sh and the code archive sourcedir.tar.gz are in the S3 bucket.

I would appreciate any help with this. I found an issue regarding the broken integration between FrameworkProcessor and ProcessingStep (#2656). Is it related?

Thanks,
C

@aoguo64 aoguo64 added the component: pipelines Relates to the SageMaker Pipeline Platform label Feb 17, 2022
@jerrypeng7773
Contributor

@calvin0112 this might be a bug in the processing job, based on issue #2656. I've already reached out to our internal team about it.

In the meantime, we've introduced a new way to construct the step; could you give it a shot and see if it works?

    from sagemaker.workflow.pipeline_context import PipelineSession

    session = PipelineSession()

    processor = XGBoostProcessor(..., sagemaker_session=session)

    step_args = processor.run(code=..., source_dir=..., arguments=...)

    step_process = ProcessingStep(
        name="MyProcessingStep",
        step_args=step_args,
    )

In summary, we introduced the PipelineSession. This special session does not trigger a processing job immediately when you call processor.run; instead, it captures the request arguments required to run the job and delegates them to the ProcessingStep, which starts the job later during pipeline execution.
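The capture-now, run-later behavior described above can be sketched without the SDK. Everything in this snippet (FakeProcessor, StepArgs, jobs_started) is a hypothetical stand-in, only illustrating how a pipeline-aware session defers the job instead of launching it at definition time:

```python
from dataclasses import dataclass, field

@dataclass
class StepArgs:
    """Captured request arguments for a deferred job (illustrative only)."""
    code: str
    arguments: list = field(default_factory=list)

class FakeProcessor:
    """Stand-in for a processor; pipeline_session=True mimics PipelineSession."""
    def __init__(self, pipeline_session: bool):
        self.pipeline_session = pipeline_session
        self.jobs_started = []

    def run(self, code, arguments=None):
        args = StepArgs(code=code, arguments=list(arguments or []))
        if self.pipeline_session:
            return args  # defer: the ProcessingStep starts the job later
        self.jobs_started.append(args)  # regular session: start the job now

processor = FakeProcessor(pipeline_session=True)
step_args = processor.run("train_something.py", ["--input_table", "my_table"])

print(processor.jobs_started)  # nothing launched at definition time
print(step_args.code)
```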

Let us know.

@jerrypeng7773
Contributor

Closing this issue for now; please re-open it to let us know if you have any other concerns.
