SageMaker Processing Job doesn't support FastFile Input Mode #4711

Open
joonb14 opened this issue May 31, 2024 · 3 comments
Labels
component: processing Relates to the SageMaker Processing Platform type: bug

Comments

@joonb14

joonb14 commented May 31, 2024

SageMaker Processing Job doesn't support FastFile Input Mode
This might affect only Step Functions, and the Python SDK might support FastFile mode. However, it doesn't make sense to me that one would provide the functionality and the other would not, so I am posting this issue.

According to this document, ProcessingInput accepts FastFile mode. However, if I try to create a Processing Job using Step Functions, the following error occurs:

"Error": "SageMaker.AmazonSageMakerException",
"Cause": "1 validation error detected: Value 'FastFile' at 'processingInputs.2.member.s3Input.s3InputMode' failed to satisfy constraint: Member must satisfy enum value set: [Pipe, File] (Service: AmazonSageMaker; Status Code: 400; Error Code: ValidationException; Request ID: 05f50214-59f0-4518-bd8e-36d800b078d0; Proxy: null)"

It only accepts Pipe and File.
This is related to the closed issue #3962.
The document was updated by PR #4311.
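Until the service accepts FastFile, the rejected input can be caught before the job is submitted. The following is a minimal sketch, not part of any AWS SDK: the function name `find_unsupported_input_modes` is hypothetical, and the allowed set is copied from the ValidationException quoted above rather than from an official schema. Positions are reported 1-based so they line up with the `processingInputs.2.member` path in the error.

```python
# Allowed values as reported by the ValidationException in this issue
# (an assumption taken from the error text, not from an official schema).
ALLOWED_S3_INPUT_MODES = {"Pipe", "File"}


def find_unsupported_input_modes(processing_inputs):
    """Return (position, input_name, mode) for each input that would be rejected.

    Positions are 1-based to match the 'processingInputs.N.member' path
    used in the service error message.
    """
    problems = []
    for position, item in enumerate(processing_inputs, start=1):
        mode = item.get("S3Input", {}).get("S3InputMode")
        if mode is not None and mode not in ALLOWED_S3_INPUT_MODES:
            problems.append((position, item.get("InputName"), mode))
    return problems


# Trimmed-down version of the three inputs from the configuration in this issue.
inputs = [
    {"InputName": "model", "S3Input": {"S3InputMode": "File"}},
    {"InputName": "dataset", "S3Input": {"S3InputMode": "FastFile"}},
    {"InputName": "manifest", "S3Input": {"S3InputMode": "File"}},
]

print(find_unsupported_input_modes(inputs))
# → [(2, 'dataset', 'FastFile')]
```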

Step Function Configuration

{
  "ProcessingResources": {
    "ClusterConfig": {
      "InstanceCount": 1,
      "InstanceType.$": "$.inferenceOpt.instanceType",
      "VolumeSizeInGB": 100
    }
  },
  "AppSpecification": {
    "ImageUri.$": "$.inferenceOpt.imageUri",
    "ContainerEntrypoint": [
      "python3",
      "/opt/ml/code/inference.py"
    ]
  },
  "Environment": {
    "FINAL_KERNEL_SIZE.$": "$.hyperparameters.FINAL_KERNEL_SIZE",
    "MODEL_ARCH.$": "$.hyperparameters.MODEL_ARCH",
    "LAST_LEVEL.$": "$.hyperparameters.LAST_LEVEL",
    "IMG_SIZE.$": "$.hyperparameters.IMG_SIZE",
    "NUM_CLASSES.$": "$.hyperparameters.NUM_CLASSES",
    "CLASS_DICT.$": "$.hyperparameters.CLASS_DICT",
    "PREDICT_DATASET_ID.$": "$.hyperparameters.DATASET_ID",
    "S3_PREFIX.$": "$.hyperparameters.SERVICE_NAME",
    "PREDICT_ID.$": "$.hyperparameters.PREDICT_ID",
    "MANIFEST_FILE.$": "$.inferenceOpt.manifestFile",
    "QUEUE_URL": "https://sqs.ap-northeast-2.amazonaws.com/405240163678/Prod-Proto-CvOpsApplication-ProtoInferenceMetricQueueProtoInference-SgfJZK5ae1TV",
    "REGION": "ap-northeast-2"
  },
  "NetworkConfig": {
    "EnableNetworkIsolation": false,
    "EnableInterContainerTrafficEncryption": true,
    "VpcConfig": {
      "SecurityGroupIds": [
        "sg-0aee4467bff25504b"
      ],
      "Subnets": [
        "subnet-06dc8a4e813910a07",
        "subnet-0b023fbbb320cf8d8",
        "subnet-0e3b92aad033b1993"
      ]
    }
  },
  "ProcessingInputs": [
    {
      "InputName": "model",
      "S3Input": {
        "S3Uri.$": "$.inferenceOpt.modelArtifact",
        "LocalPath": "/opt/ml/processing/model",
        "S3DataType": "S3Prefix",
        "S3InputMode": "File",
        "S3DataDistributionType": "FullyReplicated",
        "S3CompressionType": "None"
      }
    },
    {
      "InputName": "dataset",
      "S3Input": {
        "S3Uri.$": "$.inferenceOpt.datastoreName",
        "LocalPath": "/opt/ml/processing/manifest",
        "S3DataType": "S3Prefix",
        "S3InputMode": "FastFile",
        "S3DataDistributionType": "FullyReplicated",
        "S3CompressionType": "None"
      }
    },
    {
      "InputName": "manifest",
      "S3Input": {
        "S3Uri.$": "$.inferenceOpt.manifestFile",
        "LocalPath": "/opt/ml/processing/config",
        "S3DataType": "S3Prefix",
        "S3InputMode": "File",
        "S3DataDistributionType": "FullyReplicated",
        "S3CompressionType": "None"
      }
    }
  ],
  "RoleArn": "arn:aws:iam::405240163678:role/Prod-Proto-CvOpsApplicati-ProtoInferenceWorkflowPro-Kh3pe5GGN20F",
  "ProcessingJobName.$": "States.Format('cvops-inference-{}', $$.Execution.Name)",
  "Tags": [
    {
      "Key": "CVOps",
      "Value": "Proto-CvopsInferenceTask"
    }
  ]
}
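As a possible stopgap while FastFile is rejected, the parameters above could be patched to use File mode before they are passed to the state machine. This is only a sketch under assumptions: the helper `with_file_mode_fallback` is hypothetical, and it assumes File mode is acceptable for the job, which means the dataset is downloaded to the instance volume before the container starts instead of being streamed on demand.

```python
import copy


def with_file_mode_fallback(parameters, unsupported="FastFile", fallback="File"):
    """Return a copy of the job parameters with the unsupported mode replaced.

    The original dictionary is left untouched so the intended FastFile
    configuration can be restored once the service accepts it.
    """
    patched = copy.deepcopy(parameters)
    for item in patched.get("ProcessingInputs", []):
        s3_input = item.get("S3Input")
        if s3_input and s3_input.get("S3InputMode") == unsupported:
            s3_input["S3InputMode"] = fallback
    return patched


# Reduced form of the configuration above; only the relevant keys are kept.
parameters = {
    "ProcessingInputs": [
        {"InputName": "model", "S3Input": {"S3InputMode": "File"}},
        {"InputName": "dataset", "S3Input": {"S3InputMode": "FastFile"}},
    ]
}

patched = with_file_mode_fallback(parameters)
print([i["S3Input"]["S3InputMode"] for i in patched["ProcessingInputs"]])
```

Working on a deep copy keeps the original definition as the source of truth, so removing the fallback later is a one-line change.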

Expected behavior
Either accept FastFile mode, or update the documentation.

@C24IO C24IO added the component: processing Relates to the SageMaker Processing Platform label Jun 1, 2024
@giamic

giamic commented Aug 27, 2024

I can confirm that FastFile doesn't work in the Python API either.

@ianbenlolo

Any updates on this? It is March 2025, and with sagemaker==2.240.0 I still get:

ClientError: An error occurred (ValidationException) when calling the CreatePipeline operation: Unable to parse pipeline definition. Model Validation failed: Value 'FastFile' for 'ProcessingS3InputMode' failed to satisfy enum value set: [Pipe, File]

@nargokul nargokul added component: pipelines Relates to the SageMaker Pipeline Platform and removed component: processing Relates to the SageMaker Processing Platform labels Apr 14, 2025
@brockwade633
Contributor

brockwade633 commented Apr 17, 2025

This is an issue with the Processing job schema, not with Pipelines. Pipelines validates against Processing's model.

@brockwade633 brockwade633 added component: processing Relates to the SageMaker Processing Platform and removed component: pipelines Relates to the SageMaker Pipeline Platform labels Apr 17, 2025

7 participants