Server reruns same task multiple times #133

Open
kurtgdl opened this issue Dec 4, 2024 · 0 comments
kurtgdl commented Dec 4, 2024

I used

from sagemaker.huggingface import HuggingFaceModel

deploy = HuggingFaceModel(
  name=model_name,
  role=role,
  code_location="abc",
  model_data=path_to_s3,
  transformers_version="4.37",
  pytorch_version="2.1",
  py_version="py310",
  model_server_workers=1,
)
emb = deploy.deploy(
  endpoint_name=model_name,
  initial_instance_count=1,
  instance_type="ml.c5.4xlarge",
  container_startup_health_check_timeout=300,
)

The custom inference script was

def model_fn(model_dir):
    processor = DataProcess() # A class that contains logic for processing each file.
    return processor

def predict_fn(data, model):
    text = model.process_file(data)
    return {"output": text}
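For context, here is a minimal sketch of how the base64 payload could be decoded before it reaches predict_fn. This input_fn is my assumption, not part of the original script, and the "inputs" field name is hypothetical:

```python
import base64
import json

def input_fn(request_body, content_type="application/json"):
    # Parse the JSON request body and decode the base64 string
    # back into the raw file bytes that predict_fn will process.
    payload = json.loads(request_body)
    return base64.b64decode(payload["inputs"])
```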

The input data is a base64 string of a file content.
It's strange that when the file is pretty small, under 1 MB, the server runs model_fn and predict_fn once, and the process takes around 30 seconds. But when I input a large file of around 1.5 MB, the server runs model_fn and predict_fn multiple times, each run taking around 2 minutes. I know this because the same request produces multiple copies of

 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Preprocess time - 5.128383636474609 ms
 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Predict time - 162199.17178153992 ms
 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Postprocess time - 0.00762939453125 ms
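For reference, the request payload was built roughly like this. This is a hypothetical sketch of the client side; build_payload and the "inputs" field name are my assumptions, not from the original report:

```python
import base64
import json

def build_payload(file_bytes: bytes) -> str:
    # Encode the raw file bytes as a base64 string and wrap
    # them in a JSON body for the endpoint.
    encoded = base64.b64encode(file_bytes).decode("utf-8")
    return json.dumps({"inputs": encoded})

# The endpoint would then be invoked with something like (requires AWS
# credentials and a deployed endpoint, so it is commented out here):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint(
#     EndpointName=model_name,
#     ContentType="application/json",
#     Body=build_payload(open("input_file", "rb").read()),
# )
```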

It's probably unorthodox to use the server for a data-processing job, but which configs did I miss?

Related: aws/amazon-sagemaker-examples#1073
