Unable to deploy huggingface-llm 1.3.3 #4332

Closed
LvffY opened this issue Dec 17, 2023 · 12 comments

Comments

LvffY commented Dec 17, 2023

Describe the bug

I'd like to deploy the Mistral 0.2 LLM on SageMaker, and it seems that this requires version 1.3.3 of the Hugging Face LLM container. For now, huggingface-llm is limited to a set of versions that does not include 1.3.3.

To reproduce

Run the following code:

#!/usr/bin/env python3
import json
import re

import boto3
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    # Use the execution role when running inside SageMaker
    role = sagemaker.get_execution_role()
except ValueError:
    # Outside SageMaker, fall back to an explicit IAM role
    iam = boto3.client("iam")
    role = iam.get_role(RoleName="exec-role")["Role"]["Arn"]

# Hub Model configuration. https://huggingface.co/models
hub = {
    "HF_MODEL_ID": "mistralai/Mistral-7B-Instruct-v0.2",
    "SM_NUM_GPUS": json.dumps(1),
    # "HF_MODEL_QUANTIZE": "gptq",
    # 'HF_TASK':'question-answering',
    # Enable to have long input length, and override default sagemaker values
    # See https://github.com/facebookresearch/llama/issues/450#issuecomment-1645247796
    "MAX_INPUT_LENGTH": json.dumps(4095),
    "MAX_TOTAL_TOKENS": json.dumps(4096),
}

# Ensure endpoint name will be compliant for AWS
regex = r"[^\-a-zA-Z0-9]+"

compliant_name = re.sub(regex, "-", hub["HF_MODEL_ID"])

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    # Here we'd like to have at least 1.3.3
    # See https://github.com/huggingface/text-generation-inference/issues/1342
    image_uri=get_huggingface_llm_image_uri("huggingface", version="1.3.3"),
    env=hub,
    role=role,
    name=compliant_name,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
    endpoint_name=compliant_name,
)
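
For completeness, a minimal usage sketch of querying the endpoint once the deployment succeeds; the prompt and generation parameters below are illustrative, and the payload shape follows TGI's generate API:

# Illustrative sketch: query the endpoint through the predictor returned
# by deploy(). The prompt and parameters are example values only.
response = predictor.predict(
    {
        "inputs": "What is the capital of France?",
        "parameters": {"max_new_tokens": 64},
    }
)
print(response)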

Expected behavior

Being able to deploy the Hugging Face LLM container, version 1.3.3.

Screenshots or logs

[screenshot of the deployment error]

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.200.1
  • Framework name (e.g. PyTorch) or algorithm (e.g. KMeans): huggingface-llm (PyTorch TGI inference)
  • Framework version:
  • Python version: 3.10
  • CPU or GPU: GPU
  • Custom Docker image (Y/N): N

Additional context

If it's a quick fix, I could probably help with the PR if needed.

cfregly commented Dec 18, 2023

I believe @philschmid mentioned that we're waiting for this PR to be accepted: #4314

[screenshot]

LvffY commented Dec 18, 2023

@cfregly I don't think we're waiting for the same version: @philschmid seemed to be waiting for the 1.3.1 release, while I'd like to see 1.3.3.

But I may look into this PR to see if I can update the SDK myself if no one answers here :)

@philschmid (Contributor)

1.3.1 should be released in the SDK by now. #4314

@LvffY, can you share why you need 1.3.3? This could help us accelerate the release.

LvffY commented Dec 19, 2023

@philschmid The main idea is to be able to run Mistral 0.2 models. For now, all supported versions throw the issue described in the Hugging Face repository.

Looking at the comments, we see that this should be fixed by this PR, which is included in the latest released version, 1.3.3.

cfregly commented Dec 19, 2023

Confirmed that 1.3.1 (SageMaker Python SDK 2.200.1) still throws the same error.

@amzn-choeric (Contributor)

Noting that the reason we are not able to fetch a v1.3.3 image through the SDK is that there is no actual DLC release for v1.3.3. It is not a bug in the SDK, from what I have read so far.
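
One way to see which versions the installed SDK can actually fetch is to inspect its bundled image URI config; a minimal sketch, assuming the SDK ships that data as sagemaker/image_uri_config/huggingface-llm.json with an "inference" scope:

# Sketch: list the huggingface-llm (TGI) container versions that the
# installed SDK release knows about, by reading its bundled config file.
# Assumption: the data lives in sagemaker/image_uri_config/huggingface-llm.json.
import json
import os

import sagemaker

config_path = os.path.join(
    os.path.dirname(sagemaker.__file__), "image_uri_config", "huggingface-llm.json"
)
with open(config_path) as f:
    config = json.load(f)

print(sorted(config["inference"]["versions"]))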

LvffY commented Dec 19, 2023

> Noting that the reason we are not able to fetch a v1.3.3 image through the SDK is that there is no actual DLC release for v1.3.3. It is not a bug in the SDK, from what I have read so far.

So what would be the way to go here?

@amzn-choeric (Contributor)

We will release 1.3.3 through DLC with an ETA of tomorrow.

The associated SDK change can be tracked through #4335. However, you can also just reference the image URIs specifically to avoid waiting for an SDK release. Available tags and sample image URI can be found here: https://github.com/aws/deep-learning-containers/releases?q=tgi&expanded=true.
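
A minimal sketch of that workaround, reusing hub, role, and compliant_name from the repro script; the region and tag below are illustrative only, so take the exact image URI from the DLC release notes linked above:

from sagemaker.huggingface import HuggingFaceModel

# Illustrative pinned URI: 763104351884 is the AWS DLC registry account, but
# the region and the exact 1.3.3 tag must come from the DLC release notes.
image_uri = (
    "763104351884.dkr.ecr.us-east-1.amazonaws.com/"
    "huggingface-pytorch-tgi-inference:2.1.1-tgi1.3.3-gpu-py310-cu121-ubuntu20.04"
)

# Passing image_uri directly bypasses get_huggingface_llm_image_uri() and
# its version allow-list entirely.
huggingface_model = HuggingFaceModel(
    image_uri=image_uri,
    env=hub,
    role=role,
    name=compliant_name,
)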

@pangyiwei

@amzn-choeric Is it possible to release 1.3.4 through DLC as well?

1.3.4 has a fix for this issue, which would allow some Mistral models (with flash attention v2) to run on instances with non-Ampere GPUs.

@amzn-choeric (Contributor)

I believe HuggingFace has requested that we hold until they are able to merge the fixes in for huggingface/text-generation-inference#1334.

@Michellehbn

Hi! A fix has been applied for huggingface/text-generation-inference#1334. cc @philschmid

@amzn-choeric (Contributor)

I believe that fix would need to be included in a new release version as deemed appropriate, after which we can discuss next steps with HuggingFace.

With regard to the issue at hand, though, it does look like the SDK change has been merged. Thus, closing the issue.
