Unable to deploy huggingface-llm 1.3.3 #4332
Comments
I believe @philschmid mentioned that we're waiting for this PR to be accepted: #4314
@cfregly I don't think we're waiting for the same version: @philschmid seemed to be waiting for 1.3.1, while I'd like to see 1.3.3. But I may look into this PR to see if I can update the SDK myself if anyone answers here :)
@philschmid The main idea is to be able to run Mistral 0.2 models. For now, all supported versions throw the issue described in the huggingface repository. Looking at the comments, this should be fixed by this PR, which is included in the latest released version, 1.3.3.
Confirmed that 1.3.1 (SageMaker Python SDK 2.200.1) still throws the same error.
Noting that the reason we are not able to fetch a v1.3.3 image through the SDK is that there is no actual DLC release for v1.3.3. It is not a bug in the SDK, from what I have read so far.
So what should be the way to go here?
We will release 1.3.3 through DLC with an ETA of tomorrow. The associated SDK change can be tracked through #4335. However, you can also just reference the image URIs specifically to avoid waiting for an SDK release. Available tags and sample image URI can be found here: https://github.com/aws/deep-learning-containers/releases?q=tgi&expanded=true.
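A minimal sketch of the workaround described above: pin the DLC image URI directly instead of going through the SDK's version lookup. The registry account `763104351884` is AWS's public DLC account; the region and the exact tag below are assumptions to confirm against the deep-learning-containers release notes linked above.

```python
# Hedged sketch: build the TGI 1.3.3 DLC image URI by hand so no SDK
# release is needed. Region and tag are assumptions -- verify them
# against the deep-learning-containers release page before use.
region = "us-east-1"  # assumed region
tag = "2.1.1-tgi1.3.3-gpu-py310-cu121-ubuntu20.04"  # assumed tag format
image_uri = (
    f"763104351884.dkr.ecr.{region}.amazonaws.com/"
    f"huggingface-pytorch-tgi-inference:{tag}"
)
print(image_uri)
```

The resulting string can then be passed as `image_uri` to the SDK's model class, bypassing the supported-version check entirely.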
@amzn-choeric Is it possible to release 1.3.4 through DLC as well? 1.3.4 has a fix for this issue which will allow some Mistral models (with flash attention v2) to run on instances with non-Ampere GPUs.
I believe HuggingFace has requested that we hold until they are able to merge the fixes in for huggingface/text-generation-inference#1334.
Hi! A fix has been applied for huggingface/text-generation-inference#1334. cc @philschmid
I believe that would need to be included in a new release version as deemed appropriate, after which we can discuss next steps with HuggingFace. With regards to the issue at hand, though, the SDK change has been merged, so I am closing the issue.
Describe the bug
I'd like to deploy the Mistral 0.2 LLM on SageMaker, which seems to require huggingface-llm version 1.3.3. For now, huggingface-llm is limited to a set of versions that does not include 1.3.3.
To reproduce
Run the following code:
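The original snippet was not preserved in this thread; the sketch below is a hedged reconstruction of the reproduction based on the issue title and description. The model ID, role, and instance type are illustrative placeholders, and the code requires the SageMaker Python SDK plus AWS credentials. On SDK releases up to 2.200.1, the version lookup is expected to fail because "1.3.3" is missing from the SDK's supported-version table.

```python
# Hedged reconstruction of the repro (original snippet not preserved).
# Placeholders: role, model ID, and instance type are illustrative only.
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

# Expected to raise a ValueError on SDK <= 2.200.1: "1.3.3" is not in
# the SDK's supported huggingface-llm version table.
image_uri = get_huggingface_llm_image_uri("huggingface", version="1.3.3")

model = HuggingFaceModel(
    image_uri=image_uri,
    role="<your-sagemaker-execution-role>",  # placeholder
    env={"HF_MODEL_ID": "mistralai/Mistral-7B-Instruct-v0.2"},  # illustrative
)
predictor = model.deploy(initial_instance_count=1, instance_type="ml.g5.2xlarge")
```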
Expected behavior
Being able to deploy the huggingface llm version 1.3.3.
Screenshots or logs
System information
A description of your system. Please provide:
Additional context
If it's a quick fix I could probably help for the PR if needed.