feature: Inferentia Neuron support for HuggingFace #2976
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository
Thank you @jeniyat, that's great. 🤗 I also did some testing and noticed a few things we should address.

It is possible to use `image_uris.retrieve` with an unsupported `py_version`:

```python
import sagemaker, boto3

sagemaker.image_uris.retrieve(
    framework="huggingface",
    region=boto3.Session().region_name,
    version="4.12.3",
    py_version="py38",
    base_framework_version="pytorch1.9.1",
    instance_type="neuron",
    image_scope="inference",
)
```

This returns `'763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference-neuron:1.9.1-transformers4.12.3-neuron-py38-sdk1.17.1-ubuntu18.04-v1.0'`, but that image doesn't exist. That's quite critical, since the current `py_version` for the default PyTorch image is `py38`, so it's very likely that customers who want to switch from regular instances to Inferentia will make this mistake and run into an error when deploying.

Would it make sense, for solving this as well as the comment about upgrading the Neuron SDK, to create a new `huggingface_neuron.json` with a similar structure and use it to create the image URI?
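For illustration, such a `huggingface_neuron.json` could mirror the layout of the existing `huggingface.json`. Everything below is an assumption about a possible schema (keys, values, and versions are illustrative, not a final file):

```json
{
  "inference": {
    "processors": ["inf"],
    "versions": {
      "4.12.3": {
        "version_aliases": {"pytorch1.9": "pytorch1.9.1"},
        "pytorch1.9.1": {
          "py_versions": ["py37"],
          "repository": "huggingface-pytorch-inference-neuron",
          "container_version": {"inf": "ubuntu18.04"},
          "sdk_versions": ["sdk1.17.1"],
          "registries": {"us-east-1": "763104351884"}
        }
      }
    }
  }
}
```

Keeping the Neuron variants in their own file would also let `retrieve` validate `py_versions` strictly for Neuron images without touching the vanilla HuggingFace config.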
In addition, I tested deploying with the `HuggingFaceModel`:

```python
from sagemaker.huggingface import HuggingFaceModel
import sagemaker
import boto3

sess = sagemaker.Session()
role = sagemaker.get_execution_role()

# Hub Model configuration: https://huggingface.co/models
# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    model_data="s3://hf-sagemaker-inference/inferentia/model.tar.gz",
    transformers_version="4.12",
    pytorch_version="1.9",
    py_version="py37",
    role=role,
    sagemaker_session=sess,
)

# Let SageMaker know that we've already compiled the model via neuron-cc
huggingface_model._is_compiled_model = True
```

Running `huggingface_model.prepare_container_def("ml.inf.xlarge")` returns the following error:

```
ValueError: Unsupported processor: inf. You may need to upgrade your SDK version (pip install -U sagemaker) for newer processors. Supported processor(s): gpu, cpu, neuron.
```

Running `predictor = huggingface_model.deploy(initial_instance_count=1, instance_type="ml.inf1.xlarge")` returns:

```
TypeError: expected string or bytes-like object
```

I guess this comes from `huggingface_model._is_compiled_model = True`, but I set it since the model was compiled outside of SageMaker. When I set it to `False`:

```python
huggingface_model._is_compiled_model = False
predictor = huggingface_model.deploy(initial_instance_count=1, instance_type="ml.inf1.xlarge")
```

I get the same error as for `huggingface_model.prepare_container_def("ml.inf.xlarge")`:

```
Your model is not compiled. Please compile your model before using Inferentia.
```
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-26-28b5559c63e4> in <module>
      1 huggingface_model._is_compiled_model = False
      2
----> 3 predictor = huggingface_model.deploy(initial_instance_count=1, instance_type="ml.inf1.xlarge")

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/model.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, **kwargs)
    971             self._base_name = "-".join((self._base_name, compiled_model_suffix))
    972
--> 973         self._create_sagemaker_model(instance_type, accelerator_type, tags)
    974
    975         serverless_inference_config_dict = (

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/model.py in _create_sagemaker_model(self, instance_type, accelerator_type, tags)
    516             /api/latest/reference/services/sagemaker.html#SageMaker.Client.add_tags
    517         """
--> 518         container_def = self.prepare_container_def(instance_type, accelerator_type=accelerator_type)
    519
    520         self._ensure_base_name_if_needed(container_def["Image"])

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/huggingface/model.py in prepare_container_def(self, instance_type, accelerator_type)
    268         region_name = self.sagemaker_session.boto_session.region_name
    269         deploy_image = self.serving_image_uri(
--> 270             region_name, instance_type, accelerator_type=accelerator_type
    271         )
    272

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/huggingface/model.py in serving_image_uri(self, region_name, instance_type, accelerator_type)
    311             accelerator_type=accelerator_type,
    312             image_scope="inference",
--> 313             base_framework_version=base_framework_version,
    314         )

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/image_uris.py in retrieve(framework, region, version, py_version, instance_type, accelerator_type, image_scope, container_version, distribution, base_framework_version, training_compiler_config, model_id, model_version, tolerate_vulnerable_model, tolerate_deprecated_model)
    151
    152     processor = _processor(
--> 153         instance_type, config.get("processors") or version_config.get("processors")
    154     )
    155

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/image_uris.py in _processor(instance_type, available_processors)
    338         )
    339
--> 340     _validate_arg(processor, available_processors, "processor")
    341     return processor
    342

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/image_uris.py in _validate_arg(arg, available_options, arg_name)
    388         "Unsupported {arg_name}: {arg}. You may need to upgrade your SDK version "
    389         "(pip install -U sagemaker) for newer {arg_name}s. Supported {arg_name}(s): "
--> 390         "{options}.".format(arg_name=arg_name, arg=arg, options=", ".join(available_options))
    391     )
    392

ValueError: Unsupported processor: inf. You may need to upgrade your SDK version (pip install -U sagemaker) for newer processors. Supported processor(s): gpu, cpu, neuron.
```
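The processor error above comes from the instance-type-to-processor mapping: `ml.inf1.*` resolves to processor `inf`, while the config only lists `neuron`. A hedged sketch of that kind of mapping (names and branches are illustrative, not the SDK's actual `_processor` code):

```python
# Illustrative sketch of how an instance type is mapped to a processor
# family and then validated against the processors listed in the config.
def processor_from_instance_type(instance_type, available_processors):
    family = instance_type.split(".")[1]  # "ml.inf1.xlarge" -> "inf1"
    if family.startswith("inf"):
        processor = "inf"
    elif family[0] in ("g", "p"):
        processor = "gpu"
    else:
        processor = "cpu"
    if processor not in available_processors:
        raise ValueError(
            "Unsupported processor: {}. Supported processor(s): {}.".format(
                processor, ", ".join(available_processors)
            )
        )
    return processor
```

With a config that lists `"neuron"` instead of `"inf"`, any `ml.inf1.*` instance type fails this validation, which matches the traceback above.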
```diff
@@ -913,7 +913,7 @@
     "us-west-2": "763104351884"
   },
   "repository": "huggingface-pytorch-inference",
-  "container_version": {"gpu": "cu111-ubuntu20.04", "cpu": "ubuntu20.04" }
+  "container_version": {"gpu": "cu111-ubuntu20.04", "cpu": "ubuntu20.04", "neuron": "sdk1.17.1-ubuntu18.04-v1.0" }
```
How would this work if we release a new DLC for Neuron SDK 1.18.0 for the same framework version?
If the SDK version is not passed as an input argument, then `retrieve` will pick the latest SDK version.
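For illustration, that "pick the latest" step could be sketched like this (a hedged stand-in for whatever helper the SDK ends up using, e.g. `_get_latest_versions`):

```python
def latest_sdk_version(sdk_versions):
    """Pick the highest Neuron SDK version from tags like 'sdk1.17.1'.

    Illustrative helper, not the SDK's actual implementation; assumes
    purely numeric dotted versions after the 'sdk' prefix."""
    def key(tag):
        # compare numerically so 'sdk1.18.0' sorts above 'sdk1.9.0',
        # which plain string comparison would get wrong
        return tuple(int(part) for part in tag[len("sdk"):].split("."))
    return max(sdk_versions, key=key)
```

Numeric comparison matters here: a lexical `max` would rank `sdk1.9.0` above `sdk1.17.1`.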
I would add a different JSON config file altogether for `huggingface-neuron` rather than bloating the vanilla `huggingface` one.
I tried to test the updates again using the `HuggingFaceModel`, which now produces the following error:

```
ValueError: Unsupported Python version: py37. You may need to upgrade your SDK version (pip install -U sagemaker) for newer Python versions. Supported Python version(s): py38.
```
"us-west-2": "763104351884" | ||
}, | ||
"container_version": {"neuron": "ubuntu18.04"}, | ||
"repo_versions": ["v1.0"], |
Does this refer to the `-v1.0` in `763104351884.dkr.ecr.us-west-2.amazonaws.com/huggingface-pytorch-inference-neuron:1.9.1-transformers4.12.3-neuron-py37-sdk1.17.1-ubuntu18.04-v1.0`? If so, we should remove it. The image has multiple available tags, see: https://github.com/aws/deep-learning-containers/releases/tag/v1.0-hf-4.12.3-pt-1.9.1-inf-neuron-sdk1.17.1-py37. One of them is `1.9.1-transformers4.12.3-neuron-py37-sdk1.17.1-ubuntu18.04` (just without the `v1.0`).

The `v1.0` indicates the version of the container, and the assimov team might update the container to apply security patches, in which case we would need to update this as well. It's better practice to always use the latest version through the `1.9.1-transformers4.12.3-neuron-py37-sdk1.17.1-ubuntu18.04` tag and remove `repo_version`.
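The suggestion above amounts to dropping the trailing repo-version suffix from the tag. A minimal sketch of that transformation (illustrative helper, not part of the SageMaker SDK):

```python
import re

def strip_repo_version(tag):
    """Drop a trailing '-v<digits>[.<digits>...]' repo-version suffix from an
    image tag, so deployments track the latest patched build of that tag.
    Illustrative helper, not part of the SageMaker SDK."""
    return re.sub(r"-v\d+(\.\d+)*$", "", tag)
```

Tags without a `-v…` suffix pass through unchanged, so the helper is safe to apply to both tag styles.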
src/sagemaker/image_uris.py (Outdated)

```diff
@@ -47,6 +46,8 @@ def retrieve(
     model_version=None,
     tolerate_vulnerable_model=False,
     tolerate_deprecated_model=False,
+    sdk_version=None,
+    repo_version=None,
```
This can then be removed.
src/sagemaker/image_uris.py (Outdated)

```python
if not repo_version:
    repo_version = _get_latest_versions(version_config["repo_versions"])
container_version = sdk_version + "-" + container_version + "-" + repo_version
```
same here
While doing some more tests I noticed that you need to provide the `sagemaker_session`, otherwise you'll get:

```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-b74825f4782f> in <module>
     14 predictor = huggingface_model.deploy(
     15     initial_instance_count=1, # number of instances
---> 16     instance_type="ml.inf1.xlarge" # AWS Inferentia Instance
     17 )
     18

~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/huggingface/model.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, **kwargs)
    265         if instance_type.startswith("ml.inf") and not self.image_uri:
    266             self.image_uri = self.serving_image_uri(
--> 267                 region_name=self.sagemaker_session.boto_session.region_name,
    268                 instance_type=instance_type,
    269             )

AttributeError: 'NoneType' object has no attribute 'boto_session'
```

even though, according to the documentation, it should be optional.
Handled this case in the latest revision.
Issue #, if available:

Description of changes:
Added the necessary changes to incorporate the inf-neuron/HF image into the HuggingFace framework.

Testing done:

Merge Checklist
Put an `x` in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

Tests
- Used `unique_name_from_base` to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.