S3DataSource not working with HuggingFaceModel #4248

Closed
philschmid opened this issue Nov 7, 2023 · 16 comments · Fixed by #4276

@philschmid
Contributor

Describe the bug
Using S3DataSource dict as model_data is not working

To reproduce

from sagemaker.huggingface.model import HuggingFaceModel

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   model_data={'S3DataSource':{'S3Uri': "s3://mybucket",'S3DataType': 'S3Prefix','CompressionType': 'None'}},
   role=role,                      # iam role with permissions to create an Endpoint
   transformers_version="4.34.1",  # transformers version used
   pytorch_version="1.13.1",       # pytorch version used
   py_version='py310',              # python version used
   model_server_workers=1,         # number of workers for the model server
)

# Let SageMaker know that we've already compiled the model
huggingface_model._is_compiled_model = True

# deploy the endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,      # number of instances
    instance_type="ml.inf2.xlarge", # AWS Inferentia Instance
    volume_size = 100
)

Expected behavior
Deploys my model

Screenshots or logs

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/home/ubuntu/huggingface-inferentia2-samples/stable-diffusion-xl/sagemaker-notebook.ipynb Cell 17 line 5
      2 huggingface_model._is_compiled_model = True
      4 # deploy the endpoint endpoint
----> 5 predictor = huggingface_model.deploy(
      6     initial_instance_count=1,      # number of instances
      7     instance_type="ml.inf2.xlarge", # AWS Inferentia Instance
      8     volume_size = 100
      9 )

File ~/miniconda3/envs/dev/lib/python3.9/site-packages/sagemaker/huggingface/model.py:311, in HuggingFaceModel.deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, volume_size, model_data_download_timeout, container_startup_health_check_timeout, inference_recommendation_id, explainer_config, **kwargs)
    304     inference_tool = "neuron" if instance_type.startswith("ml.inf1") else "neuronx"
    305     self.image_uri = self.serving_image_uri(
    306         region_name=self.sagemaker_session.boto_session.region_name,
    307         instance_type=instance_type,
    308         inference_tool=inference_tool,
    309     )
--> 311 return super(HuggingFaceModel, self).deploy(
    312     initial_instance_count,
    313     instance_type,
    314     serializer,
    315     deserializer,
    316     accelerator_type,
    317     endpoint_name,
    318     tags,
    319     kms_key,
    320     wait,
    321     data_capture_config,
    322     async_inference_config,
    323     serverless_inference_config,
    324     volume_size=volume_size,
    325     model_data_download_timeout=model_data_download_timeout,
    326     container_startup_health_check_timeout=container_startup_health_check_timeout,
    327     inference_recommendation_id=inference_recommendation_id,
    328     explainer_config=explainer_config,
    329 )

File ~/miniconda3/envs/dev/lib/python3.9/site-packages/sagemaker/model.py:1434, in Model.deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, volume_size, model_data_download_timeout, container_startup_health_check_timeout, inference_recommendation_id, explainer_config, **kwargs)
   1432 compiled_model_suffix = None if is_serverless else "-".join(instance_type.split(".")[:-1])
   1433 if self._is_compiled_model and not is_serverless:
-> 1434     self._ensure_base_name_if_needed(
   1435         image_uri=self.image_uri,
   1436         script_uri=self.source_dir,
   1437         model_uri=self.model_data,
   1438     )
   1439     if self._base_name is not None:
   1440         self._base_name = "-".join((self._base_name, compiled_model_suffix))

File ~/miniconda3/envs/dev/lib/python3.9/site-packages/sagemaker/model.py:888, in Model._ensure_base_name_if_needed(self, image_uri, script_uri, model_uri)
    881 """Create a base name from the image URI if there is no model name provided.
    882 
    883 If a JumpStart script or model uri is used, select the JumpStart base name.
    884 """
    885 if self.name is None:
    886     self._base_name = (
    887         self._base_name
--> 888         or get_jumpstart_base_name_if_jumpstart_model(script_uri, model_uri)
    889         or utils.base_name_from_image(image_uri, default_base_name=Model.__name__)
    890     )

File ~/miniconda3/envs/dev/lib/python3.9/site-packages/sagemaker/jumpstart/utils.py:338, in get_jumpstart_base_name_if_jumpstart_model(*uris)
    330 """Return default JumpStart base name if a URI belongs to JumpStart.
    331 
    332 If no URIs belong to JumpStart, return None.
   (...)
    335     *uris (Optional[str]): URI to test for association with JumpStart.
    336 """
    337 for uri in uris:
--> 338     if is_jumpstart_model_uri(uri):
    339         return constants.JUMPSTART_RESOURCE_BASE_NAME
    340 return None

File ~/miniconda3/envs/dev/lib/python3.9/site-packages/sagemaker/jumpstart/utils.py:250, in is_jumpstart_model_uri(uri)
    243 """Returns True if URI corresponds to a JumpStart-hosted model.
    244 
    245 Args:
    246     uri (Optional[str]): uri for inference/training job.
    247 """
    249 bucket = None
--> 250 if urlparse(uri).scheme == "s3":
    251     bucket, _ = parse_s3_url(uri)
    253 return bucket in constants.JUMPSTART_GATED_AND_PUBLIC_BUCKET_NAME_SET

File ~/miniconda3/envs/dev/lib/python3.9/urllib/parse.py:392, in urlparse(url, scheme, allow_fragments)
    372 def urlparse(url, scheme='', allow_fragments=True):
    373     """Parse a URL into 6 components:
    374     <scheme>://<netloc>/<path>;<params>?<query>#<fragment>
    375 
   (...)
    390     Note that % escapes are not expanded.
    391     """
--> 392     url, scheme, _coerce_result = _coerce_args(url, scheme)
    393     splitresult = urlsplit(url, scheme, allow_fragments)
    394     scheme, netloc, url, query, fragment = splitresult

File ~/miniconda3/envs/dev/lib/python3.9/urllib/parse.py:128, in _coerce_args(*args)
    126 if str_input:
    127     return args + (_noop,)
--> 128 return _decode_args(args) + (_encode_result,)

File ~/miniconda3/envs/dev/lib/python3.9/urllib/parse.py:112, in _decode_args(args, encoding, errors)
    110 def _decode_args(args, encoding=_implicit_encoding,
    111                        errors=_implicit_errors):
--> 112     return tuple(x.decode(encoding, errors) if x else '' for x in args)

File ~/miniconda3/envs/dev/lib/python3.9/urllib/parse.py:112, in <genexpr>(.0)
    110 def _decode_args(args, encoding=_implicit_encoding,
    111                        errors=_implicit_errors):
--> 112     return tuple(x.decode(encoding, errors) if x else '' for x in args)

AttributeError: 'dict' object has no attribute 'decode'
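
The traceback shows the model_data dict flowing from Model._ensure_base_name_if_needed into is_jumpstart_model_uri and then into urlparse, which only accepts str or bytes. A minimal standalone reproduction of the failing call, using only the standard library (the bucket name is a placeholder):

from urllib.parse import urlparse

model_data = {
    "S3DataSource": {
        "S3Uri": "s3://mybucket",  # placeholder bucket
        "S3DataType": "S3Prefix",
        "CompressionType": "None",
    }
}

# urlparse expects str or bytes; handing it the dict fails inside
# _decode_args with: AttributeError: 'dict' object has no attribute 'decode'
urlparse(model_data)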

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.197.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): pytorch
  • Framework version: 1.13.1
  • Python version: 3.10
  • CPU or GPU: Inf2
  • Custom Docker image (Y/N): N

@martinRenou
Collaborator

Thank you for opening an issue. Are you able to say whether this worked at some point? If so, could you tell us which version of the SDK that was?

@philschmid
Contributor Author

No, I cannot. AFAIK, using S3DataSource as model_data is quite new.

@martinRenou
Collaborator

I'm looking into this.

@martinRenou
Collaborator

I am wondering if you would reach the result you want with the following code:

from sagemaker.huggingface.model import HuggingFaceModel

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   model_data="s3://tmybuckaet",
   role=role,                      # iam role with permissions to create an Endpoint
   transformers_version="4.34.1",  # transformers version used
   pytorch_version="1.13.1",       # pytorch version used
   py_version='py310',              # python version used
   model_server_workers=1,         # number of workers for the model server
)

# Let SageMaker know that we've already compiled the model
huggingface_model._is_compiled_model = True

# deploy the endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,      # number of instances
    instance_type="ml.inf2.xlarge", # AWS Inferentia Instance
    volume_size = 100
)

In PR #4005, more flexibility was added to the model_data parameter of the estimator, but the same was not applied to the Model class. We may want to homogenize this and have model_data accept either a string (in that case, it would be the value for S3Uri) or a dictionary (in that case, the manually created S3DataSource dictionary).
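
For illustration, a minimal sketch of what that normalization could look like (a hypothetical helper, not the SDK's actual code; the string-means-S3Uri convention is just the proposal above):

from typing import Union

def normalize_model_data(model_data: Union[str, dict]) -> dict:
    """Accept either an S3 URI string or a prebuilt S3DataSource dict."""
    # Hypothetical helper for illustration; the real fix lives in PR #4276.
    if isinstance(model_data, str):
        # A bare string is taken as the S3Uri of an uncompressed prefix.
        return {
            "S3DataSource": {
                "S3Uri": model_data,
                "S3DataType": "S3Prefix",
                "CompressionType": "None",
            }
        }
    # Otherwise assume the caller built the S3DataSource dict by hand.
    return model_data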

Does that make sense?

@philschmid
Contributor Author

How does that work if the model was trained outside of SageMaker? Doesn't the PR look for training_job_spec["ModelArtifacts"]["S3ModelArtifacts"], which is not available in my case?

But yes, if I can provide an "s3uri" that points to a "directory", this would be perfect. I just want to avoid creating the model.tar.gz.

@philschmid
Contributor Author

I skimmed through the PR, and it seems that for "Model" you still need to provide the "dict" with the S3DataSource.

@martinRenou
Collaborator

Would you be able to locally test the changes from PR #4276?

The following should get the code:

pip install git+https://github.com/martinRenou/sagemaker-python-sdk.git@s3datasource

@philschmid
Contributor Author

Successfully tested your PR with

   model_data={'S3DataSource':{'S3Uri':"s3://sagemaker-us-east-2-558105141721/neuronx/embeddings/",'S3DataType': 'S3Prefix','CompressionType': 'None'}},

@martinRenou
Collaborator

Good! So you're confirming that it behaves correctly?

@philschmid
Contributor Author

Yeah, I deployed a model and ran inference.

@martinRenou
Collaborator

Nice, thanks for checking this.

@knikure linked a pull request Dec 8, 2023 that will close this issue
@tleyden

tleyden commented Dec 12, 2023

I'm also seeing an issue when trying to use an S3DataSource with an HF model:

ParamValidationError: Parameter validation failed:
Unknown parameter in PrimaryContainer: "ModelDataSource", must be one of: ContainerHostname, Image, ImageConfig, Mode, ModelDataUrl, Environment, ModelPackageName, InferenceSpecificationName, MultiModelConfig

Does this look like the same issue but with a slightly different error?

The code I'm using to load it is:

llm_model = HuggingFaceModel(
    role=role_arn,
    image_uri=llm_image,
    model_data={'S3DataSource':{'S3Uri': model_s3_path,'S3DataType': 'S3Prefix','CompressionType': 'None'}},
    env=config
)

where model_s3_path is a valid S3Uri.

Here's the full stack trace:
llama-fine-tune-error.log

@tleyden

tleyden commented Dec 12, 2023

I'm also seeing an issue when trying to use an S3DataSource with an HF model:

It turns out this only happens when running the code from within a Lambda function, so it seems unrelated to this issue.

@martinRenou
Collaborator

@tleyden thank you for commenting, would you be able to open a separate issue for this?

@tleyden

tleyden commented Dec 13, 2023

@martinRenou I found the cause: an older (ancient) version of boto was installed. I will open a separate issue, since the SDK should probably make the boto version requirement explicit.
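
For anyone hitting the same thing, a quick way to check which versions a runtime actually ships (standard __version__ attributes; the ParamValidationError above suggests the bundled botocore predated ModelDataSource support):

import boto3
import botocore
import sagemaker

# ModelDataSource is validated against botocore's service model, so a
# stale botocore rejects it; printing versions makes that easy to spot.
print("boto3:", boto3.__version__)
print("botocore:", botocore.__version__)
print("sagemaker:", sagemaker.__version__)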

@tleyden

tleyden commented Dec 13, 2023

@martinRenou #4321
