
Specifying image_uri in PyTorchModel gives TypeError when running deploy #2202


Closed

lc-billyfung opened this issue Mar 10, 2021 · 13 comments

@lc-billyfung

Describe the bug
When creating a PyTorchModel with a specified image_uri and deploying it to an endpoint, the model object has the attribute self.framework_version=None. The check in _is_mms_version then fails because it runs a regex search on None instead of a string or bytes-like object.
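
In essence, the failing check reduces to the following (a minimal sketch based on the traceback below; only the packaging library is needed to see the error):

import packaging.version

# self.framework_version is None when only image_uri is given, so the
# version regex inside packaging receives None instead of a string:
packaging.version.Version(None)  # TypeError: expected string or bytes-like object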

To reproduce

from sagemaker.pytorch import PyTorchModel
from sagemaker.utils import name_from_base

model = PyTorchModel(model_data=model_artifact,
                     name=name_from_base('model'),
                     role=role,
                     entry_point="torchserve-predictor.py",
                     image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:1.7.1-cpu-py36-ubuntu18.04",
                     )

predictor = model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge', endpoint_name=endpoint_name)

Expected behavior
I expect the behavior to be the same as when providing framework_version and py_version when creating a PyTorchModel.
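
For comparison, a sketch of the variant I mean, with framework_version and py_version matching the system information below and the other arguments as above:

model = PyTorchModel(model_data=model_artifact,
                     name=name_from_base('model'),
                     role=role,
                     entry_point="torchserve-predictor.py",
                     framework_version="1.7.1",
                     py_version="py36",
                     )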

Screenshots or logs

~/.pyenv/versions/lib/python3.6/site-packages/sagemaker/model.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, **kwargs)
    740                 self._base_name = "-".join((self._base_name, compiled_model_suffix))
    741 
--> 742         self._create_sagemaker_model(instance_type, accelerator_type, tags)
    743         production_variant = sagemaker.production_variant(
    744             self.name, instance_type, initial_instance_count, accelerator_type=accelerator_type

~/.pyenv/versions/lib/python3.6/site-packages/sagemaker/model.py in _create_sagemaker_model(self, instance_type, accelerator_type, tags)
    306                 /api/latest/reference/services/sagemaker.html#SageMaker.Client.add_tags
    307         """
--> 308         container_def = self.prepare_container_def(instance_type, accelerator_type=accelerator_type)
    309 
    310         self._ensure_base_name_if_needed(container_def["Image"])

~/.pyenv/versions/lib/python3.6/site-packages/sagemaker/pytorch/model.py in prepare_container_def(self, instance_type, accelerator_type)
    237 
    238         deploy_key_prefix = model_code_key_prefix(self.key_prefix, self.name, deploy_image)
--> 239         self._upload_code(deploy_key_prefix, repack=self._is_mms_version())
    240         deploy_env = dict(self.env)
    241         deploy_env.update(self._framework_env_vars())

~/.pyenv/versions/lib/python3.6/site-packages/sagemaker/pytorch/model.py in _is_mms_version(self)
    282         """
    283         lowest_mms_version = packaging.version.Version(self._LOWEST_MMS_VERSION)
--> 284         framework_version = packaging.version.Version(self.framework_version)
    285         return framework_version >= lowest_mms_version

~/.pyenv/versions/lib/python3.6/site-packages/packaging/version.py in __init__(self, version)
    294 
    295         # Validate the version and parse it into pieces
--> 296         match = self._regex.search(version)
    297         if not match:
    298             raise InvalidVersion("Invalid version: '{0}'".format(version))

TypeError: expected string or bytes-like object

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.29.1
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): Pytorch
  • Framework version: 1.7.1
  • Python version: 3.6.12
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N

Thanks

@purplexed

I was able to replicate the bug with the following system information:

SageMaker Python SDK version: 2.41.0
Framework name (eg. PyTorch) or algorithm (eg. KMeans): Pytorch
Framework version: 1.7.1
Python version: 3.6.12
CPU or GPU: CPU
Custom Docker image (Y/N): N

Also reported it to AWS support on the 20th of May.

@johann-petrak

Affects me as well; the workaround seems to be to just provide a dummy version, but an annoying bug all the same.

@zorrofox

Same for me!

@oborchers

Same for me. For the HuggingFace predictor it actually works, but it doesn't use the image I built and falls back to the default one instead...

@oborchers

Update: after figuring out how to work with the repository for SageMaker images (https://github.com/aws/deep-learning-containers), I was able to fix my problems, which were solely about the HuggingFaceModel not being able to load or run custom images.

@AliNGatGeeks

dummy version, but an annoying bug all the same

Hi, do you have an example of your workaround?

@johann-petrak

Hi, do you have an example of your workaround?

As far as I remember, I just added the parameter framework_version="1.8.1".
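
Applied to the call from the issue description, that would look roughly like this (image_uri and the other arguments unchanged; "1.8.1" is just a dummy value to get past the version check):

model = PyTorchModel(model_data=model_artifact,
                     name=name_from_base('model'),
                     role=role,
                     entry_point="torchserve-predictor.py",
                     image_uri="763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:1.7.1-cpu-py36-ubuntu18.04",
                     framework_version="1.8.1",  # dummy version, only to satisfy _is_mms_version()
                     )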

I can't believe that this issue is still open. The way AWS issues get ignored by Amazon developers is rather disappointing.

@AliNGatGeeks

As far as I remember, I just added the parameter framework_version="1.8.1".

Thanks, this seems to work for me as well.
I hope they fix it soon 😄

@oborchers

I hope they fix it soon 😄

Me looking at my inbox and laughing frenetically: No.

@Michael-Bar

Does your framework_version="1.8.1" solution definitely call the image from image_uri rather than fetching a different image via the framework_version arg?

@eunseoada

eunseoada commented Nov 3, 2023

@Michael-Bar
I have the same question. Did you solve this problem?

Does your framework_version="1.8.1" solution definitely call the image from image_uri rather than fetching a different image via the framework_version arg?

@jjerphan
Collaborator

Hi all,

#3188 has partially addressed the problem.

Still, some ambiguity remains about how a Model is specified when py_version, framework_version, and image_uri are all passed.

@martinRenou
Collaborator

Closing as fixed by #3188
