Deploying Pytorch models with elastic inference #1360

Closed
thomas-beznik opened this issue Mar 16, 2020 · 9 comments

Comments

@thomas-beznik

thomas-beznik commented Mar 16, 2020

Hello,

I am trying to deploy a PyTorch model on SageMaker using Elastic Inference, and I am having trouble finding the information I need in the documentation.

On this page, https://sagemaker.readthedocs.io/en/stable/using_pytorch.html#deploy-pytorch-models, it says that "if you are using PyTorch Elastic Inference, you do not have to provide a model_fn since the PyTorch serving container has a default one for you". Do we have to use this default model_fn, or can we use our own? And do we have to use a TorchScript model or not?

It would also be great to have a full example of how to deploy a PyTorch model trained outside of AWS.

Thanks!

@chuyang-deng
Contributor

Hi @thomas-beznik, thanks for using SageMaker. We are currently working on the PyTorch EI documentation, and more detailed examples are coming soon!

In the meantime, to answer your questions:

  1. You do not "have to" use the default functions; you can use your own inference script.
  2. In your custom inference script, to trigger the accelerator you have to use a TorchScript model. You will need to save your model with torch.jit.save instead of saving it as a state dictionary; also, in the predict_fn of your implementation, run the model inside torch.jit.optimized_execution (see the sketch below).
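For reference, a minimal sketch of such a custom inference script, assuming the model archive contains a TorchScript file saved as model.pt (the file name and the tensor conversion are illustrative assumptions, not the container's defaults):

```python
# inference.py -- sketch of a custom entry point for the PyTorch serving container
import os

import torch


def model_fn(model_dir):
    # Load the TorchScript model that was saved with torch.jit.save.
    # "model.pt" is an assumed file name inside model.tar.gz.
    model = torch.jit.load(os.path.join(model_dir, "model.pt"))
    model.eval()
    return model


def predict_fn(input_data, model):
    # Run the model inside torch.jit.optimized_execution; the EI-enabled
    # PyTorch build uses this to hand the graph to the attached accelerator
    # (check the AWS docs for any extra arguments your build expects).
    with torch.no_grad():
        with torch.jit.optimized_execution(True):
            return model(torch.as_tensor(input_data))
```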

@thomas-beznik
Author

Hello,

Thank you for your help! I am now running into another issue. I have already trained my model outside of AWS and have a TorchScript version of it. I would like to deploy it to an instance with Elastic Inference, so I am using a PyTorchModel. When I deploy it, I get the following error:

ValueError: pytorch-serving is not supported with Amazon Elastic Inference. Currently only Python-based TensorFlow and MXNet are supported.

But in the tutorial, they are able to deploy with Elastic Inference; they do it using the PyTorch estimator class. The problem is that my model is already trained, so I can't use this class. My questions are the following:

  • Is there a way to deploy a PyTorchModel with the accelerator?
  • Or is there a way to convert my model from a PyTorchModel to a PyTorch estimator?

Thank you for the help!

@chuyang-deng
Contributor

Hi @thomas-beznik, are you using the latest version of the Python SDK? Our current error message looks like this:

"{} is not supported with Amazon Elastic Inference. Currently only "

Please make sure you are using version 1.51.0 or above to use PyTorch EIA.
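For reference, attaching an accelerator to an already-trained model goes through PyTorchModel plus the accelerator_type argument of deploy. A rough sketch, in which the S3 path, role ARN, entry-point script, and version/instance names are placeholders, not values from this issue:

```python
from sagemaker.pytorch import PyTorchModel

# All names below are placeholders -- substitute your own bucket, role, and script.
pytorch_model = PyTorchModel(
    model_data="s3://my-bucket/model.tar.gz",  # archive containing the TorchScript model
    role="arn:aws:iam::123456789012:role/MySageMakerRole",
    entry_point="inference.py",
    framework_version="1.3.1",                 # an EI-compatible PyTorch version
)

predictor = pytorch_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    accelerator_type="ml.eia2.medium",         # this is what attaches Elastic Inference
)
```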

@thomas-beznik
Author

thomas-beznik commented Mar 19, 2020

Thank you @ChuyangDeng for the help so far; I was able to deploy my model!

But the journey is not over yet... I am now running into problems when trying to perform inference on the deployed model: when running predictor.predict(input) (where input is a numpy array), I get the following error:

ConnectionClosedError: Connection was closed before we received a valid response from endpoint URL: "https://runtime.sagemaker.eu-west-1.amazonaws.com/endpoints/pytorch-inference-.../invocations

But when I look at the logs of my endpoint, I can't see any error there...

  • Do you know what could be the cause of this error?
  • Are there tutorials, or do you have any advice, for debugging this sort of situation and getting a better view of what is happening inside the endpoint? I have used logging.info inside the entry-point script of the endpoint, but I can't see anything written to the log file.

Thank you very much for your help!

Best,
Thomas

@thomas-beznik
Author

thomas-beznik commented Mar 19, 2020

Ah, it seems to be a size problem: when using a numpy array of size (1, 1, 96, 96, 96) I get the above error, but when I use an array of size (1, 1, 10, 10, 10) that error goes away and I get a different one:

java.lang.IllegalArgumentException: reasonPhrase contains one of the following prohibited characters: \r\n: Cannot initialize CUDA without ATen_cuda library. PyTorch splits its backend into two shared libraries: a CPU library and a CUDA library; this error has occurred because you are trying to use some CUDA functionality, but the CUDA library has not been loaded by the dynamic linker for some reason. The CUDA library MUST be loaded, EVEN IF you don't directly use any symbols from the CUDA library! One common culprit is a lack of -Wl,--no-as-needed in your link arguments; many dynamic linkers will delete dynamic library dependencies if you don't depend on any of their symbols. You can check if this has occurred by using ldd on your binary to see if there is a dependency on *_cuda.so library. (initCUDA at /opt/conda/conda-bld/pytorch_1573049306851/work/aten/src/ATen/detail/CUDAHooksInterface.h:63)

I believe this error occurs in the model_fn method, when the model is loaded with torch.jit.load.

Thanks!
Thomas

@laurenyu
Contributor

CUDA requires a GPU, but EI works only with CPU instances. I've forwarded this to the team that owns PyTorch + EI to see if they have any insight. Thanks for your patience!

@dfan
Contributor

dfan commented Mar 20, 2020

It looks like your model was saved while it was on a CUDA device, so you'll need to provide an implementation of model_fn that loads it to CPU: torch.jit.load(model, map_location=torch.device('cpu')). This may be something we should clarify in the docs; I'll consult with the team.

The model is first loaded on the host instance, which has a CPU-only version of the framework. The model is then sent over the network to the accelerator server, which has a GPU context enabled, and your model tensors are moved to CUDA at that time. So your model inference happens with CUDA, but your model needs to be loaded to CPU initially.
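A minimal sketch of a model_fn along those lines (the model file name is again an assumed placeholder):

```python
import os

import torch


def model_fn(model_dir):
    # Deserialize the TorchScript archive onto the CPU even though it was
    # saved from a CUDA context; the EI runtime moves tensors to the
    # accelerator later.
    model = torch.jit.load(
        os.path.join(model_dir, "model.pt"),
        map_location=torch.device("cpu"),
    )
    model.eval()
    return model
```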

@ThomasBeznik

Yes, indeed, this solved it! But I am now getting a new error when running the prediction. Any help on that would be great!

@laurenyu
Contributor

Glad to hear we're making progress. Going to close this issue and continue the conversation in #1370.
