Deploying PyTorch models with Elastic Inference #1360
Hi @thomas-beznik, thanks for using SageMaker. We are currently working on the PyTorch EI documentation and more detailed examples are coming soon! In the meantime, to answer your question:
Hello, thank you for your help! I am now running into another issue. I have already trained my model outside of AWS and have a TorchScript version of it. I would like to deploy it to an instance with Elastic Inference, so I am using a `PyTorchModel`. When I deploy it I get the following error:

`ValueError: pytorch-serving is not supported with Amazon Elastic Inference. Currently only Python-based TensorFlow and MXNet are supported.`

But in the tutorial they are able to deploy with Elastic Inference, using the `PyTorch` estimator class. The problem is that my model is already trained, so I can't use this class. My questions are the following:

Thank you for the help!
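For reference, a minimal sketch of the kind of deployment call being described above, assuming an already-trained TorchScript model; the S3 path, entry point, role, and framework version are placeholders, not values from this thread:

```python
# Hypothetical sketch of deploying a pre-trained TorchScript model with an
# Elastic Inference accelerator; all names and versions are placeholders.
import sagemaker
from sagemaker.pytorch import PyTorchModel

role = sagemaker.get_execution_role()

model = PyTorchModel(
    model_data="s3://my-bucket/model.tar.gz",  # assumed S3 location of the model archive
    role=role,
    entry_point="inference.py",                # assumed inference script
    framework_version="1.3.1",                 # assumed EIA-enabled PyTorch version
)

# accelerator_type attaches an EI accelerator to a CPU instance type
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.c5.large",
    accelerator_type="ml.eia2.medium",
)
```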
Hi @thomas-beznik, are you using the latest version of the Python SDK? Our current error message looks like this:
Please make sure you are using version 1.51.0 or above to use PyTorch EIA.
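A quick way to check the installed SDK version (shown as a general illustration, not a command from this thread):

```python
# Check the installed SageMaker Python SDK version; upgrade with
# `pip install -U "sagemaker>=1.51.0"` if it is older than 1.51.0.
import sagemaker
print(sagemaker.__version__)
```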
Thank you @ChuyangDeng for the help so far, I was able to deploy my model! But the journey is not over yet: I am now running into problems when trying to perform inference on the deployed model. When running the prediction command it fails with an error, but when looking in the logs of my endpoint I can't see any error there...

Thank you very much for your help! Best,
Ah, it seems to be a size problem: with a numpy array of size (1, 1, 96, 96, 96) I get the above error, but with an array of size (1, 1, 10, 10, 10) that error goes away and a different one appears instead:

I believe that this error occurs in the

Thanks!
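For context, a rough sketch of the prediction call in question; the array shapes come from the comment above, while `predictor` and the dtype are assumptions:

```python
import numpy as np

# A float32 array of shape (1, 1, 96, 96, 96) is roughly 3.4 MB when serialized,
# whereas (1, 1, 10, 10, 10) is only about 4 KB, which is consistent with the
# size-dependent behaviour described above.
small_input = np.zeros((1, 1, 10, 10, 10), dtype=np.float32)
large_input = np.zeros((1, 1, 96, 96, 96), dtype=np.float32)

result = predictor.predict(small_input)  # `predictor` is the endpoint deployed earlier
```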
CUDA requires a GPU, but EI works only with CPU instances. I've forwarded this on to the team that owns PyTorch + EI to see if they have any insight. Thanks for your patience!
It looks like your model was saved while it was on a CUDA device, so you'll need to provide an implementation of `model_fn` that loads it onto the CPU. The model is first loaded on the host instance, which has a CPU-only version of the framework. The model is then sent over the network to the accelerator server, which has a GPU context enabled, and we move your model tensors to CUDA at that time. So your model inference happens with CUDA, but your model needs to be loaded onto the CPU initially.
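A minimal sketch of such a `model_fn`, assuming the TorchScript model was saved as `model.pt` inside the model archive (the file name and entry-point name are assumptions):

```python
# inference.py -- hypothetical entry point; the model file name is an assumption
import os
import torch

def model_fn(model_dir):
    # The hosting instance has a CPU-only build of PyTorch, so load the
    # TorchScript model onto the CPU explicitly; tensors are moved to the
    # EI accelerator's CUDA context later, as described above.
    model_path = os.path.join(model_dir, "model.pt")
    model = torch.jit.load(model_path, map_location=torch.device("cpu"))
    return model.eval()
```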
Yes, indeed this solved it! But I am now getting a new error when running the prediction. Any help on that would be great!
Glad to hear we're making progress. Going to close this issue and continue the conversation on #1370.
Hello,
I am trying to deploy a PyTorch model on SageMaker using Elastic Inference, and I have trouble finding the information I want in the documentation.
On this page, https://sagemaker.readthedocs.io/en/stable/using_pytorch.html#deploy-pytorch-models, it is said that "if you are using PyTorch Elastic Inference, you do not have to provide a model_fn since the PyTorch serving container has a default one for you". Do we have to use this default model_fn, or can we use our own? Do we have to use a TorchScript model or not?
It would also be great to have a full example of how to deploy a PyTorch model trained outside of AWS.
Thanks!
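As a rough illustration of the workflow asked about here, a sketch of converting a locally trained model to TorchScript and packaging it for the SDK; the model class, weights file, example input shape, and file names are assumptions, not an official example:

```python
# Hypothetical packaging of a model trained outside AWS; names are placeholders.
import tarfile
import torch

model = MyModel()  # assumed: an already-trained nn.Module defined elsewhere
model.load_state_dict(torch.load("weights.pth", map_location="cpu"))
model.eval()

# Convert to TorchScript so the serving container can load the model without
# needing the original class definition.
traced = torch.jit.trace(model, torch.zeros(1, 1, 96, 96, 96))
traced.save("model.pt")

# SageMaker expects the artifacts packaged as a tar.gz archive (uploaded to S3 afterwards).
with tarfile.open("model.tar.gz", "w:gz") as archive:
    archive.add("model.pt")
```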