502 bad gateway? #1485
Comments
Actually, the 502 error was already there when running predictor.predict(test) before the deploy. But my model performed well on my own machine, and it was saved exactly the same way as in https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/
Hi @Xixiong-Guo, if you are following the example from https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/, it's likely that you've used the wrong class. For framework versions 1.11 and above, we've split the TensorFlow container into training and serving. For deploying the model, please use this class instead: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/serving.py#L121
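For reference, a minimal sketch of deploying pre-trained artifacts with that serving Model class; the S3 path, framework version, and instance type below are placeholders, not values from this thread:

```python
import sagemaker
from sagemaker.tensorflow.serving import Model

# Placeholder values -- substitute your own artifact location, role and versions.
role = sagemaker.get_execution_role()

model = Model(
    model_data='s3://my-bucket/path/to/model.tar.gz',  # hypothetical S3 location
    role=role,
    framework_version='1.12',  # a version >= 1.11, where training and serving are split
)

predictor = model.deploy(initial_instance_count=1, instance_type='ml.m5.large')
```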
Hi @ChuyangDeng, thanks for your reply. I did encounter this problem, and afterwards I found this reference (https://sagemaker.readthedocs.io/en/stable/using_tf.html#deploying-directly-from-model-artifacts). I've since changed to the serving Model class, so I guess this should no longer be the reason for the 502 error? Thanks!
Hi @Xixiong-Guo, how did you tar the model? When you tar your model, please make sure the top-level directory inside the archive is the model version number (not the model name), e.g.:

$ ls -al
00000123   # version number (not model name)
Hi @ChuyangDeng, my tar.gz looks like: […] You mean the directory should look like: […]? Thanks.
Yes, SageMaker expects the model to be extracted directly under the "/opt/ml/<model_name>/" directory inside the container. The sagemaker-tensorflow-serving container will look for the model version directly under "<model_name>/". So your tar structure should be:

model.tar.gz
  1/
    saved_model.pb
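If it helps, here is a small sketch of building such an archive with Python's tarfile module. The local path export/Servo/1 is an assumption borrowed from the blog post's convention; adjust it to wherever your SavedModel was written:

```python
import tarfile

# Assumption: the SavedModel lives locally under export/Servo/1
# (i.e. saved_model.pb and variables/ are inside that folder).
with tarfile.open('model.tar.gz', 'w:gz') as tar:
    # arcname='1' puts the numeric version directory at the top level of the
    # archive, so it extracts as 1/saved_model.pb, 1/variables/..., which is
    # the layout the serving container expects.
    tar.add('export/Servo/1', arcname='1')
```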
I've been having the same issue after following the same examples. I've also checked my TAR and used the serving model. The CloudWatch log, from the moment I invoke the endpoint until it goes back to the regular pinging, is: […]
For me this ended up being an issue with the shape of the input. I was uploading an individual sample, but the endpoint expects a batch, so I needed to make my input one layer deeper (as described here). Could this be happening for you as well, @Xixiong-Guo?
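A small sketch of what "one layer deeper" means here, assuming a single preprocessed sample in a NumPy array (the values are illustrative, and predictor is the deployed endpoint from earlier in the thread):

```python
import numpy as np

sample = np.array([3, 17, 42, 0, 0])  # one padded input row (illustrative values)

# Sending the bare sample can fail if the model's signature expects a batch:
# predictor.predict(sample.tolist())

# Wrapping it in an outer list turns it into a batch of size one:
result = predictor.predict([sample.tolist()])
```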
Hi @Sbrikky, did you encounter the same 502 issue?
@Xixiong-Guo Yes, I had the exact same error in my notebook as the one you posted, so I didn't bother posting it again. This suggested that maybe there was something wrong with the shape of the request. Why on earth it ends up being surfaced as a 502, I have no clue.
Hi @Sbrikky, got it. In your case, was there any difference in the error info when you tried predict(input) versus predict([input.tolist()])?
When I use predict([input.tolist()]) it works and I get a prediction back. No 502.
Hi @Xixiong-Guo, it looks like you are using csv_serializer. Note here (https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/predictor.py#L325) that the serializer will serialize your input row by row, with values delimited by ",", if you pass a Python list: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/predictor.py#L363
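To illustrate the row-by-row behaviour, this is roughly what the v1 csv_serializer produces for a flat list versus a list of lists (the outputs in the comments are my expectation, not taken from this thread):

```python
from sagemaker.predictor import csv_serializer

# A flat Python list is serialized as a single CSV row (one sample):
csv_serializer([1, 2, 3])            # -> "1,2,3"

# A list of lists is serialized row by row, one CSV line per inner list:
csv_serializer([[1, 2, 3],
                [4, 5, 6]])          # -> "1,2,3\n4,5,6"
```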
Hi all, I am running into the same thing: 502 Bad Gateway (nginx/1.16.1).
For me this ended up being an issue with the directory structure of the saved model. Your tar structure should be as described above, with the numeric version directory at the top level of the archive.
This issue was moved to a discussion. You can continue the conversation there.

Original issue description:
Following https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/, I tried to save two different models (a sentiment-analysis model and a simple regression model) trained with TensorFlow + Keras and upload them to SageMaker, but encountered the same 502 error in both cases, which is seldom reported here or on Stack Overflow. Any thoughts?
Body_review = ','.join([str(val) for val in padded_pred]).encode('utf-8')
response = runtime.invoke_endpoint(EndpointName=predictor.endpoint,
                                   ContentType='text/csv',
                                   Body=Body_review)
An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (502) from model with message:
502 Bad Gateway
nginx/1.16.1
I searched CloudWatch and found the following:
2020/05/10 15:53:27 [error] 35#35: *187 connect() failed (111: Connection refused) while connecting to upstream, client: 10.32.0.1, server: , request: "POST /invocations HTTP/1.1", subrequest: "/v1/models/export:predict", upstream: "http://127.0.0.1:27001/v1/models/export:predict", host: "model.aws.local:8080"
I also tried another regression model (trained outside SageMaker, saved and uploaded to S3 and SageMaker, following https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/).
Still the same issue when using the predictor:
from sagemaker.predictor import csv_serializer
predictor.content_type = 'text/csv'
predictor.serializer = csv_serializer
Y_pred = predictor.predict(test.tolist())
Error:
---------------------------------------------------------------------------
ModelError                                Traceback (most recent call last)
<ipython-input> in <module>()
      4 predictor.serializer = csv_serializer
      5
----> 6 Y_pred = predictor.predict(test)

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model)
    108
    109         request_args = self._create_request_args(data, initial_args, target_model)
--> 110         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
    111         return self._handle_response(response)
    112

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    314                     "%s() only accepts keyword arguments." % py_operation_name)
    315                 # The "self" in this scope is referring to the BaseClient.
--> 316                 return self._make_api_call(operation_name, kwargs)
    317
    318             _api_call.__name__ = str(py_operation_name)

~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    624             error_code = parsed_response.get("Error", {}).get("Code")
    625             error_class = self.exceptions.from_code(error_code)
--> 626             raise error_class(parsed_response, operation_name)
    627         else:
    628             return parsed_response

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (502) from model with message "
502 Bad Gateway
nginx/1.16.1
".