Improve error logging when invoking custom handler methods #164
Issue #163
Description of changes:
Improve debuggability of model load and inference failures caused by custom handler method implementations.
This is done by logging the exception traceback in addition to sending it in the response back to the client. Although the traceback is included in the response body, the client may sometimes fail to load the entire response body, for example:
botocore.errorfactory.ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from primary and could not load the entire response body. See https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logEventViewer:group=/aws/sagemaker/Endpoints/sagemaker-pytorch-serving-**********-**** in account ************ for more information.
Testing:
Using a custom handler with expected error:
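The actual handler used for testing is not shown here; a minimal sketch of a custom handler that fails deliberately (consistent with the `predict_fn` / `assert False` frames in the traceback below, but with the module layout and `model_fn` body assumed) could look like:

```python
# custom_inference.py (hypothetical sketch): a custom handler whose
# predict_fn fails on purpose to exercise the improved error logging.

def model_fn(model_dir):
    # Load and return the model; stubbed out for this sketch.
    return object()


def predict_fn(data, model):
    # Deliberate failure so the toolkit logs the exception traceback.
    assert False
```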
On deploying the model and making an inference request, the CloudWatch logs contain the following log line:
2024-03-15T00:54:26,721 [INFO ] W-9000-model_1.0-stdout MODEL_LOG - Transform failed for model: model. Error traceback: ['Traceback (most recent call last):', ' File "/sagemaker-pytorch-inference-toolkit/src/sagemaker_inference/transformer.py", line 150, in transform', ' result = self._run_handler_function(', ' File "/sagemaker-pytorch-inference-toolkit/src/sagemaker_inference/transformer.py", line 284, in _run_handler_function', ' result = func(*argv_context)', ' File "/sagemaker-pytorch-inference-toolkit/src/sagemaker_inference/transformer.py", line 268, in _default_transform_fn', ' prediction = self._run_handler_function(self._predict_fn, *(data, model))', ' File "/sagemaker-pytorch-inference-toolkit/src/sagemaker_inference/transformer.py", line 280, in _run_handler_function', ' result = func(*argv)', ' File "/opt/ml/model/code/custom_inference.py", line 52, in predict_fn', ' assert False', 'AssertionError']
Note that the traceback is printed as a list of strings instead of a single multi-line string, because a multi-line log message can get interleaved with other log statements, breaking up the exception traceback.