Improve error logging when invoking custom handler methods #164

Merged
merged 5 commits from improve-error-logging into aws:master on Mar 15, 2024

Conversation

@namannandan (Contributor) commented Mar 13, 2024

Issue #163

Description of changes:
Improve debuggability of model load and inference failures caused by custom handler method implementations.
This is done by logging the exception traceback in addition to sending the traceback in the response back to the client. Although the traceback is sent back to the client in the response body, the client may sometimes fail to load the entire response body, for example:
botocore.errorfactory.ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from primary and could not load the entire response body. See https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logEventViewer:group=/aws/sagemaker/Endpoints/sagemaker-pytorch-serving-**********-**** in account ************ for more information.
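
The pattern is roughly the following sketch (not the toolkit's actual code; run_handler_safely and handler_fn are hypothetical names used only for illustration): the handler invocation is wrapped so that the traceback is logged server-side and the exception is re-raised so the traceback still reaches the client response.

import logging
import traceback

logger = logging.getLogger(__name__)

def run_handler_safely(handler_fn, *args):
    try:
        return handler_fn(*args)
    except Exception:
        # Format the traceback as a list of strings so it is emitted as a
        # single log record (see the note under Testing below).
        logger.error("Transform failed. Error traceback: %s", traceback.format_exc().splitlines())
        # Re-raise so the error and its traceback are still returned to the
        # client in the response body.
        raise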

Testing:
Using a custom handler with an expected error:

.....
.....
def predict_fn(input_data, model_pack):

    print("predict_fn got input Data: {}".format(input_data))
    model = model_pack[0]
    tokenizer = model_pack[1]
    mapping_file_path = model_pack[2]

    with open(mapping_file_path) as f:
        mapping = json.load(f)

    # Intentionally fail here to trigger the improved error logging
    assert False

    inputs = tokenizer.encode_plus(
        input_data,
        max_length=128,
        pad_to_max_length=True,
        add_special_tokens=True,
        return_tensors="pt",
    )
.....
.....

On deploying the model and making an inference request, the CloudWatch logs contain the following log line:
2024-03-15T00:54:26,721 [INFO ] W-9000-model_1.0-stdout MODEL_LOG - Transform failed for model: model. Error traceback: ['Traceback (most recent call last):', ' File "/sagemaker-pytorch-inference-toolkit/src/sagemaker_inference/transformer.py", line 150, in transform', ' result = self._run_handler_function(', ' File "/sagemaker-pytorch-inference-toolkit/src/sagemaker_inference/transformer.py", line 284, in _run_handler_function', ' result = func(*argv_context)', ' File "/sagemaker-pytorch-inference-toolkit/src/sagemaker_inference/transformer.py", line 268, in _default_transform_fn', ' prediction = self._run_handler_function(self._predict_fn, *(data, model))', ' File "/sagemaker-pytorch-inference-toolkit/src/sagemaker_inference/transformer.py", line 280, in _run_handler_function', ' result = func(*argv)', ' File "/opt/ml/model/code/custom_inference.py", line 52, in predict_fn', ' assert False', 'AssertionError']

Note that the traceback is logged as a list of strings rather than a multi-line string, because a multi-line string can cause other log statements to get interleaved with the exception traceback. A brief illustration of the difference is shown below.
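
As a minimal illustration, assuming only the standard logging and traceback modules:

import logging
import traceback

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

try:
    assert False
except AssertionError:
    # Multi-line string: other log statements can end up interleaved
    # between its lines in aggregated logs such as CloudWatch.
    logger.error("Error traceback: %s", traceback.format_exc())
    # List of strings: the whole traceback is emitted as one log record.
    logger.error("Error traceback: %s", traceback.format_exc().splitlines())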

@namannandan requested review from nikhil-sk and lxning on Mar 14, 2024 18:26
@namannandan force-pushed the improve-error-logging branch from 10375dc to 88d7eb5 on Mar 15, 2024 00:44
nikhil-sk previously approved these changes Mar 15, 2024
@namannandan merged commit d49082e into aws:master Mar 15, 2024