Trying to update PATH or PYTHONPATH env vars for a PyTorchModel endpoint results in UnexpectedStatusException #1857

Closed
setu4993 opened this issue Aug 26, 2020 · 1 comment

Comments

@setu4993

Describe the bug
Creating a PyTorchModel-based endpoint whose model definition tries to update the PATH or PYTHONPATH environment variables results in an UnexpectedStatusException.

To reproduce

Create any PyTorchModel object and pass either env={"PATH": "/opt/ml/model/code/lib:${PATH}"} or env={"PYTHONPATH": "/opt/ml/model/code/lib:${PYTHONPATH}"} when defining the model, then deploy it.
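
A minimal sketch for spotting this pattern before deploying. It assumes (not confirmed anywhere in this thread) that values in the env dict reach the container as literal strings, so a reference like ${PYTHONPATH} is never expanded by a shell. The helper name unexpanded_refs is hypothetical and is not part of the SageMaker SDK.

```python
import re

# Matches shell-style references such as ${PATH} or $PATH.
_VAR_REF = re.compile(r"\$(?:\{[A-Za-z_][A-Za-z0-9_]*\}|[A-Za-z_][A-Za-z0-9_]*)")


def unexpanded_refs(env):
    """Return {key: [references]} for env values containing shell-style references.

    Hypothetical helper: it only inspects the dict and does not call SageMaker.
    """
    found = {}
    for key, value in env.items():
        refs = _VAR_REF.findall(value)
        if refs:
            found[key] = refs
    return found


env = {"PYTHONPATH": "/opt/ml/model/code/lib:${PYTHONPATH}"}
print(unexpanded_refs(env))
```

An empty result means the dict contains no such references, which is the form the fix later in this thread ends up using.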

Expected behavior

The model should be created and endpoint deployed successfully.

Screenshots or logs
Error traceback:
------------------
UnexpectedStatusException                 Traceback (most recent call last)
<ipython-input-15-a29e7afc3eb6> in <module>
----> 1 pt_model.deploy(initial_instance_count=1, instance_type="ml.t2.medium", endpoint_name="coach-notes-test-2020-08-25")

/opt/conda/lib/python3.6/site-packages/sagemaker/model.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config)
    508             kms_key=kms_key,
    509             wait=wait,
--> 510             data_capture_config_dict=data_capture_config_dict,
    511         )
    512 

/opt/conda/lib/python3.6/site-packages/sagemaker/session.py in endpoint_from_production_variants(self, name, production_variants, tags, kms_key, wait, data_capture_config_dict)
   3095 
   3096             self.sagemaker_client.create_endpoint_config(**config_options)
-> 3097         return self.create_endpoint(endpoint_name=name, config_name=name, tags=tags, wait=wait)
   3098 
   3099     def expand_role(self, role):

/opt/conda/lib/python3.6/site-packages/sagemaker/session.py in create_endpoint(self, endpoint_name, config_name, tags, wait)
   2615         )
   2616         if wait:
-> 2617             self.wait_for_endpoint(endpoint_name)
   2618         return endpoint_name
   2619 

/opt/conda/lib/python3.6/site-packages/sagemaker/session.py in wait_for_endpoint(self, endpoint, poll)
   2884                 ),
   2885                 allowed_statuses=["InService"],
-> 2886                 actual_status=status,
   2887             )
   2888         return desc

UnexpectedStatusException: Error hosting endpoint coach-notes-test-2020-08-25: Failed. Reason: Request to service failed. If failure persists after retry, contact customer support..

System information

  • SageMaker Python SDK version: 2.5.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): PyTorch
  • Framework version: 1.5.0
  • Python version: 3.6
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): No

Additional context

Updating PATH is required because of the bug mentioned in #1832.

@setu4993
Author

I was able to resolve this by setting PYTHONPATH to the new directory only, without referencing the existing PYTHONPATH value. The env var definition then becomes: env={"PYTHONPATH": "/opt/ml/model/code/lib"}.

The container build step then takes care of adding the other required entries, so the final PYTHONPATH in the container is: '/.sagemaker/mms/models/model/code:sagemaker_pytorch_serving_container.handler_service:/opt/ml/model/code:/opt/ml/model/code/lib:/.sagemaker/mms/models/model'.

Ideally, this shouldn't have been breaking, but I'm fine with closing this.
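
For anyone who wants to confirm the same result, here is a rough sketch of checking that the custom directory ended up in the final PYTHONPATH. The string is the one reported above; the helper name pythonpath_entries is hypothetical, and the ":" separator assumes a Linux container.

```python
def pythonpath_entries(value):
    """Split a PYTHONPATH-style string into its non-empty entries.

    Assumes the Linux path separator ":"; hypothetical helper for illustration.
    """
    return [entry for entry in value.split(":") if entry]


# Final PYTHONPATH as reported in this thread.
final_pythonpath = (
    "/.sagemaker/mms/models/model/code:"
    "sagemaker_pytorch_serving_container.handler_service:"
    "/opt/ml/model/code:/opt/ml/model/code/lib:/.sagemaker/mms/models/model"
)

entries = pythonpath_entries(final_pythonpath)
print("/opt/ml/model/code/lib" in entries)
```

Inside a running container, the same check could be applied to os.environ.get("PYTHONPATH", "") instead of the hard-coded string.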
