Pre-built Docker image does not exist for TensorFlow Frameworks 2+ #1406

Closed
keelerh opened this issue Apr 13, 2020 · 2 comments

keelerh commented Apr 13, 2020

Describe the bug
When following the sample notebook referred to in the "Deploy trained Keras or TensorFlow models using Amazon SageMaker" blog post and specifying framework_version as 2.1.0 when defining TensorFlowModel, I receive an UnexpectedStatusException saying that the Docker image does not exist.

To reproduce
Deploy a pre-trained TF model by following the steps in Deploy trained Keras or TensorFlow models using Amazon SageMaker.

At Step 5, there is a line specifying

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '1.12',
                                  entry_point = 'train.py')

I replace this with

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '2.1.0',
                                  entry_point = 'train.py')

and get

UnexpectedStatusException: Error hosting endpoint sagemaker-tensorflow-2020-04-13-14-02-35-992: Failed. Reason:  The image '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-cpu-py2' does not exist.

I get the same "image does not exist" error for all of the following configurations:

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '2.1.0',
                                  entry_point = 'train.py',
                                  py_version = 'py3')

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '2.1.0',
                                  entry_point = 'train.py',
                                  image = '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-cpu-py2')

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '2.1.0',
                                  entry_point = 'train.py',
                                  image = '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-cpu-py3')

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '2.1.0',
                                  entry_point = 'train.py',
                                  image = '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-gpu-py2')

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '2.1.0',
                                  entry_point = 'train.py',
                                  image = '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-gpu-py3')

Expected behavior
I expected there to be prebuilt Docker images in the public AWS ECR for account ID 520713654638 following the format sagemaker-tensorflow:<tensorflow_version>-<processor>-<python_version> for all supported versions of TensorFlow, which the documentation indicates includes 2.1.0.
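To make the expected naming scheme concrete, here is a minimal sketch that only assembles the URI string from the format described above. The helper name `legacy_tf_image_uri` is hypothetical and not part of the SageMaker SDK; the account ID and region are taken from the error message in this report. It does not check whether an image actually exists in ECR.

```python
def legacy_tf_image_uri(tf_version, processor, py_version,
                        region="us-east-1", account_id="520713654638"):
    """Build a URI of the form
    <account>.dkr.ecr.<region>.amazonaws.com/sagemaker-tensorflow:<version>-<processor>-<python>.

    Hypothetical helper for illustration only; it does not query ECR.
    """
    if processor not in ("cpu", "gpu"):
        raise ValueError("processor must be 'cpu' or 'gpu'")
    repository = f"{account_id}.dkr.ecr.{region}.amazonaws.com/sagemaker-tensorflow"
    return f"{repository}:{tf_version}-{processor}-{py_version}"


# List the four tags tried in this report for TensorFlow 2.1.0.
for processor in ("cpu", "gpu"):
    for py_version in ("py2", "py3"):
        print(legacy_tf_image_uri("2.1.0", processor, py_version))
```

The first URI printed matches the one in the UnexpectedStatusException above.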

System information
A description of your system. Please provide:

  • Kernel: conda_tensorflow_p36
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): TensorFlow
  • Framework version: 2.1
  • Python version: 2 and 3 (bug appears for both)
  • CPU or GPU: CPU and GPU (bug appears for both)
  • Custom Docker image (Y/N): N
Contributor

chuyang-deng commented Apr 14, 2020

Hi @keelerh, thanks for using SageMaker. For newer versions of TensorFlow, please use this Model class: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/serving.py#L121

It maps to a different account and image tag for version 1.3.0 and above.

@quocdat32461997

I got the same issue. Solved it by using from sagemaker.tensorflow.serving import Model and replacing TensorFlowModel with Model. Follow the guide here: https://sagemaker.readthedocs.io/en/stable/using_tf.html#deploying-directly-from-model-artifacts
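For anyone landing here, a rough sketch of what that swap might look like, assuming the SDK version current at the time of this issue (where `sagemaker.tensorflow.serving.Model` exists; later releases may lay out the module differently). The function names `model_artifact_uri` and `deploy_with_serving_model` are hypothetical, and the `ml.c5.xlarge` instance type is an assumption rather than something from this thread.

```python
def model_artifact_uri(bucket, key="model/model.tar.gz"):
    """Return the S3 URI of the model archive, as built in the original notebook."""
    return f"s3://{bucket}/{key}"


def deploy_with_serving_model(bucket, role, framework_version="2.1.0",
                              instance_type="ml.c5.xlarge"):
    """Deploy with the TensorFlow Serving Model class instead of TensorFlowModel.

    Hypothetical wrapper; instance_type is an assumed default.
    """
    # Imported lazily so this file can be loaded without the SageMaker SDK installed.
    from sagemaker.tensorflow.serving import Model

    model = Model(model_data=model_artifact_uri(bucket),
                  role=role,
                  framework_version=framework_version)
    # Creates a real endpoint in your AWS account when called.
    return model.deploy(initial_instance_count=1, instance_type=instance_type)
```

With the notebook's variables, the call would be something like `deploy_with_serving_model(sagemaker_session.default_bucket(), role)`. Note that no `entry_point` or `py_version` is passed here, unlike the TensorFlowModel calls in the report.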
