
Error saving the model artifact. #115


Closed
PedroCardoso opened this issue Nov 18, 2018 · 1 comment

Comments

@PedroCardoso

Recently, the TensorFlow image started failing in the final part of training, while saving the model artifact:

2018-11-18 21:51:28,519 ERROR - tf_container - Failed to download saved model. File does not exist in s3://sagemaker.../.../models_checkpoint (removed real path)
2018-11-18 21:51:28,519 ERROR - container_support.training - uncaught exception during training: 'Contents'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/container_support/training.py", line 36, in start
    fw.train()
  File "/usr/local/lib/python2.7/dist-packages/tf_container/train_entry_point.py", line 177, in train
    serve.export_saved_model(checkpoint_dir, env.model_dir)
  File "/usr/local/lib/python2.7/dist-packages/tf_container/serve.py", line 54, in export_saved_model
    raise e
KeyError: 'Contents'

I do pass the checkpoint_path argument in the function call, and I even added it to the parameters. According to the logs, the checkpoint is actually being stored in a local temp folder, as is the model:

2018-11-18 21:51:22,307 WARNING - tensorflow - Using temporary folder as model directory: /tmp/tmpB7cDbM
...
2018-11-18 21:51:28,447 INFO - tensorflow - SavedModel written to: /tmp/tmpB7cDbM/export/Servo/temp-1542577888/saved_model.pb

Why did it stop storing to S3? Nothing in the code indicates why this is happening. Why is the env variable not defined?
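For context on the `KeyError: 'Contents'` in the traceback above: a listing response from boto3's S3 `list_objects_v2` has no `Contents` entry when nothing matches the prefix, so indexing it directly fails in exactly this way. The sketch below assumes, without having seen `serve.py`, that the container lists the checkpoint prefix along those lines. The helper name `list_checkpoint_keys` and the sample responses are hypothetical and only illustrate a tolerant lookup.

```python
def list_checkpoint_keys(response):
    """Return object keys from a list_objects_v2-style response dict.

    An empty listing carries no 'Contents' entry, so response['Contents']
    would raise KeyError: 'Contents'. Using .get() with a default avoids that.
    """
    return [obj["Key"] for obj in response.get("Contents", [])]


# Hypothetical responses shaped like boto3's S3 list_objects_v2 output.
# Nothing was written under the prefix (e.g. checkpoints went to /tmp instead):
empty_listing = {"KeyCount": 0, "IsTruncated": False}

# At least one checkpoint file exists under the prefix:
populated_listing = {
    "KeyCount": 1,
    "IsTruncated": False,
    "Contents": [{"Key": "checkpoints/model.ckpt-100.index"}],
}

print(list_checkpoint_keys(empty_listing))
print(list_checkpoint_keys(populated_listing))
```

If this reading is right, the exception is a symptom rather than the cause: the prefix is empty because the checkpoints were written to the local temp folder and never reached S3.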

@PedroCardoso
Author

The problem was on my side. The config was correctly used on the Estimator definition.
