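Looking at the error logs, it seems that TF is failing to write its checkpoints to the S3 location s3://fc-uk-data/datalake/blue-tigers/saimadhu-test/model_sample_datasets/outputs/output_dir/bluetigers-sagemaker-keras-20190220-5/checkpoints/model.ckpt. The `AccessDenied` in the last failure suggests that the role used by the training job may not have write permission on that path.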
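One way to test the permission hypothesis is to try writing and deleting a tiny object next to the checkpoint path. The sketch below is only an illustration: `split_s3_uri`, `probe_write_access` and the `_write_probe` object name are hypothetical helpers, not part of SageMaker or TensorFlow, and the code assumes Python 3. It only tells you something useful when the client is created with the same role the training job runs under.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("not an S3 URI: {!r}".format(uri))
    return parsed.netloc, parsed.path.lstrip("/")


def probe_write_access(s3_client, checkpoint_uri, probe_name="_write_probe"):
    """Put and then delete an empty object beside the checkpoint file.

    Returns (True, None) if both calls succeed, otherwise
    (False, error_message).
    """
    bucket, key = split_s3_uri(checkpoint_uri)
    # Use the "directory" part of the checkpoint key as the probe location.
    prefix = key.rsplit("/", 1)[0] if "/" in key else ""
    probe_key = "{}/{}".format(prefix, probe_name) if prefix else probe_name
    try:
        s3_client.put_object(Bucket=bucket, Key=probe_key, Body=b"")
        s3_client.delete_object(Bucket=bucket, Key=probe_key)
    except Exception as exc:  # botocore reports AccessDenied as a ClientError
        return False, str(exc)
    return True, None
```

With boto3 installed, it could be called as `probe_write_access(boto3.client("s3"), "s3://<bucket>/<prefix>/checkpoints/model.ckpt")`. A `(False, ...)` result containing `AccessDenied` would be consistent with the failure in the training log.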
Minimal repro / logs
from sagemaker.tensorflow import TensorFlow

iris_estimator = TensorFlow(entry_point='keras_input.py',
                            role=role,
                            framework_version='1.12.0',
                            output_path=model_artifacts_location,
                            code_location=custom_code_upload_location,
                            train_instance_count=1,
                            train_instance_type='ml.m5.24xlarge',
                            hyperparameters={'learning_rate': 0.001},
                            training_steps=100,
                            evaluation_steps=2)
%%time
import boto3

# s3://fc-uk-data/datalake/blue-tigers/saimadhu-test/model_sample_datasets/inputs/iris_data.csv
# use the region-specific sample data bucket
region = boto3.Session().region_name
# datalake/blue-tigers/green-spiders/data-analysis/modelling_data/xgboost_model_data_2018_oct_to_dec/train_data
# s3://sagemaker-sample-data-eu-west-1/tensorflow/iris
train_data = 'datalake/blue-tigers/green-spiders/data-analysis/modelling_data/xgboost_model_data_2018_oct_to_dec/train_data'
# `bucket` is not defined in this snippet; it is presumably set earlier in the notebook
train_data_location = 's3://{}/{}'.format(bucket, train_data)
iris_estimator.fit(train_data_location,
                   job_name='bluetigers-sagemaker-keras-20190220-8',
                   run_tensorboard_locally=True)
Error Message:
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()
AbortedError (see above for traceback): All 10 retry attempts failed. The last failure: Unknown: AccessDenied: Access Denied
	 [[node save/MergeV2Checkpoints (defined at /usr/local/lib/python2.7/dist-packages/tf_container/trainer.py:73) = MergeV2Checkpoints[delete_old_dirs=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](save/MergeV2Checkpoints/checkpoint_prefixes, _arg_save/Const_0_1)]]
Graph was finalized.
Running local_init_op.
Done running local_init_op.
Saving checkpoints for 0 into s3://fc-uk-data/datalake/blue-tigers/saimadhu-test/model_sample_datasets/outputs/output_dir/bluetigers-sagemaker-keras-20190220-5/checkpoints/model.ckpt.
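Since the failing op is `MergeV2Checkpoints` with `delete_old_dirs=true`, the role probably needs more than plain write access on the checkpoint prefix. As a rough sketch, and not a confirmed fix for this job, the helper below builds an IAM policy document granting list, read, write and delete on a given prefix. `checkpoint_policy` is a hypothetical name, the exact set of actions TensorFlow needs is an assumption, and the bucket and prefix are whatever your job actually uses.

```python
import json


def checkpoint_policy(bucket, prefix):
    """Build an IAM policy dict for reading and writing checkpoints under a prefix.

    The action list is an assumption about what checkpoint saving needs on S3:
    listing the bucket, plus get/put/delete on objects below the prefix.
    """
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": "arn:aws:s3:::{}".format(bucket),
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": "arn:aws:s3:::{}/{}/*".format(bucket, prefix),
            },
        ],
    }


def policy_json(bucket, prefix):
    """Render the policy as indented JSON, ready to paste into the IAM console."""
    return json.dumps(checkpoint_policy(bucket, prefix), indent=2)
```

Comparing the output of `policy_json("<bucket>", "<output prefix>")` with the policies attached to the role passed as `role=` to the estimator may show which permission is missing.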