Skip to content

Correctly handle TooManyBuckets error_code in default_bucket method #40

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 12, 2018

Conversation

jalabort
Copy link
Contributor

At my organisation we have reached our maximum number of allowed buckets on s3 and as a direct consequence of that we started getting the following error when using the sagemaker-python-sdk:

INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): s3.amazonaws.com
Traceback (most recent call last):
  File "/Users/joan.alabort/miniconda3/envs/beatrix_27/bin/sagemaker", line 6, in <module>
    exec(compile(open(__file__).read(), __file__, 'exec'))
  File "/Users/joan.alabort/cvdev/beatrix/hudl_beatrix/cli/bin/sagemaker", line 6, in <module>
    fire.Fire(component=SageMakerRunner)
  File "/Users/joan.alabort/miniconda3/envs/beatrix_27/lib/python2.7/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/Users/joan.alabort/miniconda3/envs/beatrix_27/lib/python2.7/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/Users/joan.alabort/miniconda3/envs/beatrix_27/lib/python2.7/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "/Users/joan.alabort/cvdev/beatrix/hudl_beatrix/sagemaker/base.py", line 314, in train
    self._train(source_local_path, dataset_s3_path, job_name)
  File "/Users/joan.alabort/cvdev/beatrix/hudl_beatrix/sagemaker/base.py", line 364, in _train
    run_tensorboard_locally=self._run_tensorboard_locally
  File "/Users/joan.alabort/miniconda3/envs/beatrix_27/lib/python2.7/site-packages/sagemaker/tensorflow/estimator.py", line 162, in fit
    fit_super()
  File "/Users/joan.alabort/miniconda3/envs/beatrix_27/lib/python2.7/site-packages/sagemaker/tensorflow/estimator.py", line 154, in fit_super
    super(TensorFlow, self).fit(inputs, wait, logs, job_name)
  File "/Users/joan.alabort/miniconda3/envs/beatrix_27/lib/python2.7/site-packages/sagemaker/estimator.py", line 498, in fit
    code_bucket = self.sagemaker_session.default_bucket()
  File "/Users/joan.alabort/miniconda3/envs/beatrix_27/lib/python2.7/site-packages/sagemaker/session.py", line 159, in default_bucket
    s3.create_bucket(Bucket=default_bucket)
  File "/Users/joan.alabort/miniconda3/envs/beatrix_27/lib/python2.7/site-packages/boto3/resources/factory.py", line 520, in do_action
    response = action(self, *args, **kwargs)
  File "/Users/joan.alabort/miniconda3/envs/beatrix_27/lib/python2.7/site-packages/boto3/resources/action.py", line 83, in __call__
    response = getattr(parent.meta.client, operation_name)(**params)
  File "/Users/joan.alabort/miniconda3/envs/beatrix_27/lib/python2.7/site-packages/botocore/client.py", line 317, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/joan.alabort/miniconda3/envs/beatrix_27/lib/python2.7/site-packages/botocore/client.py", line 615, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (TooManyBuckets) when calling the CreateBucket operation: You have attempted to create more buckets than allowed

This little patch seems to take care of this issue.

I could not find much information in terms of the expected PR structure so I hope this suffices, happy to make the pertinent changes if it does not.

Correctly handles the `TooManyBuckets` error_code when the bucket already exists.
@jalabort jalabort changed the title Correctly handle the TooManyBuckets error_code in default_bucket Correctly handle TooManyBuckets error_code in default_bucket method Jan 10, 2018
@jalabort jalabort changed the title Correctly handle TooManyBuckets error_code in default_bucket method Correctly handle TooManyBuckets error_code in default_bucket method Jan 10, 2018
owen-t
owen-t previously approved these changes Jan 11, 2018
Copy link
Contributor

@owen-t owen-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your submission!

This looks good to me.

elif error_code == 'TooManyBuckets':
try:
s3.meta.client.head_bucket(Bucket=default_bucket)
LOGGER.info('S3 bucket {} already exists'.format(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this logging message, in the normal case we don't log this.

@@ -170,6 +170,14 @@ def default_bucket(self):
elif error_code == 'OperationAborted' and 'conflicting conditional operation' in message:
# If this bucket is already being concurrently created, we don't need to create it again.
pass
elif error_code == 'TooManyBuckets':
try:
s3.meta.client.head_bucket(Bucket=default_bucket)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little esoteric - if we remove the logging below, can you add some developer documentation:

# Succeed if the default bucket exists

Remove logging message
@jalabort
Copy link
Contributor Author

jalabort commented Jan 11, 2018

@owen-t Addressed your comments above.

@jalabort
Copy link
Contributor Author

Is there something I need to amend due to the failing teamcity build? I cannot see what's wrong with it.

@ChoiByungWook
Copy link
Contributor

Hello,
We don't publish our build logs externally. A test failed, but the issue is resolved and we're rerunning your build. We will merge it in once the build is complete. Thanks again for your contribution!

@owen-t owen-t merged commit f43b9c3 into aws:master Jan 12, 2018
apacker pushed a commit to apacker/sagemaker-python-sdk that referenced this pull request Nov 15, 2018
removing references to auto-scaling
athewsey added a commit to athewsey/sagemaker-python-sdk that referenced this pull request Jul 16, 2021
Merge pull request aws#40 from athewsey/feat/fw-processor
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants