Bug: Cannot use spot instances with hyperparameter tuning job #1011

Closed
adrian-chang opened this issue Aug 29, 2019 · 2 comments
Labels
status: pending release (the fix has been merged but not yet released to PyPI)
type: bug

adrian-chang commented Aug 29, 2019

System Information

  • Python Version:
    3.6.9
  • Python SDK Version:
    1.38.3

Describe the problem

Not possible to use spot instances with the hyperparameter tuning job.

Minimal repro / logs

https://github.com/aws/sagemaker-python-sdk/blob/9765de68ad8b776740d800148c861ca0e4794716/src/sagemaker/job.py

The code in job.py does not appear to copy over the estimator's train_use_spot_instances attribute, even though train_max_wait is passed through when it is set. As a result, spot instances cannot be used for hyperparameter tuning jobs, even though they work for individual training jobs. Worse, setting train_max_wait makes the tuning job request fail with the error below.

  File "/lib/python3.6/site-packages/sagemaker/tuner.py", line 362, in fit
    self.latest_tuning_job = _TuningJob.start_new(self, inputs)
  File "/lib/python3.6/site-packages/sagemaker/tuner.py", line 893, in start_new
    tuner.estimator.sagemaker_session.tune(**tuner_args)
  File "/lib/python3.6/site-packages/sagemaker/session.py", line 574, in tune
    self.sagemaker_client.create_hyper_parameter_tuning_job(**tune_request)
  File "/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/lib/python3.6/site-packages/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateHyperParameterTuningJob operation: Invalid MaxWaitTimeInSeconds. It is only supported when EnableManagedSpotTraining is set to true
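
To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above. It does not use or reproduce the SDK's actual code: build_training_config and validate are hypothetical helpers, the estimator is a plain stand-in object, and nesting MaxWaitTimeInSeconds under a StoppingCondition key is an assumption about the request layout. Only the two field names and the error text come from the traceback.

```python
from types import SimpleNamespace


def build_training_config(estimator, copy_spot_flag=True):
    """Hypothetical helper: map estimator attributes to request fields.

    With copy_spot_flag=False it mimics the reported behaviour, where
    train_max_wait is forwarded but train_use_spot_instances is not.
    """
    config = {"StoppingCondition": {}}
    if estimator.train_max_wait is not None:
        # Assumed layout: the wait time lives under StoppingCondition.
        config["StoppingCondition"]["MaxWaitTimeInSeconds"] = estimator.train_max_wait
    if copy_spot_flag and estimator.train_use_spot_instances:
        config["EnableManagedSpotTraining"] = True
    return config


def validate(config):
    """Stand-in for the service-side check quoted in the error message."""
    has_wait = "MaxWaitTimeInSeconds" in config["StoppingCondition"]
    if has_wait and not config.get("EnableManagedSpotTraining", False):
        raise ValueError(
            "Invalid MaxWaitTimeInSeconds. It is only supported when "
            "EnableManagedSpotTraining is set to true"
        )
    return config


# Stand-in for an estimator configured for spot training.
estimator = SimpleNamespace(train_max_wait=7200, train_use_spot_instances=True)

# Both attributes forwarded: the request passes the check.
fixed = validate(build_training_config(estimator))

# Spot flag dropped, wait time kept: the check rejects the request.
try:
    validate(build_training_config(estimator, copy_spot_flag=False))
    rejected = False
except ValueError:
    rejected = True
```

In this sketch the request is only accepted when both fields are forwarded together, which is the behaviour the fix would need to provide for tuning jobs.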

adrian-chang changed the title from "Cannot use spot instances with hyperparameter tuning job" to "Bug: Cannot use spot instances with hyperparameter tuning job" on Aug 29, 2019

speg03 (Contributor) commented Sep 1, 2019

I also encountered the same problem. I will send a pull request later.

laurenyu added the "status: pending release" label (the fix has been merged but not yet released to PyPI) on Sep 5, 2019

laurenyu (Contributor) commented Sep 9, 2019
