Add hyperparameter tuning support #207


Merged
merged 13 commits into aws:master from hyperparameter-tuning-support on Jun 5, 2018

Conversation

laurenyu
Contributor

Description of changes:
Add support for hyperparameter tuning jobs.

This introduces a few key features:

  • a new class, HyperparameterTuner, which behaves like an estimator with fit(), deploy(), and attach(), except that it creates hyperparameter tuning jobs instead of regular training jobs (see the usage sketch below)
  • a new method for estimators, _prepare_for_training(), which should set all values needed before training
  • new analytics classes for training and hyperparameter tuning jobs

This PR also bumps the SDK version to 1.4.0.
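
For readers trying the branch, here is a rough usage sketch of the new tuner and analytics flow. The image URI, role ARN, S3 paths, metric name, and hyperparameter ranges below are placeholders, and exact argument names may differ slightly from this PR's final API:

from sagemaker.estimator import Estimator
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter, IntegerParameter

# A regular estimator; image, role, and instance settings are placeholders.
estimator = Estimator(
    image_name='123456789012.dkr.ecr.us-west-2.amazonaws.com/my-image:latest',
    role='arn:aws:iam::123456789012:role/SageMakerRole',
    train_instance_count=1,
    train_instance_type='ml.m4.xlarge')

# Wrap it in a tuner: the ranges to search, the objective metric, and job limits.
tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name='validation:accuracy',
    hyperparameter_ranges={
        'learning_rate': ContinuousParameter(0.001, 0.1),
        'mini_batch_size': IntegerParameter(32, 256)},
    metric_definitions=[{'Name': 'validation:accuracy',
                         'Regex': 'accuracy=([0-9\\.]+)'}],
    max_jobs=20,
    max_parallel_jobs=2)

# fit() starts a hyperparameter tuning job instead of a single training job.
tuner.fit({'train': 's3://my-bucket/train', 'validation': 's3://my-bucket/validation'})

# deploy() creates an endpoint from the best training job found by the tuner,
# and attach() reconstructs a tuner from an existing tuning job name.
predictor = tuner.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')
tuner = HyperparameterTuner.attach('my-tuning-job-name')

# The new analytics classes expose job results, e.g. as a pandas DataFrame.
results_df = tuner.analytics().dataframe()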

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

  • I have read the CONTRIBUTING doc
  • I have added tests that prove my fix is effective or that my feature works (if appropriate)
  • I have updated the changelog with a description of my changes (if appropriate)
  • I have updated any necessary documentation (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@laurenyu laurenyu requested a review from owen-t May 31, 2018 18:36
owen-t
owen-t previously approved these changes May 31, 2018
Contributor

@owen-t owen-t left a comment


One small thing - but happy to have this fixed post-release.

Quoted diff context:

based on the training image name and current timestamp.
**kwargs: Other arguments
"""
if isinstance(inputs, list) or isinstance(inputs, RecordSet):
Contributor


This is much better:

kwargs = dict(kwargs)
kwargs['job_name'] = job_name
self._prepare_for_training(**kwargs)

Contributor


Basically, replace lines 140 to 145 with that block.

Contributor Author


_prepare_for_training() still needs records for 1P estimators but not for the others, though. Instead, it'd end up looking like:

        kwargs = dict(kwargs)
        kwargs['job_name'] = job_name
        if isinstance(inputs, list) or isinstance(inputs, RecordSet):
            kwargs['records'] = inputs
        self.estimator._prepare_for_training(**kwargs)
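
For context, the reason records is forwarded only for 1P (first-party algorithm) estimators is that their _prepare_for_training() takes the RecordSet inputs, while the generic estimator's version does not. A simplified sketch of that difference (the class stubs and bodies below are illustrative assumptions, not the PR's actual implementation):

class EstimatorBase:
    """Simplified stand-in for the SDK's estimator base class."""


class Estimator(EstimatorBase):
    def _prepare_for_training(self, job_name=None):
        # Generic estimators only need a job name; when one is not supplied,
        # it is generated from the training image name and current timestamp.
        self._current_job_name = job_name or 'generated-job-name-placeholder'


class AmazonAlgorithmEstimatorBase(EstimatorBase):
    def _prepare_for_training(self, records=None, mini_batch_size=None, job_name=None):
        # 1P algorithm estimators also take the RecordSet inputs up front,
        # which is why the tuner forwards `records` only when `inputs` is a
        # list or a RecordSet.
        self._current_job_name = job_name or 'generated-job-name-placeholder'
        self.mini_batch_size = mini_batch_size
        self.records = records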

@codecov-io

codecov-io commented Jun 4, 2018

Codecov Report

Merging #207 into master will increase coverage by 0.73%.
The diff coverage is 93.58%.


@@            Coverage Diff             @@
##           master     #207      +/-   ##
==========================================
+ Coverage   90.76%   91.49%   +0.73%     
==========================================
  Files          42       45       +3     
  Lines        2717     3162     +445     
==========================================
+ Hits         2466     2893     +427     
- Misses        251      269      +18
Impacted Files                              Coverage Δ
src/sagemaker/amazon/hyperparameter.py      97.22% <ø> (ø) ⬆️
src/sagemaker/amazon/kmeans.py              100% <100%> (ø) ⬆️
src/sagemaker/amazon/ntm.py                 100% <100%> (ø) ⬆️
src/sagemaker/amazon/lda.py                 100% <100%> (ø) ⬆️
src/sagemaker/__init__.py                   100% <100%> (ø) ⬆️
src/sagemaker/utils.py                      90.9% <100%> (+3.03%) ⬆️
src/sagemaker/amazon/randomcutforest.py     100% <100%> (ø) ⬆️
src/sagemaker/amazon/linear_learner.py      100% <100%> (ø) ⬆️
src/sagemaker/amazon/pca.py                 100% <100%> (ø) ⬆️
src/sagemaker/session.py                    86.7% <78.57%> (-1.14%) ⬇️
... and 10 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 731641c...2c16ae8. Read the comment docs.

@ChoiByungWook ChoiByungWook force-pushed the hyperparameter-tuning-support branch from a7576b4 to 2c16ae8 on June 4, 2018 23:14
@ChoiByungWook ChoiByungWook merged commit 502d6eb into aws:master Jun 5, 2018
@laurenyu laurenyu deleted the hyperparameter-tuning-support branch June 6, 2018 16:17