Add data_type to hyperparameters #54

iquintero · 2018-01-23T01:04:23Z

When we describe a training job, the data types of the hyperparameters are
lost because we use a dict[str, str]. This adds a new optional data_type field
to Hyperparameter so that we can convert values back to the right types at
runtime.
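
The round trip described above can be sketched as follows. This is only an illustration under stated assumptions: the hyperparameter names, the `parse_bool` helper, and the `data_types` mapping are hypothetical and stand in for the per-hyperparameter data_type field; none of it is the SDK's actual code.

```python
# Hypothetical hyperparameters as a caller might set them (names are illustrative).
original = {"num_factors": 10, "use_bias": True, "learning_rate": 0.1}

# Passing them around as dict[str, str] stringifies every value, so the types are gone.
as_strings = {name: str(value) for name, value in original.items()}


def parse_bool(text):
    """Parse 'True'/'False'; plain bool('False') would be True (non-empty string)."""
    return text == "True"


# Assumed per-parameter type record, standing in for the new data_type field.
data_types = {"num_factors": int, "use_bias": parse_bool, "learning_rate": float}

# With the recorded types, the string values can be converted back.
restored = {name: data_types[name](text) for name, text in as_strings.items()}

print(as_strings)
print(restored == original)
```

Without the recorded types, `as_strings` alone gives no way to tell that `"10"` was an integer rather than a string.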

owen-t · 2018-01-23T01:10:46Z

src/sagemaker/amazon/factorization_machines.py

@@ -23,34 +23,34 @@ class FactorizationMachines(AmazonAlgorithmEstimatorBase):
     repo = 'factorization-machines:1'

-    num_factors = hp('num_factors', (gt(0), isint), 'An integer greater than zero')
+    num_factors = hp('num_factors', (gt(0), isint), 'An integer greater than zero', int)


This is good. We can probably also drop the is* validation methods (e.g. isint).

* add sagemaker cli (#32)
* add sagemaker cli
* remove unnecessary close
* address PR comments
* tidy up imports
* fix imports, flake8 errors
* improve help message for bucket-name
* remove default role name
* fix log-level and py3 tests, add copyright
* update cli example scripts
* Add documentation about BYO Models (#47)
* Add test for BYO estimator using Factorization Machines algorithm as an example. (#50)
* Support multi-part uploads (#45)
* Update TensorFlow examples following API change (#44)
* Add data_type to hyperparameters (#54)
  When we describe a training job the data type of the hyper parameters is lost because we use a dict[str, str]. This adds a new field to Hyperparameter so that we can convert the datatypes at runtime. instead of validating with isinstance(), we cast the hp value to the type it is meant to be. This enforces a "strongly typed" value. When we deserialize from the API string responses it becomes easier to deal with too.

* Add data_type to hyperparameters (aws#54)
  When we describe a training job the data type of the hyper parameters is lost because we use a dict[str, str]. This adds a new field to Hyperparameter so that we can convert the datatypes at runtime. instead of validating with isinstance(), we cast the hp value to the type it is meant to be. This enforces a "strongly typed" value. When we deserialize from the API string responses it becomes easier to deal with too.
* Add wrapper for LDA. (aws#56)
  Update CHANGELOG and bump the version number.
* Add support for async fit() (aws#59)
  when calling fit(wait=False) it will return immediately. The training job will carry on even if the process exits. by using attach() the estimator can be retrieved by providing the training job name. _prepare_init_params_from_job_description() is now a classmethod instead of being a static method. Each class is responsible to implement their specific logic to convert a training job description into arguments that can be passed to its own __init__()
* Fix Estimator role expansion (aws#68)
  Instead of manually constructing the role ARN, use the IAM boto client to do it. This properly expands service-roles and regular roles.
* Add FM and LDA to the documentation. (aws#66)
* Fix description of an argument of sagemaker.session.train (aws#69)
* Fix description of an argument of sagemaker.session.train
  'input_config' should be an array which has channel objects.
* Add a link to the botocore docs
* Use 'list' instead of 'array' in the description
* Add ntm algorithm with doc, unit tests, integ tests (aws#73)
* JSON serializer: predictor.predict accepts dictionaries (aws#62)
  Add support for serializing python dictionaries to json
  Add prediction with dictionary in tf iris integ test
* Fixing timeouts for PCA async integration test. (aws#78)
  Execute tf_cifar test without logs to eliminate delay to detect that job has finished.
* Fixes in LinearLearner and unit tests addition. (aws#77)
* Print out billable seconds after training completes (aws#30)
* Added: print out billable seconds after training completes
* Fixed: test_session.py to pass unit tests
* Fixed: removed offending tzlocal()
* Use sagemaker_timestamp when creating endpoint names in integration tests. (aws#81)
* Support TensorFlow-1.5.0 and MXNet-1.0.0 (aws#82)
* Update .gitignore to ignore pytest_cache.
* Support TensorFlow-1.5.0 and MXNet-1.0.0
* Update and refactor tests. Add tests for fw_utils.
* Fix typo.
* Update changelog for 1.1.0 (aws#85)

Add data_type to hyperparameters

bd409b6

When we describe a training job, the data types of the hyperparameters are lost because we use a dict[str, str]. This adds a new optional data_type field to Hyperparameter so that we can convert values back to the right types at runtime.

iquintero requested a review from owen-t January 23, 2018 01:04

Ignacio Quintero added 2 commits January 23, 2018 14:26

Enforce a HP Type when setting its value.

6834806

Instead of validating with isinstance(), cast the hyperparameter value to the type it is meant to be. This enforces a "strongly typed" value, and it also makes values deserialized from the API's string responses easier to handle.
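
A minimal sketch of the cast-on-set idea, assuming a descriptor-style Hyperparameter. The class body, the `_hp_` storage key, the `Model` class, and the single-callable `validate` argument are assumptions made for illustration; the diff in this PR passes a tuple of validators, and the SDK's real implementation may differ.

```python
def gt(minimum):
    """Return a validator that accepts values strictly greater than minimum."""
    return lambda value: value > minimum


class Hyperparameter:
    """Hypothetical descriptor that casts to data_type whenever a value is set."""

    def __init__(self, name, validate=lambda _: True, message="", data_type=str):
        self.name = name
        self.validate = validate
        self.message = message
        self.data_type = data_type

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__["_hp_" + self.name]

    def __set__(self, obj, value):
        # Cast instead of checking isinstance(); an uncastable value raises here.
        value = self.data_type(value)
        if not self.validate(value):
            raise ValueError("Invalid {}: {}".format(self.name, self.message))
        obj.__dict__["_hp_" + self.name] = value


class Model:
    # Illustrative stand-in for an estimator class such as FactorizationMachines.
    num_factors = Hyperparameter("num_factors", gt(0), "An integer greater than zero", int)


model = Model()
model.num_factors = "10"  # e.g. a string taken from a described training job
print(type(model.num_factors).__name__, model.num_factors)
```

Because the cast happens before validation, the same setter accepts both a native `10` and the string `"10"`, which is what makes deserializing API string responses straightforward in this sketch.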

Fix broken unit tests

c50b06a

My previous commit broke a couple of unit tests. This fixes them.

iquintero force-pushed the datatype_hp branch from 8a6a0f8 to c50b06a Compare January 24, 2018 01:14

Merge branch 'master' into datatype_hp

5649310

owen-t previously approved these changes Jan 24, 2018

Remove unused validation functions

59ac007

isint(), isbool(), etc. are no longer used.

iquintero dismissed owen-t’s stale review via 59ac007 January 24, 2018 18:41

owen-t approved these changes Jan 24, 2018

iquintero merged commit 54b3830 into aws:master Jan 24, 2018

iquintero mentioned this pull request Jan 24, 2018

S3 Algos #46

Closed

ragavvenkatesan mentioned this pull request Jan 30, 2018

sync (#63) #64

Closed

iquintero mentioned this pull request Feb 6, 2018

S3 Estimator and Image Classification #71

Closed

iquintero deleted the datatype_hp branch May 9, 2018 23:19

laurenyu pushed a commit to laurenyu/sagemaker-python-sdk that referenced this pull request May 31, 2018

Add list for ll prepare_for_training (aws#54)

c78c46a

apacker pushed a commit to apacker/sagemaker-python-sdk that referenced this pull request Nov 15, 2018

Remove absolute paths from MXNet / TF examples (aws#54)

7aadcba
