Skip to content

Add wrapper for FactorizationMachiones algorithm. #38

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jan 15, 2018
Merged

Conversation

lukmis
Copy link
Contributor

@lukmis lukmis commented Jan 9, 2018

No description provided.

lukmis added 3 commits January 8, 2018 15:17
Require default_mini_batch_size during object creation for algorithms that require mini_batch_size.
Use default_mini_batch_size instead of hardcoded value.
Update documentation in the class.
Update integration tests for modifications.
Bump up the version.
Copy link
Contributor

@owen-t owen-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FM implementation is good - a few concerns about the default / required argument changes.

@@ -0,0 +1,29 @@
=========
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this! Should have done this from the get-go.

predictor_type = hp('predictor_type', isin('binary_classifier', 'regressor'),
'Value "binary_classifier" or "regressor"')
epochs = hp('epochs', (gt(0), isint), "An integer greater than 0")
clip_gradient = hp('clip_gradient', isfloat, "A float value")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This forces an explicit assignment, where an implicit assignment would be fine.

If I have a FacotrizationMachines object fm, then:

fm.clip_gradient = 55

will fail, because 55 is not float - even though it can be represented as a float. Consider using the isnumber check instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea - will change

@@ -33,8 +33,8 @@ class KMeans(AmazonAlgorithmEstimatorBase):
epochs = hp('epochs', (gt(0), isint), 'An integer greater-than 0')
center_factor = hp('extra_center_factor', (gt(0), isint), 'An integer greater-than 0')

def __init__(self, role, train_instance_count, train_instance_type, k, init_method=None,
max_iterations=None, tol=None, num_trials=None, local_init_method=None,
def __init__(self, role, train_instance_count, train_instance_type, k, default_mini_batch_size=5000,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introducing another argument with default value is a little dangerous. If someone is calling this function will arguments set positionally, then this will fail.

That said, I think we can take this risk - the chance of this problem occuring is low and we're doing a new release. I'd recommend including constructor signature changes in the changelog.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a required parameter and must be specified by the user. An alternative to constructor declaration will be fit enforcement. Let me rewrite it in that alternative form.

@@ -60,7 +60,7 @@ class LinearLearner(AmazonAlgorithmEstimatorBase):
unbias_label = hp('unbias_label', isbool, 'A boolean')
num_point_for_scalar = hp('num_point_for_scalar', (isint, gt(0)), 'An integer greater-than 0')

def __init__(self, role, train_instance_count, train_instance_type, predictor_type='binary_classifier',
def __init__(self, role, train_instance_count, train_instance_type, predictor_type,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why move this to a required argument? This will break clients working with the default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a required argument and we shouldn't default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it is a minor release change we will not break existing calls and I'll leave it then.

@@ -31,7 +31,7 @@ class PCA(AmazonAlgorithmEstimatorBase):
extra_components = hp(name='extra_components', validate=lambda x: x >= 0 and isinstance(x, int),
validation_message="Value must be an integer greater than or equal to 0")

def __init__(self, role, train_instance_count, train_instance_type, num_components,
def __init__(self, role, train_instance_count, train_instance_type, num_components, default_mini_batch_size,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to try to avoid including new required arguments to function signatures. Why can't this have a default value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a required parameter that has no default. Instead of constructor let's enforce by fit() in new algorithms and I will leave default logic here to avoid breaking existing calls.

@@ -46,3 +46,4 @@ def validate(value):
isint = istype(int)
isbool = istype(bool)
isnumber = istype(numbers.Number) # noqa
isfloat = istype(float)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed above, probably don't need this.

owen-t
owen-t previously approved these changes Jan 13, 2018
@lukmis lukmis merged commit 2e0ed8f into aws:master Jan 15, 2018
laurenyu added a commit to laurenyu/sagemaker-python-sdk that referenced this pull request May 31, 2018
apacker pushed a commit to apacker/sagemaker-python-sdk that referenced this pull request Nov 15, 2018
@ChoiByungWook ChoiByungWook mentioned this pull request Feb 21, 2019
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants