error with hyperparameters tuning #224

Closed
guilloufre opened this issue Jun 11, 2018 · 3 comments

Comments

@guilloufre

Please fill out the form below.

System Information

  • Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): Factorization Machine
  • Framework Version:
  • Python Version: 2.7
  • CPU or GPU: CPU
  • Python SDK Version: 1.4.1
  • Are you using a custom image:

Describe the problem

Hello,
I tried to use the newly released hyperparameter tuning feature described on the AWS blog, but an error was thrown when I launched the following command:
tuner.fit({'train': records_train, 'test': records_val})

This is the error:

Traceback (most recent call last):

  File "<ipython-input-20-9c3ba798ac1a>", line 1, in <module>
    tuner.fit({'train': s3_train_data, 'test': s3_val_data})

  File "/usr/local/lib/python2.7/dist-packages/sagemaker/tuner.py", line 144, in fit
    self.estimator._prepare_for_training(job_name)

  File "/usr/local/lib/python2.7/dist-packages/sagemaker/amazon/amazon_estimator.py", line 117, in _prepare_for_training
    feature_dim = records.feature_dim

AttributeError: 'NoneType' object has no attribute 'feature_dim'

Both records_train and records_val are RecordSet objects. For example, this is records_train:
(<class 'sagemaker.amazon.amazon_estimator.RecordSet'>, {'s3_data_type': 'S3Prefix', 'feature_dim': 2229, 'num_records': 7923, 'channel': 'train', 's3_data': 's3://###############'})

Training the Factorization Machine works if I launch:
fm_estimator.fit(records_train, mini_batch_size=1000)
I also tried providing S3 URIs instead of RecordSet objects with
tuner.fit({'train': s3_train_data, 'test': s3_val_data})
as in the example on the blog, but it throws the same error.

Thanks for helping me with this issue!

@laurenyu
Contributor

hi @guilloufre, thanks for trying out the new hyperparameter tuning feature!

The error occurs because you need to pass a list of RecordSet objects to fit() instead of a dict. (The channel names are already specified in each RecordSet, so there is no need to write them again.)
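
For example, with the two RecordSet objects from the question above, the call would look roughly like this (a minimal sketch reusing the names records_train and records_val from the original snippet):

# Pass the RecordSet objects as a list; each one already carries its channel name
# ('train' / 'test'), so no dict keys are needed.
tuner.fit([records_train, records_val])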

@guilloufre
Author

Thanks, it was actually easy! :)

apacker pushed a commit to apacker/sagemaker-python-sdk that referenced this issue Nov 15, 2018
Fixed: payload was larger than SageMaker limit
@phschimm

phschimm commented Feb 9, 2022

When using the record_set() method referenced here, the RecordSet it creates does not use the required S3DataDistributionType=FullyReplicated; it uses ShardedByS3Key instead.

I'm trying to use the high-level Python API to get metrics about the models I create:

from sagemaker import RandomCutForest
from sagemaker.tuner import HyperparameterTuner, IntegerParameter

rcf = RandomCutForest(..., eval_metrics=['accuracy', 'precision_recall_fscore'])

train_set = rcf.record_set(features,
                           channel='train')

test_set = rcf.record_set(features,
                          labels=labels,
                          channel='test')

tuner = HyperparameterTuner(estimator=rcf,
                            objective_metric_name='test:f1',
                            hyperparameter_ranges={'num_samples_per_tree': IntegerParameter(32, 512),
                                                   'num_trees': IntegerParameter(50, 1000)},
                            max_jobs=1,
                            max_parallel_jobs=1)

tuner.fit([train_set, test_set])

When I execute this code, I get the following error in the AWS SageMaker console:

Failure reason:
ClientError: Unable to initialize the algorithm. Failed to validate input data configuration. (caused by ValidationError)
Caused by: 'ShardedByS3Key' is not one of ['FullyReplicated']
Failed validating 'enum' in schema['properties']['test']['properties']['S3DistributionType']:
    {'enum': ['FullyReplicated'], 'type': 'string'}
On instance['test']['S3DistributionType']:
    'ShardedByS3Key'

Is there a manual way to create a RecordSet with the correct S3DataDistributionType via the Python API?
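
One possible workaround (an untested sketch; it assumes the SDK still builds each RecordSet's channel through its internal records_s3_input() helper, which may differ between SDK versions) is to rewrap the test RecordSet in a small subclass that forces FullyReplicated:

from sagemaker.amazon.amazon_estimator import RecordSet
from sagemaker.inputs import TrainingInput

# Hypothetical helper, not part of the SageMaker SDK: it overrides the
# hard-coded ShardedByS3Key distribution that RecordSet normally requests.
class FullyReplicatedRecordSet(RecordSet):
    def records_s3_input(self):
        return TrainingInput(self.s3_data,
                             distribution='FullyReplicated',
                             s3_data_type=self.s3_data_type)

# Upload the data with record_set() as before, then rewrap the result.
uploaded = rcf.record_set(features, labels=labels, channel='test')
test_set = FullyReplicatedRecordSet(uploaded.s3_data,
                                    num_records=uploaded.num_records,
                                    feature_dim=uploaded.feature_dim,
                                    s3_data_type=uploaded.s3_data_type,
                                    channel='test')

tuner.fit([train_set, test_set]) can then be called exactly as above.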
