How to pass a name to models saved with sagemaker.sklearn.estimator.SKLearn #785

Closed
plynch-chwy opened this issue May 8, 2019 · 5 comments

Comments

@plynch-chwy

plynch-chwy commented May 8, 2019

Please fill out the form below.

System Information

  • Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): SKLearn
  • Framework Version: 0.20.0
  • Python Version: 3.6
  • CPU or GPU: CPU
  • Python SDK Version: 1.18.16.dev0
  • Are you using a custom image: No

Describe the problem

I am creating an SKLearn estimator and specifying a base_job_name, but my models are still being saved under the default name sagemaker-scikit-learn-timestamp.

from sagemaker.sklearn.estimator import SKLearn

...

my_estimator = SKLearn(
    entry_point='entry.py',
    source_dir='src/sagemaker/',
    hyperparameters=hyperparams,
    py_version='py3',
    framework_version='0.20.0',
    role=role,
    train_instance_type=instance_type,
    base_job_name='my-classifier',
    output_path=outputs,
    code_location=code_location)

my_estimator.fit(inputs=inputs)

For example, one model saved from this training job was named sagemaker-scikit-learn-2019-05-08-12-50-06-661.

@ericangelokim
Contributor

Hello,

Just to clarify, your model file name should be determined from the entry point script (like so: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/scikit_learn_iris/scikit_learn_iris.py#L60), but the job name should be created from the base_job_name. The code for this can be seen within the SDK:

Estimator _prepare_for_training: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/estimator.py#L175

SKLearn estimator: https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/sklearn/estimator.py (method not overridden)
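To illustrate the first point, here is a minimal sketch of how an entry point script picks the model file name itself. The function name save_model and the file name are hypothetical, and the standard-library pickle module is swapped in for joblib (which the linked example uses) only to keep the sketch dependency-free. SM_MODEL_DIR is the environment variable the training container uses to expose the model directory.

```python
import os
import pickle


def save_model(model, model_dir=None, file_name="my-classifier.pkl"):
    """Write a fitted model under a file name chosen by the script (hypothetical helper)."""
    # Inside a SageMaker training container, SM_MODEL_DIR points at the
    # directory whose contents are archived after training; /opt/ml/model
    # is used here as the fallback.
    model_dir = model_dir or os.environ.get("SM_MODEL_DIR", "/opt/ml/model")
    os.makedirs(model_dir, exist_ok=True)

    # The file name comes from this script, not from base_job_name.
    path = os.path.join(model_dir, file_name)
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path
```

The training job name, by contrast, is built by the SDK from base_job_name, so the two names are independent of each other.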

Is the job name correct?

@plynch-chwy
Author

plynch-chwy commented May 13, 2019

Sorry, not sure if I made it clear that I have 2 scripts. The first launches the training job and saves the model (shown above). The other then looks up the training job name and deploys the endpoint:

import boto3
from sagemaker.sklearn.estimator import SKLearn

...

job_name = wait_for_training('my-classifier')
my_estimator = SKLearn.attach(job_name)
predictor = my_estimator.deploy(initial_instance_count=1,
                                instance_type=instance_type,
                                endpoint_name='my-classifier',
                                update_endpoint=endpoint_exists('my-classifier'))

wait_for_training waits until training is complete and returns the training job name; endpoint_exists returns True if the endpoint already exists, so I can update it.
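The bodies of these two helpers are not shown in the thread. The following is only a guess at how they might be written with the boto3 SageMaker client, not the author's actual code. The optional sm_client parameter is an assumption added so that a client can be injected, and the lookup assumes the newest job whose name contains the base name is the one wanted.

```python
def wait_for_training(base_job_name, sm_client=None):
    """Return the newest training job whose name contains base_job_name, once it has completed."""
    if sm_client is None:
        import boto3  # imported lazily so a stand-in client can be passed instead
        sm_client = boto3.client("sagemaker")

    # Newest matching job first.
    response = sm_client.list_training_jobs(
        NameContains=base_job_name,
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    summaries = response["TrainingJobSummaries"]
    if not summaries:
        raise ValueError(f"no training job found containing {base_job_name!r}")
    job_name = summaries[0]["TrainingJobName"]

    # Block until the job is no longer in progress, then check how it ended.
    sm_client.get_waiter("training_job_completed_or_stopped").wait(
        TrainingJobName=job_name
    )
    status = sm_client.describe_training_job(TrainingJobName=job_name)[
        "TrainingJobStatus"
    ]
    if status != "Completed":
        raise RuntimeError(f"training job {job_name} ended with status {status}")
    return job_name


def endpoint_exists(endpoint_name, sm_client=None):
    """Return True if an endpoint with exactly this name exists."""
    if sm_client is None:
        import boto3
        sm_client = boto3.client("sagemaker")

    # NameContains is a substring filter, so compare names exactly.
    paginator = sm_client.get_paginator("list_endpoints")
    for page in paginator.paginate(NameContains=endpoint_name):
        if any(e["EndpointName"] == endpoint_name for e in page["Endpoints"]):
            return True
    return False
```

Called with one argument, as in the script above, each helper creates its own client from the default AWS configuration.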

My training job does get named properly, as my-classifier-timestamp. I do save multiple objects to /opt/ml/model besides the model, though (category feature mappings needed for processing inputs). The model file is saved as my-classifier.joblib, but the tar file is saved as model.tar.gz in S3.

When I deploy the model with the second script, the SageMaker model is named sagemaker-scikit-learn-timestamp. Even though I'm using SKLearn, this is an issue because I will have multiple deployed SKLearn models in the future and would like to be able to find them by name using boto3.
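For context, looking models up by name with boto3 could look roughly like the sketch below. The helper name find_models and its sm_client parameter are hypothetical; the list_models paginator and its NameContains filter are part of the boto3 SageMaker client. With the default sagemaker-scikit-learn-timestamp names, a filter such as "my-classifier" returns nothing, which is why the naming matters.

```python
def find_models(name_contains, sm_client=None):
    """Return the names of all SageMaker models whose name contains the given substring."""
    if sm_client is None:
        import boto3  # imported lazily so a stand-in client can be passed instead
        sm_client = boto3.client("sagemaker")

    names = []
    paginator = sm_client.get_paginator("list_models")
    # Walk every page of results; each page holds a list of model summaries.
    for page in paginator.paginate(NameContains=name_contains):
        names.extend(model["ModelName"] for model in page["Models"])
    return names
```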

@plynch-chwy
Author

I've found that if I combine my 2 scripts to train and deploy, I don't have this problem:

my_estimator.fit(inputs=inputs, wait=True)
my_estimator.deploy(initial_instance_count=1)

I haven't tested further yet, but it seems the model name is not being attached and passed through properly.

@ericangelokim
Contributor

Hello

https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/sklearn/estimator.py#L119

It looks like the problem is that the Model is created with its name set to _current_job_name instead of latest_training_job.name. An estimator created with attach has never run a training job itself, so its _current_job_name is None.
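The mechanism can be reproduced with a toy stand-in. None of this is SDK code: ToyEstimator, ToyTrainingJob, and both model_name_* methods are hypothetical, and the fallback shown is only one plausible way to address the problem, not necessarily what the SDK does.

```python
from datetime import datetime, timezone


def _timestamp():
    return datetime.now(timezone.utc).strftime("%Y-%m-%d-%H-%M-%S")


class ToyTrainingJob:
    """Stand-in for the object that records a training job's name."""

    def __init__(self, name):
        self.name = name


class ToyEstimator:
    """Toy model of the naming behaviour described above (not the SDK class)."""

    def __init__(self, base_job_name=None):
        self.base_job_name = base_job_name
        self._current_job_name = None      # only set when fit() runs
        self.latest_training_job = None    # set by fit() and by attach()

    def fit(self):
        self._current_job_name = f"{self.base_job_name}-{_timestamp()}"
        self.latest_training_job = ToyTrainingJob(self._current_job_name)

    @classmethod
    def attach(cls, job_name):
        # Attaching records the existing job but never runs fit(),
        # so _current_job_name stays None.
        estimator = cls()
        estimator.latest_training_job = ToyTrainingJob(job_name)
        return estimator

    def model_name_from_current_job(self):
        # Behaviour described in the issue: a None name falls through
        # to a generated default.
        return self._current_job_name or f"sagemaker-scikit-learn-{_timestamp()}"

    def model_name_with_fallback(self):
        # One possible remedy: fall back to the attached job's name.
        return self._current_job_name or self.latest_training_job.name
```

An attached toy estimator gets the generated default from the first method and the original job name from the second, while an estimator that called fit() gets its job name from both. That matches the observation that training and deploying in one script avoids the problem.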

I will be driving a fix on the SDK side for this use case. Thank you for your patience.

@ericangelokim
Contributor

Hello, the fix was pushed into master; you should be able to attach the correct name now.

https://github.com/aws/sagemaker-python-sdk/pull/808/files
