This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

HOW TO: train on Sagemaker and deploy on Local? #1164

Closed
Dgaylard opened this issue Dec 12, 2019 · 5 comments

Comments

@Dgaylard

Please fill out the form below.

System Information

  • Framework (SageMaker?) / Algorithm (KMeans)
  • Python Version: 3.7
  • CPU or GPU: Trained on GPUs using RecordIO input format.
  • Python SDK Version: Boto3?
  • Are you using a custom image: AWS KMeans algo

Describe the problem

Hey there, after trying to get this running for a while now, I'm here.

Basically, I have trained KMeans models on SageMaker (great!). However, I now want to deploy them locally.

Now I have the standard KMeans artefacts output, a model.tar.gz file that I:

Download from my S3:

import boto3
s3_client = boto3.client('s3')
s3_client.download_file('mybucket', 'myfile/path/model.tar.gz',
                        '/tmp/model.tar.gz')

Extract:
import os
os.system('tar -zxvf /tmp/model.tar.gz -C /tmp')
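
The extraction step could also be done without shelling out to `tar`. A minimal sketch, assuming the archive has already been downloaded to a local path; `extract_model_archive` is a hypothetical helper name (not part of any SageMaker SDK) and it relies only on the standard-library `tarfile` module:

```python
import tarfile
from pathlib import Path


def extract_model_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the sorted member names.

    Hypothetical helper for illustration; only use it on archives you trust.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()   # member names as stored in the archive
        archive.extractall(dest)     # unpack everything under dest_dir
    return sorted(names)
```

Calling `extract_model_archive('/tmp/model.tar.gz', '/tmp')` should return whatever the archive contains, which for the model described in this post would presumably be the `model_algo-1` and `state_...` items.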

Now I want to just take the extracted items (model_algo-1, state_ac4243fa-9838-41d2-b8d0-29601c73fdc3) and load them into a Kmeans object so I can actually infer with it locally.

I understand the primary way of serving these models is through SageMaker, but it doesn't seem much to ask that I can deploy one locally as a part of a much larger object?

Any help would be great, currently I've just gotten the following:

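import mxnet as mx
import pandas as pd

# Load the extracted artefact; the first NDArray is treated as the cluster centroids
eudexCluster = mx.ndarray.load('/tmp/model_algo-1')
cluster_centroids = pd.DataFrame(eudexCluster[0].asnumpy())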

In addition

What exactly is this state_ac4243fa-9838-41d2-b8d0-29601c73fdc3 file? It seems to me like it's a checkpoint of some kind?

@Dgaylard
Author

So I've run into two posts that briefly cover this:

aws/amazon-sagemaker-examples#294 by @djarpin

and https://stackoverflow.com/questions/8193563/predicting-values-with-k-means-clustering-algorithm

Both are pretty miserable: it looks like one would have to manually compare each new input against the centroids and calculate its nearest cluster. Okay, let's say that is the only way to deploy this. How do we then know what other members are within each of those clusters?
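
The manual comparison described above can be sketched with NumPy. This is only an illustration under stated assumptions: the centroids are available as a `(K, n_features)` array (for example the array obtained from `eudexCluster[0].asnumpy()` earlier in this thread), and assignment uses squared Euclidean distance, the usual k-means rule. Whether the SageMaker container uses exactly this rule is an assumption, and `assign_clusters` is a hypothetical helper name.

```python
import numpy as np


def assign_clusters(points, centroids):
    """Return the index of the nearest centroid for each row of points.

    Assumes squared Euclidean distance; hypothetical helper, not a SageMaker API.
    """
    points = np.atleast_2d(np.asarray(points, dtype=np.float64))
    centroids = np.asarray(centroids, dtype=np.float64)
    # ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2; the ||x||^2 term is the same
    # for every centroid, so it can be dropped without changing the argmin.
    cross = points @ centroids.T
    centroid_norms = np.einsum("ij,ij->i", centroids, centroids)
    scores = centroid_norms[None, :] - 2.0 * cross
    return np.argmin(scores, axis=1)
```

Expanding the distance this way avoids building a `(n_points, K, n_features)` intermediate, which matters when K and the feature count are both in the thousands.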

For example, my centroid NDArray has K = 75000 and n_features = 5000, so it is 75000x5000. If I vectorize an input and calculate that it sits in cluster 5500, how would I then know what other training examples sit in there? Would I have to re-parse my entire training set to see where each example sits, create a table that holds those values, and compare back to that table when making predictions?
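
One way to picture the lookup table described above is as a mapping from cluster id to the row indices of the training examples assigned to it. A minimal sketch, assuming a sequence `labels` already exists with one cluster id per training row (produced by a single assignment pass over the training set); `build_cluster_members` is a hypothetical helper name:

```python
from collections import defaultdict


def build_cluster_members(labels):
    """Map each cluster id to the list of training-row indices assigned to it.

    labels: iterable with one cluster id per training row (assumed precomputed).
    """
    members = defaultdict(list)
    for row_index, cluster_id in enumerate(labels):
        members[int(cluster_id)].append(row_index)
    return dict(members)
```

With such a table built once and stored alongside the centroids, finding the neighbours of a new input reduces to computing its cluster id and reading `members.get(cluster_id, [])`.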

Boy T_T

@mhaboali

We have the same question here as well. We're looking forward to hearing soon how we can deploy a SageMaker-trained model onto our local machines.

Thanks!

@nadiaya
Contributor

nadiaya commented Dec 17, 2019

Hi, thank you for using SageMaker.
I have forwarded your question to the team that owns the KMeans algorithm and containers.

@anthonywebb

anthonywebb commented Mar 24, 2020

@mhaboali did you find a way to do this? I'm looking for the same thing myself and beginning to wonder if this is possible at all.

@mhaboali

@anthonywebb

Actually, I did not continue with that approach, but the comments under this issue might be useful for this purpose.

Thanks!
Mohamed

aws locked and limited conversation to collaborators May 20, 2021
