How to use Sagemaker to create an Inference Pipeline that tokenizes and converts text to word indices? #520
Hi, thanks for using Amazon SageMaker. For your use-case, the example that you looked at is indeed the right approach.

**Continue to use Spark.** To modify the same example for your use-case, take a look at the notebook mentioned below. Though its task is not related to text analytics, the dataset has categorical columns and the example uses the two feature processors I mentioned above.

**Use Scikit-learn instead of Spark.** If you would rather use Scikit-learn, that works as well; there is a notebook on how to use Scikit-learn for an Inference Pipeline. And you can indeed use Scikit-learn with SageMaker TensorFlow in an Inference Pipeline setup.
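As a concrete illustration of the Scikit-learn route, here is a minimal sketch of one-hot encoding categorical columns, the kind of feature processing the Spark example performs. The column values and sample rows are invented for this example:

```python
# Hedged sketch: one-hot encoding categorical features with scikit-learn,
# as a stand-in for the feature processors used in the Spark notebook.
# The sample rows are invented for illustration.
from sklearn.preprocessing import OneHotEncoder

rows = [["red", "small"], ["blue", "large"], ["red", "large"]]

# The default output is a scipy sparse matrix, which is what a
# downstream estimator in the pipeline would typically consume.
enc = OneHotEncoder(handle_unknown="ignore")
X = enc.fit_transform(rows)

print(X.shape)  # one row per input, one column per observed category
```

In an Inference Pipeline, a fitted encoder like this would live in the preprocessing container, with the trained model in the next container consuming its output.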
Closing due to inactivity. Feel free to reopen if necessary.
In my use case, I have to feed a sparse matrix to an algorithm that is already deployed. I used a one-hot encoder to convert an alphanumeric string into a sparse matrix when training the model, but once the model is deployed I can't use that one-hot encoder inside it. So I tried to use a pipeline to process the input before feeding it to the deployed model, which means deploying another model whose endpoint acts as a feed for the first one. How should I achieve this?

I have already tried to build a preprocessing model, but after deploying it I can't get the desired result: I need the preprocessing model to return a sparse matrix, but instead I get error 500:

> Internal Server Error
> The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

Is there another way to tackle this?
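One common cause of a 500 from a preprocessing endpoint is the response serialization step: a scipy sparse matrix is not JSON-serializable, so returning it directly from the container's output handler fails. A hedged sketch of serializing the CSR components explicitly instead (the round-trip shown is plain scipy + json; whether this matches the deployed container's content-type handling is an assumption):

```python
# Sketch: serializing a scipy CSR matrix to JSON so it survives the trip
# between the preprocessing endpoint and the downstream model.
import json
import numpy as np
from scipy import sparse

def sparse_to_json(m):
    # CSR exposes plain arrays (data/indices/indptr) that json can handle.
    csr = m.tocsr()
    return json.dumps({
        "data": csr.data.tolist(),
        "indices": csr.indices.tolist(),
        "indptr": csr.indptr.tolist(),
        "shape": list(csr.shape),
    })

def json_to_sparse(payload):
    # Rebuild the identical CSR matrix on the receiving side.
    d = json.loads(payload)
    return sparse.csr_matrix(
        (d["data"], d["indices"], d["indptr"]), shape=d["shape"]
    )

m = sparse.csr_matrix(np.array([[0, 1, 0], [2, 0, 3]]))
roundtrip = json_to_sparse(sparse_to_json(m))
```

The preprocessing container would return `sparse_to_json(...)` as its response body, and the second model's input handler would call `json_to_sparse(...)` before prediction.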
I have a SageMaker endpoint backed by TensorFlow Serving that expects a serialized TF Example as input.
Now I need to convert a string into a sequence of word indices by doing a lookup against a txt file, convert that into a serialized TF Example, and pass it to the endpoint above.
Is SageMaker the best tool for building this preprocessing step? The closest example I could find is the inference_pipeline_sparkml_blazingtext_dbpedia notebook, but it does not show how to convert the tokenized text into word indices.
I was thinking of using scikit-learn for the feature transformation, but I am unsure how to do the word-index lookup, and whether the Scikit-learn estimator in SageMaker will let me call TensorFlow functions to create the TF Example.
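For the lookup itself, nothing SageMaker-specific is required: load the vocabulary file into a dict once and map tokens to indices per request. A minimal sketch (the vocabulary contents, out-of-vocabulary convention, and whitespace tokenization are all assumptions; the tf.train.Example wrapping is shown only in comments since it depends on the serving signature):

```python
# Sketch: tokenize a string and map tokens to word indices using a
# vocabulary text file (one token per line, line number = index).
# Vocabulary contents and the OOV convention are invented for illustration.
import io

vocab_txt = "the\ncat\nsat\non\nmat\n"  # stand-in for the real vocab file
OOV_INDEX = -1  # assumption: how unknown words are encoded

def load_vocab(f):
    # Map each line's token to its 0-based line number.
    return {word: i for i, word in enumerate(f.read().splitlines())}

def to_indices(text, vocab):
    # Naive whitespace tokenization; real preprocessing may differ.
    return [vocab.get(tok, OOV_INDEX) for tok in text.lower().split()]

vocab = load_vocab(io.StringIO(vocab_txt))
indices = to_indices("The cat sat on the hat", vocab)
print(indices)  # [0, 1, 2, 3, 0, -1]

# From here, wrapping the indices into a serialized TF Example would look
# roughly like (names depend on the model's expected feature keys):
#   example = tf.train.Example(features=tf.train.Features(feature={
#       "tokens": tf.train.Feature(
#           int64_list=tf.train.Int64List(value=indices))}))
#   payload = example.SerializeToString()
# which the TensorFlow Serving endpoint can then consume.
```

Because the lookup is plain Python, it can run inside the Scikit-learn container's inference script; only the final Example construction needs TensorFlow available in that container.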