diff --git a/doc/algorithms/index.rst b/doc/algorithms/index.rst index 45235a3bfe..bd78267d9b 100644 --- a/doc/algorithms/index.rst +++ b/doc/algorithms/index.rst @@ -1,5 +1,5 @@ ###################### -First-Party Algorithms +Built-in Algorithms ###################### Amazon SageMaker provides implementations of some common machine learning algorithms optimized for GPU architecture and massive datasets. @@ -7,14 +7,9 @@ Amazon SageMaker provides implementations of some common machine learning algori .. toctree:: :maxdepth: 2 - sagemaker.amazon.amazon_estimator - factorization_machines - ipinsights - kmeans - knn - lda - linear_learner - ntm - object2vec - pca - randomcutforest + tabular/index + text/index + time_series/index + unsupervised/index + vision/index + other/index diff --git a/doc/algorithms/other/index.rst b/doc/algorithms/other/index.rst new file mode 100644 index 0000000000..4bd9800221 --- /dev/null +++ b/doc/algorithms/other/index.rst @@ -0,0 +1,10 @@ +###################### +Other +###################### + +:ref:`All Pre-trained Models ` + +.. toctree:: + :maxdepth: 2 + + sagemaker.amazon.amazon_estimator diff --git a/doc/algorithms/sagemaker.amazon.amazon_estimator.rst b/doc/algorithms/other/sagemaker.amazon.amazon_estimator.rst similarity index 100% rename from doc/algorithms/sagemaker.amazon.amazon_estimator.rst rename to doc/algorithms/other/sagemaker.amazon.amazon_estimator.rst diff --git a/doc/algorithms/tabular/autogluon.rst b/doc/algorithms/tabular/autogluon.rst new file mode 100644 index 0000000000..8eae72187e --- /dev/null +++ b/doc/algorithms/tabular/autogluon.rst @@ -0,0 +1,28 @@ +############ +AutoGluon +############ + +`AutoGluon-Tabular `__ is a popular open-source AutoML framework that trains highly accurate machine learning models on an unprocessed tabular dataset. +Unlike existing AutoML frameworks that primarily focus on model and hyperparameter selection, AutoGluon-Tabular succeeds by ensembling multiple models and stacking them in multiple layers. + + +The following table outlines a variety of sample notebooks that address different use cases of Amazon SageMaker AutoGluon-Tabular algorithm. + +.. list-table:: + :widths: 25 25 + :header-rows: 1 + + * - Notebook Title + - Description + * - `Tabular classification with Amazon SageMaker AutoGluon-Tabular algorithm `__ + - This notebook demonstrates the use of the Amazon SageMaker AutoGluon-Tabular algorithm to train and host a tabular classification model. + * - `Tabular regression with Amazon SageMaker AutoGluon-Tabular algorithm `__ + - This notebook demonstrates the use of the Amazon SageMaker AutoGluon-Tabular algorithm to train and host a tabular regression model. + + +For instructions on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see +`Use Amazon SageMaker Notebook Instances `__. After you have created a notebook +instance and opened it, choose the SageMaker Examples tab to see a list of all of the SageMaker samples. To open a notebook, choose its +Use tab and choose Create copy. + +For detailed documentation, please refer to the `Sagemaker AutoGluon-Tabular Algorithm `__. 
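+
+A minimal sketch of how a training job for this algorithm can be launched with the SageMaker Python SDK is shown below. The ``model_id``, IAM role, S3 paths, and entry-point script name are illustrative placeholders only; look up the exact model ID in the pre-trained model table and substitute values for your account.
+
+.. code-block:: python
+
+    from sagemaker import image_uris, model_uris, script_uris
+    from sagemaker.estimator import Estimator
+    from sagemaker.session import Session
+
+    # Placeholder values -- substitute your own model ID, role, and S3 locations.
+    model_id, model_version = "autogluon-classification-ensemble", "*"
+    role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"
+    instance_type = "ml.p3.2xlarge"
+    region = Session().boto_region_name
+
+    # Retrieve the training image, training script bundle, and algorithm artifact for the model.
+    train_image_uri = image_uris.retrieve(
+        region=region, framework=None, model_id=model_id, model_version=model_version,
+        image_scope="training", instance_type=instance_type,
+    )
+    train_source_uri = script_uris.retrieve(
+        model_id=model_id, model_version=model_version, script_scope="training"
+    )
+    train_model_uri = model_uris.retrieve(
+        model_id=model_id, model_version=model_version, model_scope="training"
+    )
+
+    estimator = Estimator(
+        image_uri=train_image_uri,
+        source_dir=train_source_uri,
+        model_uri=train_model_uri,
+        entry_point="transfer_learning.py",  # script name assumed; see the retrieved source bundle
+        role=role,
+        instance_count=1,
+        instance_type=instance_type,
+        output_path="s3://my-bucket/autogluon/output",  # placeholder
+    )
+    estimator.fit({"training": "s3://my-bucket/autogluon/train"})  # placeholder channel data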
diff --git a/doc/algorithms/tabular/catboost.rst b/doc/algorithms/tabular/catboost.rst new file mode 100644 index 0000000000..e7c72aa5c4 --- /dev/null +++ b/doc/algorithms/tabular/catboost.rst @@ -0,0 +1,37 @@ +############ +CatBoost +############ + + +`CatBoost `__ is a popular and high-performance open-source implementation of the Gradient Boosting Decision Tree (GBDT) +algorithm. GBDT is a supervised learning algorithm that attempts to accurately predict a target variable by combining an ensemble of +estimates from a set of simpler and weaker models. + +CatBoost introduces two critical algorithmic advances to GBDT: + +* The implementation of ordered boosting, a permutation-driven alternative to the classic algorithm + +* An innovative algorithm for processing categorical features + +Both techniques were created to fight a prediction shift caused by a special kind of target leakage present in all currently existing +implementations of gradient boosting algorithms. + +The following table outlines a variety of sample notebooks that address different use cases of Amazon SageMaker CatBoost algorithm. + +.. list-table:: + :widths: 25 25 + :header-rows: 1 + + * - Notebook Title + - Description + * - `Tabular classification with Amazon SageMaker LightGBM and CatBoost algorithm `__ + - This notebook demonstrates the use of the Amazon SageMaker CatBoost algorithm to train and host a tabular classification model. + * - `Tabular regression with Amazon SageMaker LightGBM and CatBoost algorithm `__ + - This notebook demonstrates the use of the Amazon SageMaker CatBoost algorithm to train and host a tabular regression model. + +For instructions on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see +`Use Amazon SageMaker Notebook Instances `__. After you have created a notebook +instance and opened it, choose the SageMaker Examples tab to see a list of all of the SageMaker samples. To open a notebook, choose its +Use tab and choose Create copy. + +For detailed documentation, please refer to the `Sagemaker CatBoost Algorithm `__. diff --git a/doc/algorithms/factorization_machines.rst b/doc/algorithms/tabular/factorization_machines.rst similarity index 97% rename from doc/algorithms/factorization_machines.rst rename to doc/algorithms/tabular/factorization_machines.rst index e6a509d167..77997702bf 100644 --- a/doc/algorithms/factorization_machines.rst +++ b/doc/algorithms/tabular/factorization_machines.rst @@ -1,4 +1,4 @@ -FactorizationMachines +Factorization Machines ------------------------- The Amazon SageMaker Factorization Machines algorithm. diff --git a/doc/algorithms/tabular/index.rst b/doc/algorithms/tabular/index.rst new file mode 100644 index 0000000000..029437fb39 --- /dev/null +++ b/doc/algorithms/tabular/index.rst @@ -0,0 +1,18 @@ +###################### +Tabular +###################### + +Amazon SageMaker provides built-in algorithms that are tailored to the analysis of tabular data. The built-in SageMaker algorithms for tabular data can be used for either classification or regression problems. + +.. 
toctree:: + :maxdepth: 2 + + autogluon + catboost + factorization_machines + knn + lightgbm + linear_learner + tabtransformer + xgboost + object2vec diff --git a/doc/algorithms/knn.rst b/doc/algorithms/tabular/knn.rst similarity index 100% rename from doc/algorithms/knn.rst rename to doc/algorithms/tabular/knn.rst diff --git a/doc/algorithms/tabular/lightgbm.rst b/doc/algorithms/tabular/lightgbm.rst new file mode 100644 index 0000000000..176b10cdba --- /dev/null +++ b/doc/algorithms/tabular/lightgbm.rst @@ -0,0 +1,28 @@ +############ +LightGBM +############ + +`LightGBM `__ is a popular and efficient open-source implementation of the Gradient Boosting +Decision Tree (GBDT) algorithm. GBDT is a supervised learning algorithm that attempts to accurately predict a target variable by +combining an ensemble of estimates from a set of simpler and weaker models. LightGBM uses additional techniques to significantly improve +the efficiency and scalability of conventional GBDT. + +The following table outlines a variety of sample notebooks that address different use cases of Amazon SageMaker LightGBM algorithm. + +.. list-table:: + :widths: 25 25 + :header-rows: 1 + + * - Notebook Title + - Description + * - `Tabular classification with Amazon SageMaker LightGBM and CatBoost algorithm `__ + - This notebook demonstrates the use of the Amazon SageMaker LightGBM algorithm to train and host a tabular classification model. + * - `Tabular regression with Amazon SageMaker LightGBM and CatBoost algorithm `__ + - This notebook demonstrates the use of the Amazon SageMaker LightGBM algorithm to train and host a tabular regression model. + +For instructions on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see +`Use Amazon SageMaker Notebook Instances `__. After you have created a notebook +instance and opened it, choose the SageMaker Examples tab to see a list of all of the SageMaker samples. To open a notebook, choose its +Use tab and choose Create copy. + +For detailed documentation, please refer to the `Sagemaker LightGBM Algorithm `__. diff --git a/doc/algorithms/linear_learner.rst b/doc/algorithms/tabular/linear_learner.rst similarity index 100% rename from doc/algorithms/linear_learner.rst rename to doc/algorithms/tabular/linear_learner.rst diff --git a/doc/algorithms/object2vec.rst b/doc/algorithms/tabular/object2vec.rst similarity index 100% rename from doc/algorithms/object2vec.rst rename to doc/algorithms/tabular/object2vec.rst diff --git a/doc/algorithms/tabular/tabtransformer.rst b/doc/algorithms/tabular/tabtransformer.rst new file mode 100644 index 0000000000..facebfcd83 --- /dev/null +++ b/doc/algorithms/tabular/tabtransformer.rst @@ -0,0 +1,28 @@ +############### +TabTransformer +############### + +`TabTransformer `__ is a novel deep tabular data modeling architecture for supervised learning. The TabTransformer architecture is built on self-attention-based Transformers. +The Transformer layers transform the embeddings of categorical features into robust contextual embeddings to achieve higher prediction accuracy. Furthermore, the contextual embeddings learned from TabTransformer +are highly robust against both missing and noisy data features, and provide better interpretability. + + +The following table outlines a variety of sample notebooks that address different use cases of Amazon SageMaker TabTransformer algorithm. + +.. 
list-table:: + :widths: 25 25 + :header-rows: 1 + + * - Notebook Title + - Description + * - `Tabular classification with Amazon SageMaker TabTransformer algorithm `__ + - This notebook demonstrates the use of the Amazon SageMaker TabTransformer algorithm to train and host a tabular classification model. + * - `Tabular regression with Amazon SageMaker TabTransformer algorithm `__ + - This notebook demonstrates the use of the Amazon SageMaker TabTransformer algorithm to train and host a tabular regression model. + +For instructions on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see +`Use Amazon SageMaker Notebook Instances `__. After you have created a notebook +instance and opened it, choose the SageMaker Examples tab to see a list of all of the SageMaker samples. To open a notebook, choose its +Use tab and choose Create copy. + +For detailed documentation, please refer to the `SageMaker TabTransformer Algorithm `__. diff --git a/doc/algorithms/tabular/xgboost.rst b/doc/algorithms/tabular/xgboost.rst new file mode 100644 index 0000000000..829af00ac5 --- /dev/null +++ b/doc/algorithms/tabular/xgboost.rst @@ -0,0 +1,40 @@ +############ +XGBoost +############ + +`XGBoost `__ (eXtreme Gradient Boosting) is a popular and efficient open-source implementation of the gradient boosted trees algorithm. Gradient boosting is a supervised learning algorithm that attempts to accurately predict a target variable +by combining an ensemble of estimates from a set of simpler and weaker models. The XGBoost algorithm performs well in machine learning competitions because of its robust handling of a variety of data types, relationships, distributions, and the variety of hyperparameters that you can +fine-tune. You can use XGBoost for regression, classification (binary and multiclass), and ranking problems. + +You can use the new release of the XGBoost algorithm either as an Amazon SageMaker built-in algorithm or as a framework to run training scripts in your local environment. This implementation has a smaller memory footprint, better logging, improved hyperparameter validation, and +an expanded set of metrics compared with the original versions. It provides an XGBoost estimator that executes a training script in a managed XGBoost environment. The current release of SageMaker XGBoost is based on the original XGBoost versions 1.0, 1.2, 1.3, and 1.5. + +The following table outlines a variety of sample notebooks that address different use cases of the Amazon SageMaker XGBoost algorithm. + +.. list-table:: + :widths: 25 25 + :header-rows: 1 + + * - Notebook Title + - Description + * - `How to Create a Custom XGBoost container? `__ + - This notebook shows you how to build a custom XGBoost container with Amazon SageMaker Batch Transform. + * - `Regression with XGBoost using Parquet `__ + - This notebook shows you how to use the Abalone dataset in Parquet to train an XGBoost model. + * - `How to Train and Host a Multiclass Classification Model? `__ + - This notebook shows how to use the MNIST dataset to train and host a multiclass classification model. + * - `How to train a Model for Customer Churn Prediction? `__ + - This notebook shows you how to train a model to predict mobile customer departure in an effort to identify unhappy customers. + * - `An Introduction to Amazon SageMaker Managed Spot infrastructure for XGBoost Training `__ + - This notebook shows you how to use Spot Instances for training with an XGBoost container.
+ * - `How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs? `__ + - This notebook shows you how to use Amazon SageMaker Debugger to monitor training jobs to detect inconsistencies. + * - `How to use Amazon SageMaker Debugger to debug XGBoost Training Jobs in Real-Time? `__ + - This notebook shows you how to use the MNIST dataset and Amazon SageMaker Debugger to perform real-time analysis of XGBoost training jobs while training jobs are running. + +For instructions on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see +`Use Amazon SageMaker Notebook Instances `__. After you have created a notebook +instance and opened it, choose the SageMaker Examples tab to see a list of all of the SageMaker samples. To open a notebook, choose its +Use tab and choose Create copy. + +For detailed documentation, please refer to the `SageMaker XGBoost Algorithm `__. diff --git a/doc/algorithms/text/blazing_text.rst b/doc/algorithms/text/blazing_text.rst new file mode 100644 index 0000000000..e42f4a0cc2 --- /dev/null +++ b/doc/algorithms/text/blazing_text.rst @@ -0,0 +1,27 @@ +############# +BlazingText +############# + + +The Amazon SageMaker BlazingText algorithm provides highly optimized implementations of the Word2vec and text classification algorithms. The Word2vec algorithm is useful for many downstream natural language processing (NLP) +tasks, such as sentiment analysis, named entity recognition, and machine translation. Text classification is an important task for applications that perform web searches, information retrieval, ranking, and document classification. + +The Word2vec algorithm maps words to high-quality distributed vectors. The resulting vector representation of a word is called a word embedding. Words that are semantically similar correspond to vectors that are close together. +That way, word embeddings capture the semantic relationships between words. + +Many natural language processing (NLP) applications learn word embeddings by training on large collections of documents. These pretrained vector representations provide information about semantics and word distributions that +typically improves the generalizability of other models that are later trained on a more limited amount of data. Most implementations of the Word2vec algorithm are not optimized for multi-core CPU architectures. This makes it +difficult to scale to large datasets. + +With the BlazingText algorithm, you can scale to large datasets easily. Similar to Word2vec, it provides the Skip-gram and continuous bag-of-words (CBOW) training architectures. BlazingText's implementation of the supervised +multi-class, multi-label text classification algorithm extends the fastText text classifier to use GPU acceleration with custom `CUDA `__ kernels. You can train a model on more than a billion words in a couple of minutes using a multi-core CPU or a GPU, and you achieve performance on par with state-of-the-art deep learning text classification algorithms. + +The BlazingText algorithm is not parallelizable. For more information on parameters related to training, see `Docker Registry Paths for SageMaker Built-in Algorithms `__. + +For a sample notebook that uses the SageMaker BlazingText algorithm to train and deploy supervised binary and multiclass classification models, see +`Blazing Text classification on the DBPedia dataset `__.
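+
+A minimal training sketch with the SageMaker Python SDK is shown below; the IAM role, S3 paths, and hyperparameter values are placeholders for illustration only.
+
+.. code-block:: python
+
+    import sagemaker
+    from sagemaker import image_uris
+    from sagemaker.estimator import Estimator
+    from sagemaker.inputs import TrainingInput
+
+    session = sagemaker.Session()
+    role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder
+
+    # Retrieve the BlazingText container image for the session's region.
+    container = image_uris.retrieve("blazingtext", session.boto_region_name)
+
+    bt = Estimator(
+        image_uri=container,
+        role=role,
+        instance_count=1,
+        instance_type="ml.c5.4xlarge",
+        volume_size=30,
+        max_run=3600,
+        output_path="s3://my-bucket/blazingtext/output",  # placeholder
+        sagemaker_session=session,
+    )
+    # Word2vec in skip-gram mode; use mode="supervised" for text classification instead.
+    bt.set_hyperparameters(mode="skipgram", epochs=5, min_count=5, vector_dim=100)
+
+    bt.fit({"train": TrainingInput("s3://my-bucket/blazingtext/train",  # placeholder
+                                   content_type="text/plain")})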
+For instructions on creating and accessing Jupyter notebook instances that you can use to run the example in SageMaker, see `Use Amazon SageMaker Notebook Instances `__. +After creating and opening a notebook instance, choose the SageMaker Examples tab to see a list of all the SageMaker examples. The example notebooks that use BlazingText are located in the Introduction to Amazon +algorithms section. To open a notebook, choose its Use tab, then choose Create copy. diff --git a/doc/algorithms/text/index.rst b/doc/algorithms/text/index.rst new file mode 100644 index 0000000000..a24288fdc7 --- /dev/null +++ b/doc/algorithms/text/index.rst @@ -0,0 +1,22 @@ +###################### +Text +###################### + +Amazon SageMaker provides algorithms that are tailored to the analysis of textual documents used in natural language processing, document classification or summarization, topic modeling or classification, and language transcription or translation. + +.. toctree:: + :maxdepth: 2 + + blazing_text + lda + ntm + sequence_to_sequence + text_classification_tensorflow + sentence_pair_classification_tensorflow + sentence_pair_classification_hugging_face + question_answering_pytorch + named_entity_recognition_hugging_face + text_summarization_hugging_face + text_generation_hugging_face + machine_translation_hugging_face + text_embedding_tensorflow_mxnet diff --git a/doc/algorithms/lda.rst b/doc/algorithms/text/lda.rst similarity index 100% rename from doc/algorithms/lda.rst rename to doc/algorithms/text/lda.rst diff --git a/doc/algorithms/text/machine_translation_hugging_face.rst b/doc/algorithms/text/machine_translation_hugging_face.rst new file mode 100644 index 0000000000..d533d0e64d --- /dev/null +++ b/doc/algorithms/text/machine_translation_hugging_face.rst @@ -0,0 +1,10 @@ +##################################### +Machine Translation - HuggingFace +##################################### + + +This is a supervised machine translation algorithm which supports many pre-trained models available in Hugging Face. The following +`sample notebook `__ +demonstrates how to use the SageMaker Python SDK for Machine Translation with these algorithms. + +For detailed documentation, please refer to :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK `. diff --git a/doc/algorithms/text/named_entity_recognition_hugging_face.rst b/doc/algorithms/text/named_entity_recognition_hugging_face.rst new file mode 100644 index 0000000000..fc0fbd212c --- /dev/null +++ b/doc/algorithms/text/named_entity_recognition_hugging_face.rst @@ -0,0 +1,10 @@ +######################################## +Named Entity Recognition - HuggingFace +######################################## + +This is a supervised named entity recognition algorithm which supports fine-tuning of many pre-trained models available in Hugging Face. The following +`sample notebook `__ +demonstrates how to use the SageMaker Python SDK for Named Entity Recognition with these algorithms.
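+
+A compact deployment sketch with the SageMaker Python SDK follows; the ``model_id``, IAM role, instance type, and request content type are assumptions for illustration, so check the pre-trained model table and the sample notebook for the exact values.
+
+.. code-block:: python
+
+    from sagemaker import image_uris, model_uris
+    from sagemaker.model import Model
+    from sagemaker.predictor import Predictor
+    from sagemaker.session import Session
+
+    # Placeholder values -- substitute the model ID from the pre-trained model table.
+    model_id, model_version = "huggingface-ner-distilbert-base-cased-finetuned-conll03-english", "*"
+    role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"
+    instance_type = "ml.m5.xlarge"
+    region = Session().boto_region_name
+
+    # Retrieve the hosting image and the pre-trained model artifact, then deploy an endpoint.
+    deploy_image_uri = image_uris.retrieve(
+        region=region, framework=None, model_id=model_id, model_version=model_version,
+        image_scope="inference", instance_type=instance_type,
+    )
+    model_data = model_uris.retrieve(
+        model_id=model_id, model_version=model_version, model_scope="inference"
+    )
+
+    model = Model(image_uri=deploy_image_uri, model_data=model_data, role=role,
+                  predictor_cls=Predictor)
+    predictor = model.deploy(initial_instance_count=1, instance_type=instance_type)
+
+    text = "Amazon SageMaker is a managed machine learning service."
+    response = predictor.predict(text.encode("utf-8"),
+                                 {"ContentType": "application/x-text",  # assumed content type
+                                  "Accept": "application/json"})
+    predictor.delete_endpoint()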
+ +For detailed documentation please refer `Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK `__ + diff --git a/doc/algorithms/ntm.rst b/doc/algorithms/text/ntm.rst similarity index 100% rename from doc/algorithms/ntm.rst rename to doc/algorithms/text/ntm.rst diff --git a/doc/algorithms/text/question_answering_pytorch.rst b/doc/algorithms/text/question_answering_pytorch.rst new file mode 100644 index 0000000000..9d9d74ccb1 --- /dev/null +++ b/doc/algorithms/text/question_answering_pytorch.rst @@ -0,0 +1,9 @@ +##################################### +Question Answering - PyTorch +##################################### + +This is a supervised question answering algorithm which supports fine-tuning of many pre-trained models available in Hugging Face. The following +`sample notebook `__ +demonstrates how to use the Sagemaker Python SDK for Question Answering for using these algorithms. + +For detailed documentation please refer :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/text/sentence_pair_classification_hugging_face.rst b/doc/algorithms/text/sentence_pair_classification_hugging_face.rst new file mode 100644 index 0000000000..2892b9d516 --- /dev/null +++ b/doc/algorithms/text/sentence_pair_classification_hugging_face.rst @@ -0,0 +1,9 @@ +############################################ +Sentence Pair Classification - HuggingFace +############################################ + +This is a supervised sentence pair classification algorithm which supports fine-tuning of many pre-trained models available in Hugging Face. The following +`sample notebook `__ +demonstrates how to use the Sagemaker Python SDK for Sentence Pair Classification for using these algorithms. + +For detailed documentation please refer `Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK `__ diff --git a/doc/algorithms/text/sentence_pair_classification_tensorflow.rst b/doc/algorithms/text/sentence_pair_classification_tensorflow.rst new file mode 100644 index 0000000000..80264e84f3 --- /dev/null +++ b/doc/algorithms/text/sentence_pair_classification_tensorflow.rst @@ -0,0 +1,9 @@ +############################################ +Sentence Pair Classification - TensorFlow +############################################ + +This is a supervised sentence pair classification algorithm which supports fine-tuning of many pre-trained models available in Tensorflow Hub. The following +`sample notebook `__ +demonstrates how to use the Sagemaker Python SDK for Sentence Pair Classification for using these algorithms. + +For detailed documentation please refer `Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK `__ diff --git a/doc/algorithms/text/sequence_to_sequence.rst b/doc/algorithms/text/sequence_to_sequence.rst new file mode 100644 index 0000000000..00d9302a01 --- /dev/null +++ b/doc/algorithms/text/sequence_to_sequence.rst @@ -0,0 +1,14 @@ +####################### +Sequence-to-Sequence +####################### + +Amazon SageMaker Sequence to Sequence is a supervised learning algorithm where the input is a sequence of tokens (for example, text, audio) and the output generated is another sequence of tokens. 
Example applications include: machine +translation (input a sentence from one language and predict what that sentence would be in another language), text summarization (input a longer string of words and predict a shorter string of words that is a summary), speech-to-text +(audio clips converted into output sentences in tokens). Recently, problems in this domain have been successfully modeled with deep neural networks that show a significant performance boost over previous methodologies. Amazon SageMaker +seq2seq uses Recurrent Neural Networks (RNNs) and Convolutional Neural Network (CNN) models with attention as encoder-decoder architectures. + +For a sample notebook that shows how to use the SageMaker Sequence to Sequence algorithm to train an English-German translation model, see +`Machine Translation English-German Example Using SageMaker Seq2Seq `__. +For instructions on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see `Use Amazon SageMaker Notebook Instances `__. Once you have +created a notebook instance and opened it, select the SageMaker Examples tab to see a list of all the SageMaker samples. The example notebooks that use the Sequence to Sequence algorithm are located in the Introduction to Amazon algorithms section. +To open a notebook, click on its Use tab and select Create copy. diff --git a/doc/algorithms/text/text_classification_tensorflow.rst b/doc/algorithms/text/text_classification_tensorflow.rst new file mode 100644 index 0000000000..c60a5b3e1c --- /dev/null +++ b/doc/algorithms/text/text_classification_tensorflow.rst @@ -0,0 +1,9 @@ +################################## +Text Classification - TensorFlow +################################## + +This is a supervised text classification algorithm which supports fine-tuning of many pre-trained models available in TensorFlow Hub. The following +`sample notebook `__ +demonstrates how to use the SageMaker Python SDK for Text Classification with these algorithms. + +For detailed documentation, please refer to :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/text/text_embedding_tensorflow_mxnet.rst b/doc/algorithms/text/text_embedding_tensorflow_mxnet.rst new file mode 100644 index 0000000000..d015c2ef30 --- /dev/null +++ b/doc/algorithms/text/text_embedding_tensorflow_mxnet.rst @@ -0,0 +1,9 @@ +#################################### +Text Embedding - TensorFlow, MXNet +#################################### + +This is a supervised text embedding algorithm which supports many pre-trained models available in MXNet and TensorFlow Hub. The following +`sample notebook `__ +demonstrates how to use the SageMaker Python SDK for Text Embedding with these algorithms. + +For detailed documentation, please refer to :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/text/text_generation_hugging_face.rst b/doc/algorithms/text/text_generation_hugging_face.rst new file mode 100644 index 0000000000..30fae26196 --- /dev/null +++ b/doc/algorithms/text/text_generation_hugging_face.rst @@ -0,0 +1,9 @@ +############################################ +Text Generation - HuggingFace +############################################ + +This is a supervised text generation algorithm which supports many pre-trained models available in Hugging Face. The following +`sample notebook `__ +demonstrates how to use the SageMaker Python SDK for Text Generation with these algorithms.
+ +For detailed documentation please refer :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/text/text_summarization_hugging_face.rst b/doc/algorithms/text/text_summarization_hugging_face.rst new file mode 100644 index 0000000000..206c880ba3 --- /dev/null +++ b/doc/algorithms/text/text_summarization_hugging_face.rst @@ -0,0 +1,9 @@ +############################################ +Text Summarization - HuggingFace +############################################ + +This is a supervised text summarization algorithm which supports many pre-trained models available in Hugging Face. The following +`sample notebook `__ +demonstrates how to use the Sagemaker Python SDK for Text Summarization for using these algorithms. + +For detailed documentation please refer :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/time_series/deep_ar.rst b/doc/algorithms/time_series/deep_ar.rst new file mode 100644 index 0000000000..c373cb7405 --- /dev/null +++ b/doc/algorithms/time_series/deep_ar.rst @@ -0,0 +1,11 @@ +################################## +Deep AR Forecasting +################################## + +The Amazon SageMaker DeepAR forecasting algorithm is a supervised learning algorithm for forecasting scalar (one-dimensional) time series using recurrent neural networks (RNN). Classical forecasting methods, such as autoregressive integrated moving average (ARIMA) or exponential smoothing (ETS), fit a single model to each individual time series. They then use that model to extrapolate the time series into the future. + +In many applications, however, you have many similar time series across a set of cross-sectional units. For example, you might have time series groupings for demand for different products, server loads, and requests for webpages. For this type of application, you can benefit from training a single model jointly over all of the time series. DeepAR takes this approach. When your dataset contains hundreds of related time series, DeepAR outperforms the standard ARIMA and ETS methods. You can also use the trained model to generate forecasts for new time series that are similar to the ones it has been trained on. + +The training input for the DeepAR algorithm is one or, preferably, more target time series that have been generated by the same process or similar processes. Based on this input dataset, the algorithm trains a model that learns an approximation of this process/processes and uses it to predict how the target time series evolves. Each target time series can be optionally associated with a vector of static (time-independent) categorical features provided by the cat field and a vector of dynamic (time-dependent) time series provided by the dynamic_feat field. SageMaker trains the DeepAR model by randomly sampling training examples from each target time series in the training dataset. Each training example consists of a pair of adjacent context and prediction windows with fixed predefined lengths. To control how far in the past the network can see, use the context_length hyperparameter. To control how far in the future predictions can be made, use the prediction_length hyperparameter. For more information, see `How the DeepAR Algorithm Works `__. 
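+
+The following minimal sketch shows how a DeepAR training job can be configured with the SageMaker Python SDK; the IAM role, S3 paths, and hyperparameter values are placeholders.
+
+.. code-block:: python
+
+    import sagemaker
+    from sagemaker import image_uris
+    from sagemaker.estimator import Estimator
+
+    session = sagemaker.Session()
+    role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder
+
+    # Retrieve the DeepAR container image for the session's region.
+    container = image_uris.retrieve("forecasting-deepar", session.boto_region_name)
+
+    deepar = Estimator(
+        image_uri=container,
+        role=role,
+        instance_count=1,
+        instance_type="ml.c5.2xlarge",
+        output_path="s3://my-bucket/deepar/output",  # placeholder
+        sagemaker_session=session,
+    )
+    deepar.set_hyperparameters(
+        time_freq="H",            # hourly series
+        context_length="72",      # how far in the past the network can see
+        prediction_length="24",   # how far in the future to forecast
+        epochs="100",
+    )
+
+    # Each channel holds JSON Lines files with one {"start": ..., "target": [...]} object per series.
+    deepar.fit({
+        "train": "s3://my-bucket/deepar/train/",  # placeholder
+        "test": "s3://my-bucket/deepar/test/",    # placeholder
+    })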
+ +For a sample notebook that shows how to prepare a time series dataset for training the SageMaker DeepAR algorithm and how to deploy the trained model for performing inferences, see `Time series forecasting with DeepAR - Synthetic data `__ as well as `DeepAR demo on electricity dataset `__, which illustrates the advanced features of DeepAR on a real-world dataset. For instructions on creating and accessing Jupyter notebook instances that you can use to run the example in SageMaker, see `Use Amazon SageMaker Notebook Instances `__. After creating and opening a notebook instance, choose the SageMaker Examples tab to see a list of all of the SageMaker examples. To open a notebook, choose its Use tab, and choose Create copy. diff --git a/doc/algorithms/time_series/index.rst b/doc/algorithms/time_series/index.rst new file mode 100644 index 0000000000..05d2464ccd --- /dev/null +++ b/doc/algorithms/time_series/index.rst @@ -0,0 +1,10 @@ +###################### +Time-series +###################### + +Amazon SageMaker provides built-in algorithms that are tailored to the analysis of time-series data for forecasting, such as predicting product demand, server loads, and webpage requests. + +.. toctree:: + :maxdepth: 2 + + deep_ar diff --git a/doc/algorithms/unsupervised/index.rst b/doc/algorithms/unsupervised/index.rst new file mode 100644 index 0000000000..a3e6af9801 --- /dev/null +++ b/doc/algorithms/unsupervised/index.rst @@ -0,0 +1,13 @@ +###################### +Unsupervised +###################### + +Amazon SageMaker provides several built-in algorithms that can be used for a variety of unsupervised learning tasks such as clustering, dimension reduction, pattern recognition, and anomaly detection. + +.. toctree:: + :maxdepth: 2 + + ipinsights + kmeans + pca + randomcutforest diff --git a/doc/algorithms/ipinsights.rst b/doc/algorithms/unsupervised/ipinsights.rst similarity index 100% rename from doc/algorithms/ipinsights.rst rename to doc/algorithms/unsupervised/ipinsights.rst diff --git a/doc/algorithms/kmeans.rst b/doc/algorithms/unsupervised/kmeans.rst similarity index 100% rename from doc/algorithms/kmeans.rst rename to doc/algorithms/unsupervised/kmeans.rst diff --git a/doc/algorithms/pca.rst b/doc/algorithms/unsupervised/pca.rst similarity index 100% rename from doc/algorithms/pca.rst rename to doc/algorithms/unsupervised/pca.rst diff --git a/doc/algorithms/randomcutforest.rst b/doc/algorithms/unsupervised/randomcutforest.rst similarity index 100% rename from doc/algorithms/randomcutforest.rst rename to doc/algorithms/unsupervised/randomcutforest.rst diff --git a/doc/algorithms/vision/image_classification_mxnet.rst b/doc/algorithms/vision/image_classification_mxnet.rst new file mode 100644 index 0000000000..1550a6026c --- /dev/null +++ b/doc/algorithms/vision/image_classification_mxnet.rst @@ -0,0 +1,18 @@ +############################# +Image Classification - MXNet +############################# + +The Amazon SageMaker image classification algorithm is a supervised learning algorithm that supports multi-label classification. It takes an image as input and outputs one or more labels assigned to that image. +It uses a convolutional neural network that can be trained from scratch or trained using transfer learning when a large number of training images are not available. + +The recommended input format for the Amazon SageMaker image classification algorithm is Apache MXNet `RecordIO `__.
+However, you can also use raw images in .jpg or .png format. Refer to `this discussion `__ for a broad overview of efficient +data preparation and loading for machine learning systems. + +For a sample notebook that uses the SageMaker image classification algorithm to train a model on the Caltech-256 dataset and then to deploy it to perform inferences, see the +`End-to-End Multiclass Image Classification Example `__. +For instructions on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see `Use Amazon SageMaker Notebook Instances `__. +Once you have created a notebook instance and opened it, select the SageMaker Examples tab to see a list of all the SageMaker samples. The example image classification notebooks are located in the Introduction to Amazon +algorithms section. To open a notebook, click on its Use tab and select Create copy. + +For detailed documentation, please refer to the `SageMaker Image Classification Algorithm `__. diff --git a/doc/algorithms/vision/image_classification_pytorch.rst b/doc/algorithms/vision/image_classification_pytorch.rst new file mode 100644 index 0000000000..3c154c6cfe --- /dev/null +++ b/doc/algorithms/vision/image_classification_pytorch.rst @@ -0,0 +1,9 @@ +############################### +Image Classification - PyTorch +############################### + +This is a supervised image classification algorithm which supports fine-tuning of many pre-trained models available in PyTorch Hub. The following +`sample notebook `__ +demonstrates how to use the SageMaker Python SDK for Image Classification with these algorithms. + +For detailed documentation, please refer to :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/vision/image_classification_tensorflow.rst b/doc/algorithms/vision/image_classification_tensorflow.rst new file mode 100644 index 0000000000..e49820ee50 --- /dev/null +++ b/doc/algorithms/vision/image_classification_tensorflow.rst @@ -0,0 +1,9 @@ +################################## +Image Classification - TensorFlow +################################## + +This is a supervised image classification algorithm which supports fine-tuning of many pre-trained models available in TensorFlow Hub. The following +`sample notebook `__ +demonstrates how to use the SageMaker Python SDK for Image Classification with these algorithms. + +For detailed documentation, please refer to :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/vision/image_embedding_tensorflow.rst b/doc/algorithms/vision/image_embedding_tensorflow.rst new file mode 100644 index 0000000000..0938377354 --- /dev/null +++ b/doc/algorithms/vision/image_embedding_tensorflow.rst @@ -0,0 +1,9 @@ +############################# +Image Embedding - TensorFlow +############################# + +This is a supervised image embedding algorithm which supports many pre-trained models available in TensorFlow Hub. The following +`sample notebook `__ +demonstrates how to use the SageMaker Python SDK for Image Embedding with these algorithms.
+ +For detailed documentation please refer :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/vision/index.rst b/doc/algorithms/vision/index.rst new file mode 100644 index 0000000000..50af5003b1 --- /dev/null +++ b/doc/algorithms/vision/index.rst @@ -0,0 +1,20 @@ +###################### +Vision +###################### + +Amazon SageMaker provides image processing algorithms that are used for image classification, object detection, and computer vision. + +.. toctree:: + :maxdepth: 2 + + image_classification_mxnet + image_classification_pytorch + image_classification_tensorflow + object_detection_mxnet_gluoncv + object_detection_mxnet + object_detection_pytorch + object_detection_tensorflow + semantic_segmentation_mxnet_gluoncv + semantic_segmentation_mxnet + instance_segmentation_mxnet + image_embedding_tensorflow diff --git a/doc/algorithms/vision/instance_segmentation_mxnet.rst b/doc/algorithms/vision/instance_segmentation_mxnet.rst new file mode 100644 index 0000000000..a38611bc9a --- /dev/null +++ b/doc/algorithms/vision/instance_segmentation_mxnet.rst @@ -0,0 +1,9 @@ +############################## +Instance Segmentation - MXNet +############################## + +This is a supervised image segmentation algorithm which supports many pre-trained models available in MXNet. The following +`sample notebook `__ +demonstrates how to use the Sagemaker Python SDK for Image Segmentation for using these algorithms. + +For detailed documentation please refer :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/vision/object_detection_mxnet.rst b/doc/algorithms/vision/object_detection_mxnet.rst new file mode 100644 index 0000000000..9ce52f992b --- /dev/null +++ b/doc/algorithms/vision/object_detection_mxnet.rst @@ -0,0 +1,9 @@ +########################## +Object Detection - MxNet +########################## + +This is a supervised object detection algorithm which supports fine-tuning of many pre-trained models available in MXNet. The following +`sample notebook `__ +demonstrates how to use the Sagemaker Python SDK for Object Detection for using these algorithms. + +For detailed documentation please refer :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/vision/object_detection_mxnet_gluoncv.rst b/doc/algorithms/vision/object_detection_mxnet_gluoncv.rst new file mode 100644 index 0000000000..857360b68e --- /dev/null +++ b/doc/algorithms/vision/object_detection_mxnet_gluoncv.rst @@ -0,0 +1,19 @@ +################################## +Object Detection - MxNet GluonCV +################################## + + +The Amazon SageMaker Object Detection algorithm detects and classifies objects in images using a single deep neural network. +It is a supervised learning algorithm that takes images as input and identifies all instances of objects within the image scene. +The object is categorized into one of the classes in a specified collection with a confidence score that it belongs to the class. +Its location and scale in the image are indicated by a rectangular bounding box. It uses the `Single Shot multibox Detector (SSD) `__ +framework and supports two base networks: `VGG `__ and `ResNet `__. The network can be trained from scratch, +or trained with models that have been pre-trained on the `ImageNet `__ dataset. 
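+
+A minimal configuration of the built-in Object Detection algorithm with the SageMaker Python SDK is sketched below; the IAM role, S3 paths, and hyperparameter values are placeholders for your own dataset.
+
+.. code-block:: python
+
+    import sagemaker
+    from sagemaker import image_uris
+    from sagemaker.estimator import Estimator
+    from sagemaker.inputs import TrainingInput
+
+    session = sagemaker.Session()
+    role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder
+
+    # Retrieve the built-in Object Detection container for the session's region.
+    container = image_uris.retrieve("object-detection", session.boto_region_name)
+
+    od = Estimator(
+        image_uri=container,
+        role=role,
+        instance_count=1,
+        instance_type="ml.p3.2xlarge",
+        volume_size=50,
+        max_run=7200,
+        output_path="s3://my-bucket/object-detection/output",  # placeholder
+        sagemaker_session=session,
+    )
+    od.set_hyperparameters(
+        base_network="resnet-50",      # or "vgg-16"
+        use_pretrained_model=1,        # start from ImageNet-pretrained weights
+        num_classes=20,                # placeholder for your dataset
+        num_training_samples=16000,    # placeholder
+        epochs=30,
+        mini_batch_size=32,
+    )
+
+    od.fit({
+        "train": TrainingInput("s3://my-bucket/object-detection/train",  # placeholder
+                               content_type="application/x-recordio"),
+        "validation": TrainingInput("s3://my-bucket/object-detection/validation",  # placeholder
+                                    content_type="application/x-recordio"),
+    })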
+ +For a sample notebook that shows how to use the SageMaker Object Detection algorithm to train and host a model on the `Caltech Birds (CUB 200 2011) `__ +dataset using the Single Shot multibox Detector algorithm, see `Amazon SageMaker Object Detection for Bird Species `__. +For instructions how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see `Use Amazon SageMaker Notebook Instances `__. +Once you have created a notebook instance and opened it, select the SageMaker Examples tab to see a list of all the SageMaker samples. The object detection example notebook using the Object Detection +algorithm is located in the Introduction to Amazon Algorithms section. To open a notebook, click on its Use tab and select Create copy. + +For detailed documentation, please refer to the `Sagemaker Object Detection Algorithm `__ diff --git a/doc/algorithms/vision/object_detection_pytorch.rst b/doc/algorithms/vision/object_detection_pytorch.rst new file mode 100644 index 0000000000..aa703e74b5 --- /dev/null +++ b/doc/algorithms/vision/object_detection_pytorch.rst @@ -0,0 +1,9 @@ +########################### +Object Detection - PyTorch +########################### + +This is a supervised object detection algorithm which supports fine-tuning of many pre-trained models available in Pytorch Hub. The following +`sample notebook `__ +demonstrates how to use the Sagemaker Python SDK for Object Detection for using these algorithms. + +For detailed documentation please refer :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/vision/object_detection_tensorflow.rst b/doc/algorithms/vision/object_detection_tensorflow.rst new file mode 100644 index 0000000000..2536322847 --- /dev/null +++ b/doc/algorithms/vision/object_detection_tensorflow.rst @@ -0,0 +1,9 @@ +############################### +Object Detection - TensorFlow +############################### + +This is a supervised object detection algorithm which supports fine-tuning of many pre-trained models available in Tensorflow Hub. The following +`sample notebook `__ +demonstrates how to use the Sagemaker Python SDK for Object Detection for using these algorithms. + +For detailed documentation please refer :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/vision/semantic_segmentation_mxnet.rst b/doc/algorithms/vision/semantic_segmentation_mxnet.rst new file mode 100644 index 0000000000..b0c60cd560 --- /dev/null +++ b/doc/algorithms/vision/semantic_segmentation_mxnet.rst @@ -0,0 +1,9 @@ +############################## +Semantic Segmentation - MxNet +############################## + +This is a supervised semantic segmentation algorithm which supports fine-tuning of many pre-trained models available in MXNet. The following +`sample notebook `__ +demonstrates how to use the Sagemaker Python SDK for Semantic Segmentation for using these algorithms. 
+ +For detailed documentation please refer :ref:`Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ` diff --git a/doc/algorithms/vision/semantic_segmentation_mxnet_gluoncv.rst b/doc/algorithms/vision/semantic_segmentation_mxnet_gluoncv.rst new file mode 100644 index 0000000000..53e532f6ea --- /dev/null +++ b/doc/algorithms/vision/semantic_segmentation_mxnet_gluoncv.rst @@ -0,0 +1,43 @@ +##################################### +Semantic Segmentation - MxNet GluonCV +##################################### + +The SageMaker semantic segmentation algorithm provides a fine-grained, pixel-level approach to developing computer vision applications. +It tags every pixel in an image with a class label from a predefined set of classes. Tagging is fundamental for understanding scenes, which is +critical to an increasing number of computer vision applications, such as self-driving vehicles, medical imaging diagnostics, and robot sensing. + +For comparison, the `SageMaker Image Classification Algorithm `__ is a +supervised learning algorithm that analyzes only whole images, classifying them into one of multiple output categories. The +`Object Detection Algorithm `__ is a supervised learning algorithm that detects and +classifies all instances of an object in an image. It indicates the location and scale of each object in the image with a rectangular bounding box. + +Because the semantic segmentation algorithm classifies every pixel in an image, it also provides information about the shapes of the objects contained in the image. +The segmentation output is represented as a grayscale image, called a segmentation mask. A segmentation mask is a grayscale image with the same shape as the input image. + +The SageMaker semantic segmentation algorithm is built using the `MXNet Gluon framework and the Gluon CV toolkit `__ +. It provides you with a choice of three built-in algorithms to train a deep neural network. You can use the `Fully-Convolutional Network (FCN) algorithm `__ , +`Pyramid Scene Parsing (PSP) algorithm `__, or `DeepLabV3 `__. + + +Each of the three algorithms has two distinct components: + +* The backbone (or encoder)—A network that produces reliable activation maps of features. + +* The decoder—A network that constructs the segmentation mask from the encoded activation maps. + +You also have a choice of backbones for the FCN, PSP, and DeepLabV3 algorithms: `ResNet50 or ResNet101 `__. +These backbones include pretrained artifacts that were originally trained on the `ImageNet `__ classification task. You can fine-tune these backbones +for segmentation using your own data. Or, you can initialize and train these networks from scratch using only your own data. The decoders are never pretrained. + +To deploy the trained model for inference, use the SageMaker hosting service. During inference, you can request the segmentation mask either as a +PNG image or as a set of probabilities for each class for each pixel. You can use these masks as part of a larger pipeline that includes additional downstream image processing or other applications. + + +For a sample Jupyter notebook that uses the SageMaker semantic segmentation algorithm to train a model and deploy it to perform inferences, see the +`Semantic Segmentation Example `__. For instructions +on how to create and access Jupyter notebook instances that you can use to run the example in SageMaker, see `Use Amazon SageMaker Notebook Instances `__. 
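+
+The PNG segmentation mask described above can be requested from a deployed endpoint roughly as follows; this sketch assumes ``ss_estimator`` is a semantic segmentation ``Estimator`` that has already been trained, and the file names are placeholders.
+
+.. code-block:: python
+
+    from sagemaker.serializers import IdentitySerializer
+    from sagemaker.deserializers import BytesDeserializer
+
+    # Deploy the trained estimator behind a real-time endpoint.
+    predictor = ss_estimator.deploy(initial_instance_count=1, instance_type="ml.c5.xlarge")
+
+    # Send a JPEG image and ask for the segmentation mask back as a PNG image.
+    predictor.serializer = IdentitySerializer(content_type="image/jpeg")
+    predictor.deserializer = BytesDeserializer(accept="image/png")
+
+    with open("example.jpg", "rb") as f:      # placeholder input image
+        mask_png = predictor.predict(f.read())
+
+    with open("segmentation_mask.png", "wb") as f:
+        f.write(mask_png)
+
+    predictor.delete_endpoint()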
+ +To see a list of all of the SageMaker samples, create and open a notebook instance, and choose the SageMaker Examples tab. The example semantic segmentation notebooks are located under +Introduction to Amazon algorithms. To open a notebook, choose its Use tab, and choose Create copy. + +For detailed documentation, please refer to the `Sagemaker Semantic Segmentation Algorithm `__ diff --git a/doc/doc_utils/jumpstart_doc_utils.py b/doc/doc_utils/jumpstart_doc_utils.py index 94096fbf1d..92a418a6b4 100644 --- a/doc/doc_utils/jumpstart_doc_utils.py +++ b/doc/doc_utils/jumpstart_doc_utils.py @@ -140,6 +140,7 @@ def create_jumpstart_model_table(): file_content = [] + file_content.append(".. _all-pretrained-models:\n\n") file_content.append(".. |external-link| raw:: html\n\n") file_content.append(' \n\n') diff --git a/doc/index.rst b/doc/index.rst index c0269452f9..2d4ebe32c1 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -39,7 +39,7 @@ The SageMaker Python SDK supports managed training and inference for a variety o ******************************** -SageMaker First-Party Algorithms +SageMaker Built-in Algorithms ******************************** Amazon SageMaker provides implementations of some common machine learning algorithms optimized for GPU architecture and massive datasets. diff --git a/doc/overview.rst b/doc/overview.rst index a6deb7b988..93dd652b06 100644 --- a/doc/overview.rst +++ b/doc/overview.rst @@ -573,6 +573,8 @@ Here is an example: # When you are done using your endpoint model.sagemaker_session.delete_endpoint('my-endpoint') +.. _built-in-algos: + *********************************************************************** Use Built-in Algorithms with Pre-trained Models in SageMaker Python SDK ***********************************************************************