Commit 1618855

Merge pull request aws#63 from awslabs/arpin_bash_organize
Updated: READMEs and organization Merging in preparation for Monday's meeting
2 parents ec2c3a4 + e021de7 commit 1618855

12 files changed

+87
-38
lines changed

README.md

+33-8
Original file line number | Diff line number | Diff line change
@@ -17,7 +17,7 @@ These examples provide a gentle introduction to machine learning concepts as the
1717

1818
These examples provide quick walkthroughs to get you up and running with Amazon SageMaker's custom developed algorithms. Most of these algorithms can train on distributed hardware, scale incredibly well, and are faster and cheaper than popular alternatives.
1919

20-
- [k-means](introduction_to_amazon_algorithms/1P_kmeans_highlevel) is our introductory example for Amazon SageMaker. It walks through the process of clustering MNIST images of handwritten digits using Amazon SageMaker k-means.
20+
- [k-means](sagemaker-python-sdk/1P_kmeans_highlevel) is our introductory example for Amazon SageMaker. It walks through the process of clustering MNIST images of handwritten digits using Amazon SageMaker k-means.
2121
- [Factorization Machines](introduction_to_amazon_algorithms/factorization_machines_mnist) showcases Amazon SageMaker's implementation of the algorithm to predict whether a handwritten digit from the MNIST dataset is a 0 or not using a binary classifier.
2222
- [Latent Dirichlet Allocation (LDA)](introduction_to_amazon_algorithms/lda_topic_modeling) introduces topic modeling using Amazon SageMaker Latent Dirichlet Allocation (LDA) on a synthetic dataset.
2323
- [Linear Learner](introduction_to_amazon_algorithms/linear_learner_mnist) predicts whether a handwritten digit from the MNIST dataset is a 0 or not using a binary classifier from Amazon SageMaker Linear Learner.
@@ -35,19 +35,44 @@ These examples provide more thorough mathematical treatment on a select group of
3535

3636
### Advanced Amazon SageMaker Functionality
3737

38+
These examples showcase unique functionality available in Amazon SageMaker. They cover a broad range of topics and will utilize a variety of methods, but aim to provide the user with sufficient insight or inspiration to develop within Amazon SageMaker.
39+
40+
- [Data Distribution Types](advanced_functionality/data_distribution_types) showcases the difference between two methods for sending data from S3 to Amazon SageMaker Training instances. This has particular implications for the scalability and accuracy of distributed training.
41+
- [Encrypting Your Data](advanced_functionality/handling_kms_encrypted_data) shows how to use server-side KMS-encrypted data with Amazon SageMaker training. The IAM role used for S3 access needs to have permissions to encrypt and decrypt data with the KMS key.
42+
- [Using Parquet Data](advanced_functionality/parquet_to_recordio_protobuf) shows how to bring [Parquet](https://parquet.apache.org/) data sitting in S3 into an Amazon SageMaker Notebook and convert it into the recordIO-protobuf format that many SageMaker algorithms consume.
43+
- [Connecting to Redshift](advanced_functionality/working_with_redshift_data) demonstrates how to copy data from Redshift to S3 and vice-versa without leaving Amazon SageMaker Notebooks.
44+
- [Bring Your Own XGBoost Model](advanced_functionality/xgboost_bring_your_own_model) shows how to use Amazon SageMaker Algorithms containers to bring a pre-trained model to a realtime hosted endpoint without ever needing to think about REST APIs.
45+
- [Bring Your Own k-means Model](advanced_functionality/kmeans_bring_your_own_model) shows how to take a model that's been fit elsewhere and use Amazon SageMaker Algorithms containers to host it.
3846
- [Installing the R Kernel](advanced_functionality/install_r_kernel) shows how to install the R kernel into an Amazon SageMaker Notebook Instance.
39-
- [Bring Your Own Model for k-means](advanced_functionality/kmeans_bring_your_own_model) shows how to take a model that's been fit elsewhere and use Amazon SageMaker Algorithms containers to host it.
40-
- [Bring Your Own Algorithm with R](advanced_functionality/r_bring_your_own) shows how to bring your own algorithm container to Amazon SageMaker using the R language.
41-
- [Bring Your Own Tensorflow Model](sagemaker-python-sdk/tensorflow_iris_byom) shows how to bring a model trained anywhere into Amazon SageMaker
42-
- [Bring Your Own MXNet Model](sagemaker-python-sdk/tensorflow_iris_byom) shows how to bring a model trained anywhere using MXNet into Amazon SageMaker
43-
- [Bring Your Own TensorFlow Model](sagemaker-python-sdk/tensorflow_iris_byom) shows how to bring a model trained anywhere using TensorFlow into Amazon SageMaker
47+
- [Bring Your Own R Algorithm](advanced_functionality/r_bring_your_own) shows how to bring your own algorithm container to Amazon SageMaker using the R language.
48+
- [Bring Your Own scikit Algorithm](advanced_functionality/scikit_bring_your_own) provides a detailed walkthrough on how to package a scikit-learn algorithm for training and production-ready hosting.
49+
50+
### Amazon SageMaker TensorFlow and MXNet Pre-Built Containers and the Python SDK
51+
52+
These examples focus on the Amazon SageMaker Python SDK which allows you to write idiomatic TensorFlow or MXNet and then train or host in pre-built containers.
53+
54+
- [CIFAR-10 with MXNet Gluon](sagemaker-python-sdk/mxnet_gluon_cifar10)
55+
- [MNIST with MXNet Gluon](sagemaker-python-sdk/mxnet_gluon_mnist)
56+
- [MNIST with MXNet](sagemaker-python-sdk/mxnet_mnist)
57+
- [TensorFlow Neural Networks with Layers](sagemaker-python-sdk/tensorflow_abalone_age_predictor_using_layers)
58+
- [TensorFlow Networks with Keras](sagemaker-python-sdk/tensorflow_abalone_age_predictor_keras)
59+
- [Introduction to Estimators in TensorFlow](sagemaker-python-sdk/tensorflow_iris_dnn_classifier_using_estimators)
60+
- [TensorFlow and TensorBoard](sagemaker-python-sdk/tensorflow_resnet_cifar10_with_tensorboard)
61+
62+
### Under Development
63+
64+
These Amazon SageMaker examples fully illustrate a concept, but may require some additional configuration on the user's part to complete.
65+
66+
- [Bring Your Own MXNet Model](under_development/tensorflow_iris_byom) shows how to bring a model trained anywhere using MXNet into Amazon SageMaker
67+
- [Bring Your Own TensorFlow Model](under_development/tensorflow_iris_byom) shows how to bring a model trained anywhere using TensorFlow into Amazon SageMaker
68+
- [Ensembling Multiple Models](under_development/modeling) creates two different models for prediction, hosts them independently and shows how their outputs can be combined for better accuracy than either one alone.
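
As a rough illustration of the ensembling idea in the last entry, the sketch below combines the predicted probabilities of two independently hosted models with a weighted average. The source does not say how the notebook combines the outputs, so the averaging scheme, the function name `ensemble_average`, and the sample scores are all assumptions made for illustration.

```python
def ensemble_average(preds_a, preds_b, weight_a=0.5):
    """Blend two lists of predicted probabilities with a weighted average.

    Hypothetical helper: the notebook may combine model outputs differently.
    """
    if len(preds_a) != len(preds_b):
        raise ValueError("both models must score the same number of records")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must be between 0 and 1")
    weight_b = 1.0 - weight_a
    return [weight_a * a + weight_b * b for a, b in zip(preds_a, preds_b)]


if __name__ == "__main__":
    # Made-up scores standing in for the responses of two hosted endpoints.
    model_a_scores = [0.25, 1.0, 0.5]
    model_b_scores = [0.75, 1.0, 0.0]
    print(ensemble_average(model_a_scores, model_b_scores))
```

Whether a blend like this beats either model alone depends on how correlated their errors are, which is something the notebook would need to check against held-out data.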
4469

4570
## FAQ
4671

47-
*Will these examples work outside of Amazon SageMaker?*
72+
*Will these examples work outside of Amazon SageMaker Notebook Instances?*
4873

4974
- Although most examples utilize key Amazon SageMaker functionality like distributed, managed training or real-time hosted endpoints, these notebooks can be run outside of Amazon SageMaker Notebook Instances with minimal modification (updating IAM role definition and installing the necessary libraries).
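
The "updating IAM role definition" step could look roughly like the sketch below. Inside a Notebook Instance the notebooks typically obtain the role with `sagemaker.get_execution_role()`; elsewhere the role ARN has to be supplied explicitly. This is an illustration under stated assumptions, not part of this repository: the helper `resolve_role_arn` and the environment variable `SAGEMAKER_ROLE_ARN` are hypothetical names, and the pattern only checks the general shape of an IAM role ARN.

```python
import os
import re

# General shape of an IAM role ARN:
#   arn:<partition>:iam::<12-digit account id>:role/<optional path/><role name>
_ROLE_ARN_PATTERN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+$")


def resolve_role_arn(env=None, var="SAGEMAKER_ROLE_ARN"):
    """Read a role ARN from an environment mapping and sanity-check its format.

    `var` is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    arn = env.get(var, "").strip()
    if not _ROLE_ARN_PATTERN.match(arn):
        raise ValueError(f"{var} is not set to a well-formed IAM role ARN")
    return arn


if __name__ == "__main__":
    # Placeholder account id and role name, for demonstration only.
    example_env = {"SAGEMAKER_ROLE_ARN": "arn:aws:iam::123456789012:role/ExampleSageMakerRole"}
    role = resolve_role_arn(example_env)
    print(role)
```

The returned string would then replace the `get_execution_role()` call wherever a notebook passes a role to a training or hosting request.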
5075

5176
*How do I contribute my own example notebook?*
5277

53-
- Although we're extremely excited to receive contributions from the community, we're still working on the best mechanism to take in examples from and external source. Please bear with us in the short-term if pull requests take longer than expected or are closed.
78+
- Although we're extremely excited to receive contributions from the community, we're still working on the best mechanism to take in examples from an external source. Please bear with us in the short-term if pull requests take longer than expected or are closed.

advanced_functionality/README.md

+13-11
@@ -1,13 +1,15 @@
1-
# Advanced Functionality
1+
# Amazon SageMaker Examples
22

3-
This directory includes examples which showcase unique functionality available in Amazon SageMaker. Examples cover a broad range of topics and will utilize a variety of methods, but aim to provide the user with sufficient insight or inspiration to develop within Amazon SageMaker.
3+
### Advanced Amazon SageMaker Functionality
44

5-
Example Notebooks include:
6-
- *data_distribution_types*: Showcases the difference between two methods for sending data from S3 to Amazon SageMaker Training instances. This has particular implication for scalability and accuracy of distributed training.
7-
- *install_r_kernel*: A quick introduction to getting R installed and running within Amazon SageMaker Notebook Instances.
8-
- *kmeans_bring_your_own_model*: How to use Amazon SageMaker Algorithms containers to bring a pre-trained model to a realtime hosted endpoint without ever needing to think about REST APIs.
9-
- *r_bring_your_own*: How to containerize an R algorithm using Docker and plumber for hosting so that it can be used in Amazon SageMaker's managed training and realtime hosting.
10-
- *xgboost_bring_your_own_model*: How to use Amazon SageMaker Algorithms containers to bring a pre-trained model to a realtime hosted endpoint without ever needing to think about REST APIs.
11-
- *handling_kms_encrypted_data.ipynb*: How to use Server Side KMS encrypted data with Amazon SageMaker training works. The IAM role used for S3 access needs to have permissions to encrypt and decrypt data with the KMS key.
12-
- *parquet_to_recordio_protobuf.ipynb*: How to convert Parquet data format into the recordIO-protobuf format that many SageMaker algorithms consume.
13-
- *working_with_redshift_data.ipynb*: Demonstrates how to copy data from Redshift to S3 and vice-versa.
5+
These examples showcase unique functionality available in Amazon SageMaker. They cover a broad range of topics and will utilize a variety of methods, but aim to provide the user with sufficient insight or inspiration to develop within Amazon SageMaker.
6+
7+
- [Data Distribution Types](data_distribution_types) showcases the difference between two methods for sending data from S3 to Amazon SageMaker Training instances. This has particular implications for the scalability and accuracy of distributed training.
8+
- [Encrypting Your Data](handling_kms_encrypted_data) shows how to use server-side KMS-encrypted data with Amazon SageMaker training. The IAM role used for S3 access needs to have permissions to encrypt and decrypt data with the KMS key.
9+
- [Using Parquet Data](parquet_to_recordio_protobuf) shows how to bring [Parquet](https://parquet.apache.org/) data sitting in S3 into an Amazon SageMaker Notebook and convert it into the recordIO-protobuf format that many SageMaker algorithms consume.
10+
- [Connecting to Redshift](working_with_redshift_data) demonstrates how to copy data from Redshift to S3 and vice-versa without leaving Amazon SageMaker Notebooks.
11+
- [Bring Your Own XGBoost Model](xgboost_bring_your_own_model) shows how to use Amazon SageMaker Algorithms containers to bring a pre-trained model to a realtime hosted endpoint without ever needing to think about REST APIs.
12+
- [Bring Your Own k-means Model](kmeans_bring_your_own_model) shows how to take a model that's been fit elsewhere and use Amazon SageMaker Algorithms containers to host it.
13+
- [Installing the R Kernel](install_r_kernel) shows how to install the R kernel into an Amazon SageMaker Notebook Instance.
14+
- [Bring Your Own R Algorithm](r_bring_your_own) shows how to bring your own algorithm container to Amazon SageMaker using the R language.
15+
- [Bring Your Own scikit Algorithm](scikit_bring_your_own) provides a detailed walkthrough on how to package a scikit-learn algorithm for training and production-ready hosting.
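
To make the Data Distribution Types entry above more concrete, here is a minimal sketch of how a set of S3 objects might be handed to training instances under the two settings. `FullyReplicated` and `ShardedByS3Key` are the S3 data distribution values as we understand them; the function `assign_objects`, the object keys, and the round-robin split are assumptions for illustration and do not describe how the service actually partitions keys.

```python
def assign_objects(keys, instance_count, distribution="FullyReplicated"):
    """Return one list of S3 object keys per training instance.

    Hypothetical helper; the round-robin sharding is an assumed scheme.
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    keys = list(keys)
    if distribution == "FullyReplicated":
        # Every instance receives a full copy of the dataset.
        return [list(keys) for _ in range(instance_count)]
    if distribution == "ShardedByS3Key":
        # Each instance receives a disjoint subset of the objects.
        return [keys[i::instance_count] for i in range(instance_count)]
    raise ValueError(f"unknown distribution type: {distribution}")


if __name__ == "__main__":
    # Made-up object keys for demonstration.
    object_keys = [f"train/part-{n}.pbr" for n in range(4)]
    for mode in ("FullyReplicated", "ShardedByS3Key"):
        print(mode, assign_objects(object_keys, 2, mode))
```

With sharding, each instance sees less data per epoch, which is where the scalability gain comes from; it also means each instance trains on a different subset, which is one reason accuracy can differ between the two settings.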
+13-13
@@ -1,15 +1,15 @@
1-
# Introduction to Amazon Algorithms
1+
# Amazon SageMaker Examples
22

3-
This directory includes introductory examples to Amazon SageMaker Algorithms that we have developed so far. It seeks to provide guidance and examples on basic functionality rather than a detailed scientific review or an implementation on complex, real-world data.
3+
### Introduction to Amazon Algorithms
44

5-
Example Notebooks include:
6-
- *1P_kmeans_highlevel*: Our introduction to Amazon SageMaker which walks through the process of clustering MNIST images of handwritten digits.
7-
- *factorization_machines_mnist*: Predicts whether a handwritten digit from the MNIST dataset is a 0 or not using a binary classifier from Amazon SageMaker Factorization Machines.
8-
- *lda_topic_modeling*: Topic modeling using Amazon SageMaker Latent Dirichlet Allocation (LDA) on a synthetic dataset.
9-
- *linear_mnist*: Predicts whether a handwritten digit from the MNIST dataset is a 0 or not using a binary classifier from Amazon SageMaker Linear Learner.
10-
- *ntm_synthetic*: Uses Amazon SageMaker Neural Topic Model (NTM) to uncover topics in documents from a synthetic data source, where topic distributions are known.
11-
- *pca_mnist*: Uses Amazon SageMaker Principal Components Analysis (PCA) to calculate eigendigits from MNIST.
12-
- *seq2seq*: Seq2Seq algorithm is built on top of [Sockeye](https://github.com/awslabs/sockeye), a sequence-to-sequence framework for Neural Machine Translation based on MXNet. SageMaker Seq2Seq implements state-of-the-art encoder-decoder architectures which can also be used for tasks like Abstractive Summarization in addition to Machine Translation.
13-
- *xgboost_abalone*: Predicts the age of abalone ([Abalone dataset](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html)) using regression from Amazon SageMaker XGBoost.
14-
- *xgboost_mnist*: Uses Amazon SageMaker XGBoost to classifiy handwritten digits from the MNIST dataset into one of the ten digits using a multi-class classifier. Both single machine and distributed use-cases are presented.
15-
- *image_classification*: Uses Amazon SageMaker Image classification algorithm to train a Resnet on [caltech dataset](http://www.vision.caltech.edu/Image_Datasets/Caltech256/), either from scratch or using a pre-trained model.
5+
These examples provide quick walkthroughs to get you up and running with Amazon SageMaker's custom developed algorithms. Most of these algorithms can train on distributed hardware, scale incredibly well, and are faster and cheaper than popular alternatives.
6+
7+
- [k-means](../sagemaker-python-sdk/1P_kmeans_highlevel) is our introductory example for Amazon SageMaker. It walks through the process of clustering MNIST images of handwritten digits using Amazon SageMaker k-means.
8+
- [Factorization Machines](factorization_machines_mnist) showcases Amazon SageMaker's implementation of the algorithm to predict whether a handwritten digit from the MNIST dataset is a 0 or not using a binary classifier.
9+
- [Latent Dirichlet Allocation (LDA)](lda_topic_modeling) introduces topic modeling using Amazon SageMaker Latent Dirichlet Allocation (LDA) on a synthetic dataset.
10+
- [Linear Learner](linear_learner_mnist) predicts whether a handwritten digit from the MNIST dataset is a 0 or not using a binary classifier from Amazon SageMaker Linear Learner.
11+
- [Neural Topic Model (NTM)](ntm_synthetic) uses Amazon SageMaker Neural Topic Model (NTM) to uncover topics in documents from a synthetic data source, where topic distributions are known.
12+
- [Principal Components Analysis (PCA)](pca_mnist) uses Amazon SageMaker PCA to calculate eigendigits from MNIST.
13+
- [Seq2Seq](seq2seq) uses the Amazon SageMaker Seq2Seq algorithm that's built on top of [Sockeye](https://github.com/awslabs/sockeye), which is a sequence-to-sequence framework for Neural Machine Translation based on MXNet. Seq2Seq implements state-of-the-art encoder-decoder architectures which can also be used for tasks like Abstractive Summarization in addition to Machine Translation. This notebook shows translation from English to German text.
14+
- [XGBoost for regression](xgboost_abalone) predicts the age of abalone ([Abalone dataset](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html)) using regression from Amazon SageMaker's implementation of [XGBoost](https://github.com/dmlc/xgboost).
15+
- [XGBoost for multi-class classification](xgboost_mnist) uses Amazon SageMaker's implementation of [XGBoost](https://github.com/dmlc/xgboost) to classify handwritten digits from the MNIST dataset as one of the ten digits using a multi-class classifier. Both single machine and distributed use-cases are presented.
@@ -1,8 +1,10 @@
1-
# Introduction to Applying Machine Learning
1+
# Amazon SageMaker Examples
22

3-
This directory includes introductory examples for applying machine learning to real world problems using Amazon SageMaker. The notebooks tend to provide descriptive overviews rather than dense mathematical content to provide an easy on-ramp to using machine learning in common problems faced across a variety of industries.
3+
### Introduction to Applying Machine Learning
44

5-
Example Notebooks include:
6-
- *linear_time_series_forecast*: Generate a time-series forecast for product demand using Amazon SageMaker's Linear Learner algorithm.
7-
- *xgboost_customer_churn*: Predict customer churn by building a highly predictive gradient boosted trees model using Amazon SageMaker's implementation of the popular open source XGBoost package.
8-
- *xgboost_direct_marketing*: Predict successful conversion of prospective customers using gradient boosted trees from Amazon SageMaker's implementation of the popular open source XGBoost package.
5+
These examples provide a gentle introduction to machine learning concepts as they are applied in practical use cases across a variety of sectors.
6+
7+
- [Targeted Direct Marketing](xgboost_direct_marketing) predicts potential customers that are most likely to convert based on customer and aggregate level metrics, using Amazon SageMaker's implementation of [XGBoost](https://github.com/dmlc/xgboost).
8+
- [Predicting Customer Churn](xgboost_customer_churn) uses customer interaction and service usage data to find those most likely to churn, and then walks through the cost/benefit trade-offs of providing retention incentives. This uses Amazon SageMaker's implementation of [XGBoost](https://github.com/dmlc/xgboost) to create a highly predictive model.
9+
- [Time-series Forecasting](linear_time_series_forecast) generates a forecast for topline product demand using Amazon SageMaker's Linear Learner algorithm.
10+
- [Cancer Prediction](breast_cancer_prediction) predicts breast cancer based on features derived from images, using SageMaker's Linear Learner.

sagemaker-python-sdk/README.md

+13
@@ -0,0 +1,13 @@
1+
# Amazon SageMaker Examples
2+
3+
### Amazon SageMaker TensorFlow and MXNet Pre-Built Containers and the Python SDK
4+
5+
These examples focus on the Amazon SageMaker Python SDK which allows you to write idiomatic TensorFlow or MXNet and then train or host in pre-built containers.
6+
7+
- [CIFAR-10 with MXNet Gluon](mxnet_gluon_cifar10)
8+
- [MNIST with MXNet Gluon](mxnet_gluon_mnist)
9+
- [MNIST with MXNet](mxnet_mnist)
10+
- [TensorFlow Neural Networks with Layers](tensorflow_abalone_age_predictor_using_layers)
11+
- [TensorFlow Networks with Keras](tensorflow_abalone_age_predictor_keras)
12+
- [Introduction to Estimators in TensorFlow](tensorflow_iris_dnn_classifier_using_estimators)
13+
- [TensorFlow and TensorBoard](tensorflow_resnet_cifar10_with_tensorboard)
@@ -0,0 +1,7 @@
1+
# Amazon SageMaker Examples
2+
3+
### Scientific Details of Algorithms
4+
5+
These examples provide more thorough mathematical treatment on a select group of algorithms.
6+
7+
- [Latent Dirichlet Allocation (LDA)](lda_topic_modeling) dives into Amazon SageMaker's spectral decomposition approach to LDA.
