doc/amazon_sagemaker_processing.rst
You can run notebooks on Amazon SageMaker that demonstrate end-to-end examples of using processing jobs to perform data pre-processing, feature engineering and model evaluation steps. See `Learn More`_ at the bottom of this page for more in-depth information.

Data Pre-Processing and Model Evaluation with Scikit-Learn
==========================================================

You can run a scikit-learn script to do data processing on SageMaker using the :class:`sagemaker.sklearn.processing.SKLearnProcessor` class.

You first create a ``SKLearnProcessor``

.. code:: python

    from sagemaker.sklearn.processing import SKLearnProcessor

    sklearn_processor = SKLearnProcessor(
        framework_version="0.20.0",
        role="[Your SageMaker-compatible IAM role]",
        instance_type="ml.m5.xlarge",
        instance_count=1,
    )

Then you can run a scikit-learn script ``preprocessing.py`` in a processing job. In this example, our script takes one input from S3 and one command-line argument, processes the data, then splits the data into two datasets for output. When the job is finished, we can retrieve the output from S3.

.. code:: python

    from sagemaker.processing import ProcessingInput, ProcessingOutput
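
The ``preprocessing.py`` script itself is not shown in this excerpt. As a rough sketch only, a script of the shape described above (one input file, one command-line argument, two output datasets) could look like the following; the file names, the ``--train-test-split-ratio`` flag, and the split logic are all assumptions for illustration, not taken from the original:

```python
# Hypothetical preprocessing.py sketch: file names, the CLI flag,
# and the split logic are assumed for illustration.
import argparse
import csv
import os


def split_rows(rows, train_ratio):
    """Split a list of rows into train/test partitions at train_ratio."""
    cut = int(len(rows) * train_ratio)
    return rows[:cut], rows[cut:]


def main(argv=None, input_dir="/opt/ml/processing/input",
         output_dir="/opt/ml/processing/output"):
    parser = argparse.ArgumentParser()
    # The single command-line argument mentioned in the text (name assumed).
    parser.add_argument("--train-test-split-ratio", type=float, default=0.7)
    args = parser.parse_args(argv)

    # Read the single input dataset (file name assumed).
    with open(os.path.join(input_dir, "dataset.csv"), newline="") as f:
        rows = list(csv.reader(f))

    # "Process" the data: here, just split it into two output datasets.
    train, test = split_rows(rows, args.train_test_split_ratio)

    os.makedirs(output_dir, exist_ok=True)
    for name, part in (("train.csv", train), ("test.csv", test)):
        with open(os.path.join(output_dir, name), "w", newline="") as f:
            csv.writer(f).writerows(part)
    return len(train), len(test)
```

Processing jobs conventionally mount inputs and outputs under ``/opt/ml/processing/``, which is why those paths appear as the defaults here.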

For an in-depth look, please see the `Scikit-learn Data Processing and Model Evaluation`_ example notebook.

.. _Scikit-learn Data Processing and Model Evaluation: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker_processing/scikit_learn_data_processing_and_model_evaluation/scikit_learn_data_processing_and_model_evaluation.ipynb

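
The body of the scikit-learn example above is elided in this excerpt; in the SageMaker Python SDK, the pieces are presumably tied together by the processor's ``run`` method, which uploads the script, stages the S3 input, and collects the outputs. A hedged sketch only, not runnable outside an AWS account; the S3 URI, container paths, and argument value are placeholders, not values from the original example:

```python
# Hypothetical values throughout: bucket, paths, and ratio are
# illustrative placeholders, not taken from the original example.
sklearn_processor.run(
    code="preprocessing.py",
    inputs=[
        ProcessingInput(
            source="s3://your-bucket/path/to/dataset.csv",
            destination="/opt/ml/processing/input",
        )
    ],
    outputs=[
        ProcessingOutput(source="/opt/ml/processing/output/train"),
        ProcessingOutput(source="/opt/ml/processing/output/test"),
    ],
    arguments=["--train-test-split-ratio", "0.8"],
)
```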
Data Pre-Processing with Spark
==============================

This example shows how you can run a processing job inside of a container that can run a Spark script.

.. code:: python

    from sagemaker.processing import ScriptProcessor, ProcessingInput

    spark_processor = ScriptProcessor(
        base_job_name="spark-preprocessor",
        image_uri="<ECR repository URI to your Spark processing image>",
        command=["/opt/program/submit"],