
Commit 5aa0ff8

committed
address PR comments
1 parent 6d16f34 commit 5aa0ff8


2 files changed: +27 -28 lines changed


doc/amazon_sagemaker_processing.rst

+24 -24
@@ -24,10 +24,10 @@ The fastest way to run get started with Amazon SageMaker Processing is by runnin
 You can run notebooks on Amazon SageMaker that demonstrate end-to-end examples of using processing jobs to perform data pre-processing, feature engineering and model evaluation steps. See `Learn More`_ at the bottom of this page for more in-depth information.
 
 
-Data Pre-Processing and Model Evaluation with Scikit-Learn
-==================================================================
+Data Pre-Processing and Model Evaluation with scikit-learn
+==========================================================
 
-You can run a Scikit-Learn script to do data processing on SageMaker using the :class:`sagemaker.sklearn.processing.SKLearnProcessor` class.
+You can run a scikit-learn script to do data processing on SageMaker using the :class:`sagemaker.sklearn.processing.SKLearnProcessor` class.
 
 You first create a ``SKLearnProcessor``
 
@@ -36,35 +36,35 @@ You first create a ``SKLearnProcessor``
     from sagemaker.sklearn.processing import SKLearnProcessor
 
     sklearn_processor = SKLearnProcessor(
-        framework_version='0.20.0',
-        role='[Your SageMaker-compatible IAM role]',
-        instance_type='ml.m5.xlarge',
+        framework_version="0.20.0",
+        role="[Your SageMaker-compatible IAM role]",
+        instance_type="ml.m5.xlarge",
         instance_count=1,
     )
 
-Then you can run a Scikit-Learn script ``preprocessing.py`` in a processing job. In this example, our script takes one input from S3 and one command-line argument, processes the data, then splits the data into two datasets for output. When the job is finished, we can retrive the output from S3.
+Then you can run a scikit-learn script ``preprocessing.py`` in a processing job. In this example, our script takes one input from S3 and one command-line argument, processes the data, then splits the data into two datasets for output. When the job is finished, we can retrive the output from S3.
 
 .. code:: python
 
     from sagemaker.processing import ProcessingInput, ProcessingOutput
 
     sklearn_processor.run(
-        code='preprocessing.py',
+        code="preprocessing.py",
         inputs=[
-            ProcessingInput(source='s3://your-bucket/path/to/your/data, destination='/opt/ml/processing/input'),
+            ProcessingInput(source="s3://your-bucket/path/to/your/data", destination="/opt/ml/processing/input"),
         ],
         outputs=[
-            ProcessingOutput(output_name='train_data', source='/opt/ml/processing/train'),
-            ProcessingOutput(output_name='test_data', source='/opt/ml/processing/test'),
+            ProcessingOutput(output_name="train_data", source="/opt/ml/processing/train"),
+            ProcessingOutput(output_name="test_data", source="/opt/ml/processing/test"),
         ],
-        arguments=['--train-test-split-ratio', '0.2'],
+        arguments=["--train-test-split-ratio", "0.2"],
     )
 
     preprocessing_job_description = sklearn_processor.jobs[-1].describe()
 
-For an in-depth look, please see the `Scikit-Learn Data Processing and Model Evaluation`_ example notebook.
+For an in-depth look, please see the `Scikit-learn Data Processing and Model Evaluation`_ example notebook.
 
-.. _Scikit-Learn Data Processing and Model Evaluation: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker_processing/scikit_learn_data_processing_and_model_evaluation/scikit_learn_data_processing_and_model_evaluation.ipynb
+.. _Scikit-learn Data Processing and Model Evaluation: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker_processing/scikit_learn_data_processing_and_model_evaluation/scikit_learn_data_processing_and_model_evaluation.ipynb
 
 
 Data Pre-Processing with Spark
@@ -79,26 +79,26 @@ This example shows how you can run a processing job inside of a container that c
     from sagemaker.processing import ScriptProcessor, ProcessingInput
 
     spark_processor = ScriptProcessor(
-        base_job_name='spark-preprocessor',
-        image_uri='<ECR repository URI to your Spark processing image>',
-        command=['/opt/program/submit'],
+        base_job_name="spark-preprocessor",
+        image_uri="<ECR repository URI to your Spark processing image>",
+        command=["/opt/program/submit"],
         role=role,
         instance_count=2,
-        instance_type='ml.r5.xlarge',
+        instance_type="ml.r5.xlarge",
         max_runtime_in_seconds=1200,
-        env={'mode': 'python'},
+        env={"mode": "python"},
     )
 
     spark_processor.run(
-        code='preprocess.py',
+        code="preprocess.py",
         arguments=[
-            's3_input_bucket',
+            "s3_input_bucket",
             bucket,
-            's3_input_key_prefix',
+            "s3_input_key_prefix",
             input_prefix,
-            's3_output_bucket',
+            "s3_output_bucket",
             bucket,
-            's3_output_key_prefix',
+            "s3_output_key_prefix",
             input_preprocessed_prefix,
         ],
         logs=False,

src/sagemaker/processing.py

+3 -4
@@ -269,9 +269,7 @@ def _normalize_outputs(self, outputs=None):
 
 
 class ScriptProcessor(Processor):
-    """Handles Amazon SageMaker processing tasks for jobs using a machine learning framework,
-    which allows for providing a script to be run as part of the Processing Job.
-    """
+    """Handles Amazon SageMaker processing tasks for jobs using a machine learning framework."""
 
     def __init__(
         self,
@@ -291,7 +289,8 @@ def __init__(
         network_config=None,
     ):
         """Initializes a ``ScriptProcessor`` instance. The ``ScriptProcessor``
-        handles Amazon SageMaker Processing tasks for jobs using a machine learning framework.
+        handles Amazon SageMaker Processing tasks for jobs using a machine learning framework,
+        which allows for providing a script to be run as part of the Processing Job.
 
         Args:
             role (str): An AWS IAM role name or ARN. Amazon SageMaker Processing
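
The documentation change in doc/amazon_sagemaker_processing.rst refers to a script named ``preprocessing.py`` that takes one input, accepts a ``--train-test-split-ratio`` argument, and writes two datasets, but the diff does not show the script itself. The following is only a minimal sketch of what such a script might look like, not the script from the example notebook. It uses the standard library instead of scikit-learn so that it runs anywhere. The file names ``data.csv``, ``train.csv`` and ``test.csv``, the CSV format, the ``base_dir`` parameter and the shuffle-then-slice split are all assumptions made for illustration; only the ``/opt/ml/processing`` paths and the argument name come from the diff.

```python
import argparse
import csv
import os
import random


def split_rows(rows, test_ratio, seed=0):
    """Shuffle rows deterministically and return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def write_csv(path, rows):
    """Write rows to a CSV file, creating the parent directory if needed."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


def main(argv=None, base_dir="/opt/ml/processing"):
    # The argument name matches the `arguments=[...]` list in the diff.
    parser = argparse.ArgumentParser()
    parser.add_argument("--train-test-split-ratio", type=float, default=0.2)
    args = parser.parse_args(argv)

    # Hypothetical input file name; the diff only gives the directory.
    input_path = os.path.join(base_dir, "input", "data.csv")
    with open(input_path, newline="") as f:
        rows = list(csv.reader(f))

    train, test = split_rows(rows, args.train_test_split_ratio)

    # Output directories match the ProcessingOutput sources in the diff;
    # the file names are assumptions.
    write_csv(os.path.join(base_dir, "train", "train.csv"), train)
    write_csv(os.path.join(base_dir, "test", "test.csv"), test)
```

To use this as a standalone script, one would call ``main()`` at the bottom of the file. Keeping ``base_dir`` as a parameter is a design choice that lets the same logic be exercised locally against a temporary directory before submitting a processing job.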
