Commit b77eb48

Merge branch 'master' into remove-set-tag
2 parents da41bb3 + 1d84830 commit b77eb48

48 files changed, +2285 -147 lines changed

CHANGELOG.md

+38
@@ -1,5 +1,43 @@
 # Changelog

+## v2.52.2.post0 (2021-08-11)
+
+### Documentation Changes
+
+* clarify that default_bucket creates a bucket
+* Minor updates to Clarify API documentation
+
+## v2.52.2 (2021-08-10)
+
+### Bug Fixes and Other Changes
+
+* sklearn integ tests, remove swallowing exception on feature group delete attempt
+* sklearn integ test for custom bucket
+
+### Documentation Changes
+
+* Fix dataset_definition links
+* Document LambdaModel and LambdaPredictor classes
+
+## v2.52.1 (2021-08-06)
+
+### Bug Fixes and Other Changes
+
+* revert #2251 changes for sklearn processor
+
+## v2.52.0 (2021-08-05)
+
+### Features
+
+* processors that support multiple Python files, requirements.txt, and dependencies.
+* support step object in step depends on list
+
+### Bug Fixes and Other Changes
+
+* enable isolation while creating model from job
+* update `sagemaker.serverless` integration test
+* Use correct boto model name for RegisterModelStep properties
+
 ## v2.51.0 (2021-08-03)

 ### Features

VERSION

+1-1
@@ -1 +1 @@
-2.51.1.dev0
+2.52.3.dev0

doc/api/inference/model.rst

+5
@@ -15,3 +15,8 @@ Model
     :members:
     :undoc-members:
     :show-inheritance:
+
+.. autoclass:: sagemaker.serverless.model.LambdaModel
+    :members:
+    :undoc-members:
+    :show-inheritance:

doc/api/inference/predictors.rst

+5
@@ -7,3 +7,8 @@ Make real-time predictions against SageMaker endpoints with Python objects
     :members:
     :undoc-members:
     :show-inheritance:
+
+.. autoclass:: sagemaker.serverless.predictor.LambdaPredictor
+    :members:
+    :undoc-members:
+    :show-inheritance:

doc/overview.rst

+44
@@ -1063,6 +1063,50 @@ You can also find these notebooks in the **Advanced Functionality** section of t
 For information about using sample notebooks in a SageMaker notebook instance, see `Use Example Notebooks <https://docs.aws.amazon.com/sagemaker/latest/dg/howitworks-nbexamples.html>`__
 in the AWS documentation.

+********************
+Serverless Inference
+********************
+
+You can use the SageMaker Python SDK to perform serverless inference on Lambda.
+
+To deploy models to Lambda, you must complete the following prerequisites:
+
+- `Package your model and inference code as a container image. <https://docs.aws.amazon.com/lambda/latest/dg/images-create.html>`_
+- `Create a role that lists Lambda as a trusted entity. <https://docs.aws.amazon.com/lambda/latest/dg/lambda-intro-execution-role.html#permissions-executionrole-console>`_
+
+After completing the prerequisites, you can deploy your model to Lambda using
+the `LambdaModel`_ class.
+
+.. code:: python
+
+    from sagemaker.serverless import LambdaModel
+
+    image_uri = "123456789012.dkr.ecr.us-west-2.amazonaws.com/my-lambda-repository:latest"
+    role = "arn:aws:iam::123456789012:role/MyLambdaExecutionRole"
+
+    model = LambdaModel(image_uri=image_uri, role=role)
+    predictor = model.deploy("my-lambda-function", timeout=20, memory_size=4092)
+
+The ``deploy`` method returns a `LambdaPredictor`_ instance. Use the
+`LambdaPredictor`_ ``predict`` method to perform inference on Lambda.
+
+.. code:: python
+
+    url = "https://example.com/cat.jpeg"
+    predictor.predict({"url": url}) # {'class': 'tabby'}
+
+Once you are done performing inference on Lambda, free the `LambdaModel`_ and
+`LambdaPredictor`_ resources using the ``delete_model`` and ``delete_predictor``
+methods.
+
+.. code:: python
+
+    model.delete_model()
+    predictor.delete_predictor()
+
+.. _LambdaModel : https://sagemaker.readthedocs.io/en/stable/api/inference/model.html#sagemaker.serverless.model.LambdaModel
+.. _LambdaPredictor : https://sagemaker.readthedocs.io/en/stable/api/inference/predictors.html#sagemaker.serverless.predictor.LambdaPredictor
+
 ******************
 SageMaker Workflow
 ******************
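
The Serverless Inference section added to doc/overview.rst sends `{"url": ...}` to `predict` and shows `{'class': 'tabby'}` coming back, but the diff does not show the code inside the container image. The following is a minimal sketch of what such a handler might look like, assuming the standard Python Lambda `handler(event, context)` entry point and assuming the predictor forwards the dictionary as the event. `classify_image` is a hypothetical stand-in for real model inference and is not part of the SDK.

```python
def classify_image(url):
    """Hypothetical stand-in for real model inference on the image at `url`."""
    # A real handler would download the image and run the packaged model.
    return "tabby" if "cat" in url else "unknown"


def handler(event, context):
    """Lambda entry point: maps {"url": ...} to {"class": ...}."""
    url = event.get("url")
    if not url:
        raise ValueError("event must contain a non-empty 'url' key")
    return {"class": classify_image(url)}


if __name__ == "__main__":
    # Local smoke test; Lambda itself supplies the real event and context.
    print(handler({"url": "https://example.com/cat.jpeg"}, None))
```

Keeping the handler a plain function of `event` makes it easy to exercise locally before building the image.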

doc/workflows/pipelines/sagemaker.workflow.pipelines.rst

-1
@@ -5,7 +5,6 @@ ConditionStep
 -------------

 .. autoclass:: sagemaker.workflow.condition_step.ConditionStep
-
 .. deprecated:: sagemaker.workflow.condition_step.JsonGet

 Conditions

src/sagemaker/clarify.py

+31-27
@@ -48,12 +48,17 @@ def __init__(
             headers (list[str]): A list of column names in the input dataset.
             features (str): JSONPath for locating the feature columns for bias metrics if the
                 dataset format is JSONLines.
-            dataset_type (str): Format of the dataset. Valid values are "text/csv" for CSV
-                and "application/jsonlines" for JSONLines.
+            dataset_type (str): Format of the dataset. Valid values are "text/csv" for CSV,
+                "application/jsonlines" for JSONLines, and "application/x-parquet" for Parquet.
             s3_data_distribution_type (str): Valid options are "FullyReplicated" or
                 "ShardedByS3Key".
             s3_compression_type (str): Valid options are "None" or "Gzip".
         """
+        if dataset_type not in ["text/csv", "application/jsonlines", "application/x-parquet"]:
+            raise ValueError(
+                f"Invalid dataset_type '{dataset_type}'."
+                f" Please check the API documentation for the supported dataset types."
+            )
         self.s3_data_input_path = s3_data_input_path
         self.s3_output_path = s3_output_path
         self.s3_data_distribution_type = s3_data_distribution_type
@@ -508,7 +513,7 @@ def run_pre_training_bias(
         kms_key=None,
         experiment_config=None,
     ):
-        """Runs a ProcessingJob to compute the requested bias 'methods' of the input data.
+        """Runs a ProcessingJob to compute the pre-training bias methods of the input data.

         Computes the requested methods that compare 'methods' (e.g. fraction of examples) for the
         sensitive group vs the other examples.
@@ -517,14 +522,14 @@ def run_pre_training_bias(
             data_config (:class:`~sagemaker.clarify.DataConfig`): Config of the input/output data.
             data_bias_config (:class:`~sagemaker.clarify.BiasConfig`): Config of sensitive groups.
             methods (str or list[str]): Selector of a subset of potential metrics:
-                ["`CI <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-ci.html>`_",
-                "`DPL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-dpl.html>`_",
-                "`KL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-kl.html>`_",
-                "`JS <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-js.html>`_",
-                "`LP <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-lp.html>`_",
-                "`TVD <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-tvd.html>`_",
-                "`KS <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-ks.html>`_",
-                "`CDDL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-cdd.html>`_"].
+                ["`CI <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-bias-metric-class-imbalance.html>`_",
+                "`DPL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-true-label-imbalance.html>`_",
+                "`KL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-kl-divergence.html>`_",
+                "`JS <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-jensen-shannon-divergence.html>`_",
+                "`LP <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-lp-norm.html>`_",
+                "`TVD <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-total-variation-distance.html>`_",
+                "`KS <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-kolmogorov-smirnov.html>`_",
+                "`CDDL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-cddl.html>`_"].
                 Defaults to computing all.
             wait (bool): Whether the call should wait until the job completes (default: True).
             logs (bool): Whether to show the logs produced by the job.
@@ -538,7 +543,7 @@ def run_pre_training_bias(
             experiment_config (dict[str, str]): Experiment management configuration.
                 Dictionary contains three optional keys:
                 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.
-        """
+        """  # noqa E501
         analysis_config = data_config.get_config()
         analysis_config.update(data_bias_config.get_config())
         analysis_config["methods"] = {"pre_training_bias": {"methods": methods}}
@@ -562,7 +567,7 @@ def run_post_training_bias(
         kms_key=None,
         experiment_config=None,
     ):
-        """Runs a ProcessingJob to compute the requested bias 'methods' of the model predictions.
+        """Runs a ProcessingJob to compute the post-training bias methods of the model predictions.

         Spins up a model endpoint, runs inference over the input example in the
         's3_data_input_path' to obtain predicted labels. Computes the requested methods that
@@ -633,12 +638,11 @@ def run_bias(
         kms_key=None,
         experiment_config=None,
     ):
-        """Runs a ProcessingJob to compute the requested bias 'methods' of the model predictions.
+        """Runs a ProcessingJob to compute the requested bias methods.

-        Spins up a model endpoint, runs inference over the input example in the
-        's3_data_input_path' to obtain predicted labels. Computes a the requested methods that
-        compare 'methods' (e.g. accuracy, precision, recall) for the sensitive group vs the other
-        examples.
+        It computes the metrics of both the pre-training methods and the post-training methods.
+        To calculate post-training methods, it spins up a model endpoint and runs inference
+        over the input example in the 's3_data_input_path' to obtain predicted labels.

         Args:
             data_config (:class:`~sagemaker.clarify.DataConfig`): Config of the input/output data.
@@ -648,14 +652,14 @@ def run_bias(
             model_predicted_label_config (:class:`~sagemaker.clarify.ModelPredictedLabelConfig`):
                 Config of how to extract the predicted label from the model output.
             pre_training_methods (str or list[str]): Selector of a subset of potential metrics:
-                ["`CI <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-ci.html>`_",
-                "`DPL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-dpl.html>`_",
-                "`KL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-kl.html>`_",
-                "`JS <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-js.html>`_",
-                "`LP <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-lp.html>`_",
-                "`TVD <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-tvd.html>`_",
-                "`KS <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-ks.html>`_",
-                "`CDDL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-cdd.html>`_"].
+                ["`CI <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-bias-metric-class-imbalance.html>`_",
+                "`DPL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-true-label-imbalance.html>`_",
+                "`KL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-kl-divergence.html>`_",
+                "`JS <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-jensen-shannon-divergence.html>`_",
+                "`LP <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-lp-norm.html>`_",
+                "`TVD <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-total-variation-distance.html>`_",
+                "`KS <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-kolmogorov-smirnov.html>`_",
+                "`CDDL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-data-bias-metric-cddl.html>`_"].
                 Defaults to computing all.
             post_training_methods (str or list[str]): Selector of a subset of potential metrics:
                 ["`DPPL <https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-post-training-bias-metric-dppl.html>`_"
@@ -682,7 +686,7 @@ def run_bias(
             experiment_config (dict[str, str]): Experiment management configuration.
                 Dictionary contains three optional keys:
                 'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.
-        """
+        """  # noqa E501
         analysis_config = data_config.get_config()
         analysis_config.update(bias_config.get_config())
         analysis_config["predictor"] = model_config.get_predictor_config()
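
The first clarify.py hunk adds an up-front `dataset_type` check so that an unsupported format fails at construction time rather than inside the processing job. The same check can be sketched in isolation as below. `SUPPORTED_DATASET_TYPES` and `validate_dataset_type` are illustrative names that do not appear in the commit, which performs the check inline in `__init__`.

```python
# Illustrative constant; the commit uses an inline list with the same values.
SUPPORTED_DATASET_TYPES = ("text/csv", "application/jsonlines", "application/x-parquet")


def validate_dataset_type(dataset_type):
    """Return dataset_type unchanged, or raise ValueError if it is unsupported."""
    if dataset_type not in SUPPORTED_DATASET_TYPES:
        raise ValueError(
            f"Invalid dataset_type '{dataset_type}'."
            " Please check the API documentation for the supported dataset types."
        )
    return dataset_type


if __name__ == "__main__":
    print(validate_dataset_type("application/x-parquet"))
    try:
        validate_dataset_type("application/json")
    except ValueError as err:
        print(err)
```

Note that a default of `None` for `dataset_type` would also be rejected by a check of this shape, so callers relying on an omitted value would need a supported default.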

src/sagemaker/dataset_definition/inputs.py

+2-2
@@ -99,9 +99,9 @@ class DatasetDefinition(ApiObject):
             Definition inputs to run a processing job. LocalPath is an absolute path to the input
             data. This is a required parameter when `AppManaged` is False (default).
         redshift_dataset_definition
-            (:class:`~sagemaker.dataset_definition.RedshiftDatasetDefinition`): Redshift
+            (:class:`~sagemaker.dataset_definition.inputs.RedshiftDatasetDefinition`): Redshift
             dataset definition.
-        athena_dataset_definition (:class:`~sagemaker.dataset_definition.AthenaDatasetDefinition`):
+        athena_dataset_definition (:class:`~sagemaker.dataset_definition.inputs.AthenaDatasetDefinition`):
             Configuration for Athena Dataset Definition input.
     """

src/sagemaker/huggingface/__init__.py

+1
@@ -15,3 +15,4 @@

 from sagemaker.huggingface.estimator import HuggingFace  # noqa: F401
 from sagemaker.huggingface.model import HuggingFaceModel, HuggingFacePredictor  # noqa: F401
+from sagemaker.huggingface.processing import HuggingFaceProcessor  # noqa:F401
+132
@@ -0,0 +1,132 @@
+# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"). You
+# may not use this file except in compliance with the License. A copy of
+# the License is located at
+#
+#     http://aws.amazon.com/apache2.0/
+#
+# or in the "license" file accompanying this file. This file is
+# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
+# ANY KIND, either express or implied. See the License for the specific
+# language governing permissions and limitations under the License.
+"""This module contains code related to HuggingFace Processors which are used for Processing jobs.
+
+These jobs let customers perform data pre-processing, post-processing, feature engineering,
+data validation, and model evaluation and interpretation on SageMaker.
+"""
+from __future__ import absolute_import
+
+from sagemaker.processing import FrameworkProcessor
+from sagemaker.huggingface.estimator import HuggingFace
+
+
+class HuggingFaceProcessor(FrameworkProcessor):
+    """Handles Amazon SageMaker processing tasks for jobs using HuggingFace containers."""
+
+    estimator_cls = HuggingFace
+
def __init__(
30+
self,
31+
role,
32+
instance_count,
33+
instance_type,
34+
transformers_version=None,
35+
tensorflow_version=None,
36+
pytorch_version=None,
37+
py_version="py36",
38+
image_uri=None,
39+
command=None,
40+
volume_size_in_gb=30,
41+
volume_kms_key=None,
42+
output_kms_key=None,
43+
code_location=None,
44+
max_runtime_in_seconds=None,
45+
base_job_name=None,
46+
sagemaker_session=None,
47+
env=None,
48+
tags=None,
49+
network_config=None,
50+
):
51+
"""This processor executes a Python script in a HuggingFace execution environment.
52+
53+
Unless ``image_uri`` is specified, the environment is an Amazon-built Docker container
54+
that executes functions defined in the supplied ``code`` Python script.
55+
56+
The arguments have the same meaning as in ``FrameworkProcessor``, with the following
57+
exceptions.
58+
59+
Args:
60+
transformers_version (str): Transformers version you want to use for
61+
executing your model training code. Defaults to ``None``. Required unless
62+
``image_uri`` is provided. The current supported version is ``4.4.2``.
63+
tensorflow_version (str): TensorFlow version you want to use for
64+
executing your model training code. Defaults to ``None``. Required unless
65+
``pytorch_version`` is provided. The current supported version is ``1.6.0``.
66+
pytorch_version (str): PyTorch version you want to use for
67+
executing your model training code. Defaults to ``None``. Required unless
68+
``tensorflow_version`` is provided. The current supported version is ``2.4.1``.
69+
py_version (str): Python version you want to use for executing your model training
70+
code. Defaults to ``None``. Required unless ``image_uri`` is provided. If
71+
using PyTorch, the current supported version is ``py36``. If using TensorFlow,
72+
the current supported version is ``py37``.
73+
74+
.. tip::
75+
76+
You can find additional parameters for initializing this class at
77+
:class:`~sagemaker.processing.FrameworkProcessor`.
78+
"""
79+
self.pytorch_version = pytorch_version
80+
self.tensorflow_version = tensorflow_version
81+
super().__init__(
82+
self.estimator_cls,
83+
transformers_version,
84+
role,
85+
instance_count,
86+
instance_type,
87+
py_version,
88+
image_uri,
89+
command,
90+
volume_size_in_gb,
91+
volume_kms_key,
92+
output_kms_key,
93+
code_location,
94+
max_runtime_in_seconds,
95+
base_job_name,
96+
sagemaker_session,
97+
env,
98+
tags,
99+
network_config,
100+
)
101+
102+
def _create_estimator(
103+
self,
104+
entry_point="",
105+
source_dir=None,
106+
dependencies=None,
107+
git_config=None,
108+
):
109+
"""Override default estimator factory function for HuggingFace's different parameters
110+
111+
HuggingFace estimators have 3 framework version parameters instead of one: The version for
112+
Transformers, PyTorch, and TensorFlow.
113+
"""
114+
return self.estimator_cls(
115+
transformers_version=self.framework_version,
116+
tensorflow_version=self.tensorflow_version,
117+
pytorch_version=self.pytorch_version,
118+
py_version=self.py_version,
119+
entry_point=entry_point,
120+
source_dir=source_dir,
121+
dependencies=dependencies,
122+
git_config=git_config,
123+
code_location=self.code_location,
124+
enable_network_isolation=False,
125+
image_uri=self.image_uri,
126+
role=self.role,
127+
instance_count=self.instance_count,
128+
instance_type=self.instance_type,
129+
sagemaker_session=self.sagemaker_session,
130+
debugger_hook_config=False,
131+
disable_profiler=True,
132+
)
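
`HuggingFaceProcessor` passes the Transformers version to its parent as the single framework version, keeps the PyTorch and TensorFlow versions on the instance, and overrides `_create_estimator` to hand all three to the estimator. The pattern can be sketched without the SDK as follows. `FakeEstimator`, `BaseProcessor`, and `ThreeVersionProcessor` are hypothetical toy classes written for this illustration, not SageMaker APIs.

```python
class FakeEstimator:
    """Hypothetical estimator that only records its keyword arguments."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class BaseProcessor:
    """Toy base class that knows about a single framework version."""

    estimator_cls = FakeEstimator

    def __init__(self, framework_version, py_version):
        self.framework_version = framework_version
        self.py_version = py_version

    def _create_estimator(self, entry_point=""):
        # Default factory: one version parameter.
        return self.estimator_cls(
            framework_version=self.framework_version,
            py_version=self.py_version,
            entry_point=entry_point,
        )


class ThreeVersionProcessor(BaseProcessor):
    """Toy subclass that stores two extra versions and overrides the factory."""

    def __init__(self, transformers_version, pytorch_version=None,
                 tensorflow_version=None, py_version="py36"):
        self.pytorch_version = pytorch_version
        self.tensorflow_version = tensorflow_version
        # The parent only tracks one version, so it holds the Transformers one.
        super().__init__(transformers_version, py_version)

    def _create_estimator(self, entry_point=""):
        # Overridden factory: rename the parent's version and add the other two.
        return self.estimator_cls(
            transformers_version=self.framework_version,
            pytorch_version=self.pytorch_version,
            tensorflow_version=self.tensorflow_version,
            py_version=self.py_version,
            entry_point=entry_point,
        )


if __name__ == "__main__":
    processor = ThreeVersionProcessor("4.4.2", pytorch_version="1.6.0")
    print(processor._create_estimator("preprocess.py").kwargs)
```

Overriding only the factory keeps the rest of the parent's logic reusable, at the cost of the subclass depending on a private method name.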
