support TF2.10.1 training DLC #3506

Closed
wants to merge 50 commits into from
Changes from 47 commits
c97c467
fix: type hint of PySparkProcessor __init__ (#3297)
NivekNey Dec 2, 2022
de58941
fix: fix PySparkProcessor __init__ params type (#3354)
andre-marcos-perez Dec 2, 2022
41dd330
fix: Allow Py 3.7 for MMS Test Docker env (#3080)
shreyapandit Dec 2, 2022
1e23a3f
refactoring : using with statement (#3286)
maldil Dec 2, 2022
19efadf
Update local_requirements.txt PyYAML version (#3095)
shreyapandit Dec 2, 2022
76f7782
feature: Update TF 2.9 and TF 2.10 inference DLCs (#3465)
arjkesh Dec 2, 2022
fde0738
feature: Added transform with monitoring pipeline step in transformer…
keshav-chandak Dec 2, 2022
7f9f3b0
fix: Fix bug forcing uploaded tar to be named sourcedir (#3412)
claytonparnell Dec 2, 2022
5d59767
feature: Add Code Owners file (#3503)
navinsoni Dec 2, 2022
0f5cf18
prepare release v2.119.0
Dec 3, 2022
f1f0013
update development version to v2.119.1.dev0
Dec 3, 2022
bb4b689
feature: Add DXB region to frameworks by DLC (#3387)
RadhikaB-97 Dec 5, 2022
b68bcd9
fix: support idempotency for framework and spark processors (#3460)
brockwade633 Dec 5, 2022
2b3d2b0
support TF2.10.1 training DLC
tejaschumbalkar Dec 5, 2022
32969da
feature: Update registries with new region account number mappings. (…
kenny-ezirim Dec 6, 2022
767da0a
feature: Adding support for SageMaker Training Compiler in PyTorch es…
Lokiiiiii Dec 7, 2022
d779d1b
feature: Add Neo image uri config for Pytorch 1.12 (#3507)
HappyAmazonian Dec 7, 2022
83327fb
prepare release v2.120.0
Dec 7, 2022
5bffb04
update development version to v2.120.1.dev0
Dec 7, 2022
b828396
feature: Algorithms Region Expansion OSU/DXB (#3508)
malav-shastri Dec 7, 2022
357f732
fix: Add constraints file for apache-airflow (#3510)
navinsoni Dec 7, 2022
a28d1dd
fix: FrameworkProcessor S3 uploads (#3493)
brockwade633 Dec 8, 2022
11d2475
prepare release v2.121.0
Dec 8, 2022
24171b5
update development version to v2.121.1.dev0
Dec 8, 2022
d5847d5
Fix: Differentiate SageMaker Training Compiler's PT DLCs from base PT…
Lokiiiiii Dec 8, 2022
3f6ea88
fix: Fix failing jumpstart cache unit tests (#3514)
evakravi Dec 8, 2022
4570aa6
fix: Pop out ModelPackageName from pipeline definition (#3472)
qidewenwhen Dec 9, 2022
959ea1a
prepare release v2.121.1
Dec 9, 2022
b2e8b66
update development version to v2.121.2.dev0
Dec 9, 2022
355975d
fix: Skip Bad Transform Test (#3521)
amzn-choeric Dec 9, 2022
fadc817
fix: Revert "fix: type hint of PySparkProcessor __init__" (#3524)
mufaddal-rohawala Dec 9, 2022
c5fc93f
change: Update for Tensorflow Serving 2.11 inference DLCs (#3509)
hballuru Dec 9, 2022
ec8da98
prepare release v2.121.2
Dec 12, 2022
0352122
update development version to v2.121.3.dev0
Dec 12, 2022
d6c0214
feature: Add OSU region to frameworks for DLC (#3532)
kace Dec 12, 2022
5af4feb
fix: Remove content type image/jpg from analysis configuration schema…
xgchena Dec 12, 2022
4389847
fix: unpin packaging version (#3533)
claytonparnell Dec 13, 2022
a3efddf
fix: the Hyperband support fix for the HPO (#3516)
repushko Dec 13, 2022
bd96ec5
feature: Feature Store dataset builder, delete_record, get_record, li…
mizanfiu Dec 14, 2022
fb3880f
prepare release v2.122.0
Dec 14, 2022
a584ea5
update development version to v2.122.1.dev0
Dec 14, 2022
93a8466
feature: Add SageMaker Experiment (#3536)
qidewenwhen Dec 14, 2022
127de9a
support TF2.10.1 training DLC
tejaschumbalkar Dec 5, 2022
1cbfc83
feature: Add support for TF2.9.2 training images (#3178)
tejaschumbalkar Dec 14, 2022
389d78f
black format
tejaschumbalkar Dec 14, 2022
906791b
Merge branch 'tf2.10.1-release' of https://github.com/tejaschumbalkar…
tejaschumbalkar Dec 14, 2022
a4dd518
Merge branch 'master' into tf2.10.1-release
tejaschumbalkar Dec 14, 2022
4f507f7
Merge 'upstream/master' into tf2.10.1-release
tejaschumbalkar Jan 3, 2023
78ee819
revert unwanted changes
tejaschumbalkar Jan 3, 2023
920c2a3
add tf2.10.0 config
tejaschumbalkar Jan 3, 2023
5 changes: 3 additions & 2 deletions .gitignore
@@ -30,5 +30,6 @@ env/
.vscode/
**/tmp
.python-version
-**/_repack_model.py
-**/_repack_script_launcher.sh
+**/_repack_script_launcher.sh
+tests/data/**/_repack_model.py
+tests/data/experiment/sagemaker-dev-1.0.tar.gz
80 changes: 80 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,85 @@
# Changelog

## v2.122.0 (2022-12-14)

### Features

* Feature Store dataset builder, delete_record, get_record, list_feature_group
* Add OSU region to frameworks for DLC

### Bug Fixes and Other Changes

* the Hyperband support fix for the HPO
* unpin packaging version
* Remove content type image/jpg from analysis configuration schema

## v2.121.2 (2022-12-12)

### Bug Fixes and Other Changes

* Update for Tensorflow Serving 2.11 inference DLCs
* Revert "fix: type hint of PySparkProcessor __init__"
* Skip Bad Transform Test

## v2.121.1 (2022-12-09)

### Bug Fixes and Other Changes

* Pop out ModelPackageName from pipeline definition
* Fix failing jumpstart cache unit tests

## v2.121.0 (2022-12-08)

### Features

* Algorithms Region Expansion OSU/DXB

### Bug Fixes and Other Changes

* FrameworkProcessor S3 uploads
* Add constraints file for apache-airflow

## v2.120.0 (2022-12-07)

### Features

* Add Neo image uri config for Pytorch 1.12
* Adding support for SageMaker Training Compiler in PyTorch estimator starting 1.12
* Update registries with new region account number mappings.
* Add DXB region to frameworks by DLC

### Bug Fixes and Other Changes

* support idempotency for framework and spark processors

## v2.119.0 (2022-12-03)

### Features

* Add Code Owners file
* Added transform with monitoring pipeline step in transformer
* Update TF 2.9 and TF 2.10 inference DLCs
* make estimator accept json file as modelparallel config
* SageMaker Training Compiler does not support p4de instances
* Add support for SparkML v3.3

### Bug Fixes and Other Changes

* Fix bug forcing uploaded tar to be named sourcedir
* Update local_requirements.txt PyYAML version
* refactoring : using with statement
* Allow Py 3.7 for MMS Test Docker env
* fix PySparkProcessor __init__ params type
* type hint of PySparkProcessor __init__
* Return ARM XGB/SKLearn tags if `image_scope` is `inference_graviton`
* Update scipy to 1.7.3 to support M1 development envs
* Fixing type hints for Spark processor that has instance type/count params in reverse order
* Add DeepAR ap-northeast-3 repository.
* Fix AsyncInferenceConfig documentation typo
* fix ml_inf to ml_inf1 in Neo multi-version support
* Fix type annotations
* add neo mvp region accounts

## v2.118.0 (2022-12-01)

### Features
1 change: 1 addition & 0 deletions CODEOWNERS
@@ -0,0 +1 @@
* @aws/sagemaker-ml-frameworks
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-2.118.1.dev0
+2.122.1.dev0
10 changes: 10 additions & 0 deletions doc/experiments/index.rst
@@ -0,0 +1,10 @@
############################
Amazon SageMaker Experiments
############################

The SageMaker Python SDK supports tracking and organizing your machine learning workflows across SageMaker jobs, such as Processing, Training, and Transform, or locally.

.. toctree::
:maxdepth: 2

sagemaker.experiments
20 changes: 20 additions & 0 deletions doc/experiments/sagemaker.experiments.rst
@@ -0,0 +1,20 @@
Experiments
============

Run
-------------

.. autoclass:: sagemaker.experiments.Run
:members:

.. automethod:: sagemaker.experiments.load_run

.. automethod:: sagemaker.experiments.list_runs

.. autoclass:: sagemaker.experiments.SortByType
:members:
:undoc-members:

.. autoclass:: sagemaker.experiments.SortOrderType
:members:
:undoc-members:
10 changes: 10 additions & 0 deletions doc/index.rst
@@ -60,6 +60,16 @@ Orchestrate your SageMaker training and inference workflows with Airflow and Kub
workflows/index


****************************
Amazon SageMaker Experiments
****************************
You can use Amazon SageMaker Experiments to track machine learning experiments.

.. toctree::
:maxdepth: 2

experiments/index

*************************
Amazon SageMaker Debugger
*************************
2 changes: 2 additions & 0 deletions requirements/extras/test_requirements.txt
@@ -11,6 +11,7 @@ contextlib2==21.6.0
awslogs==0.14.0
black==22.3.0
stopit==1.1.2
# Update tox.ini to have correct version of airflow constraints file
apache-airflow==2.4.1
apache-airflow-providers-amazon==4.0.0
attrs==22.1.0
@@ -19,3 +20,4 @@ requests==2.27.1
sagemaker-experiments==0.1.35
Jinja2==3.0.3
pandas>=1.3.5,<1.5
scikit-learn==1.0.2
2 changes: 1 addition & 1 deletion setup.py
@@ -48,7 +48,7 @@ def read_requirements(filename):
# Declare minimal set for installation
required_packages = [
"attrs>=20.3.0,<23",
-"boto3>=1.26.20,<2.0",
+"boto3>=1.26.28,<2.0",
"google-pasta",
"numpy>=1.9.0,<2.0",
"protobuf>=3.1,<4.0",
7 changes: 4 additions & 3 deletions src/sagemaker/amazon/amazon_estimator.py
@@ -27,7 +27,7 @@
from sagemaker.deprecations import renamed_warning
from sagemaker.estimator import EstimatorBase, _TrainingJob
from sagemaker.inputs import FileSystemInput, TrainingInput
-from sagemaker.utils import sagemaker_timestamp
+from sagemaker.utils import sagemaker_timestamp, check_and_get_run_experiment_config
from sagemaker.workflow.entities import PipelineVariable
from sagemaker.workflow.pipeline_context import runnable_by_pipeline
from sagemaker.workflow import is_pipeline_variable
@@ -242,8 +242,8 @@ def fit(
generates a default job name, based on the training image name
and current timestamp.
experiment_config (dict[str, str]): Experiment management configuration.
-Optionally, the dict can contain three keys:
-'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.
+Optionally, the dict can contain four keys:
+'ExperimentName', 'TrialName', 'TrialComponentDisplayName' and 'RunName'.
The behavior of setting these keys is as follows:
* If `ExperimentName` is supplied but `TrialName` is not a Trial will be
automatically created and the job's Trial Component associated with the Trial.
@@ -255,6 +255,7 @@ def fit(
"""
self._prepare_for_training(records, job_name=job_name, mini_batch_size=mini_batch_size)

+experiment_config = check_and_get_run_experiment_config(experiment_config)
self.latest_training_job = _TrainingJob.start_new(
self, records, experiment_config=experiment_config
)
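The docstring in this hunk names four optional keys for `experiment_config`. A minimal sketch of checking a dict against that list follows; `unknown_experiment_keys` and `ALLOWED_EXPERIMENT_KEYS` are hypothetical names used for illustration only and are not part of the SageMaker SDK.

```python
# Hypothetical helper, not part of the SageMaker SDK: it only illustrates the
# four optional keys named in the fit() docstring.
ALLOWED_EXPERIMENT_KEYS = frozenset(
    {"ExperimentName", "TrialName", "TrialComponentDisplayName", "RunName"}
)


def unknown_experiment_keys(experiment_config):
    """Return the keys in experiment_config that the docstring does not list."""
    if not experiment_config:
        return []
    return sorted(set(experiment_config) - ALLOWED_EXPERIMENT_KEYS)


config = {"ExperimentName": "my-experiment", "RunName": "run-1"}
typo_config = {"ExperimentName": "my-experiment", "Runname": "run-1"}

print(unknown_experiment_keys(config))
print(unknown_experiment_keys(typo_config))
```

A check like this could catch a misspelled key before a job is started; whether the SDK itself performs such validation is not shown in this diff.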
6 changes: 4 additions & 2 deletions src/sagemaker/apiutils/_base_types.py
@@ -173,8 +173,10 @@ def _search(
search_items = search_method_response.get("Results", [])
next_token = search_method_response.get(boto_next_token_name)
for item in search_items:
-if cls.__name__ in item:
-yield search_item_factory(item[cls.__name__])
+# _TrialComponent class in experiments module is not public currently
+class_name = cls.__name__.lstrip("_")
+if class_name in item:
+yield search_item_factory(item[class_name])
if not next_token:
break
except StopIteration:
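The change in this hunk looks up search results by the class name with leading underscores removed. A simplified sketch of that lookup follows; the `matching_items` function and the sample data are illustrative assumptions, not the SDK's `_search` implementation.

```python
# Simplified stand-in for the loop in _search; the class and data are illustrative.
class _TrialComponent:
    """Private class whose search results are keyed by the public name."""


def matching_items(cls, search_items):
    # lstrip("_") removes every leading underscore, so "_TrialComponent"
    # becomes "TrialComponent", the key assumed in the search results here.
    class_name = cls.__name__.lstrip("_")
    for item in search_items:
        if class_name in item:
            yield item[class_name]


results = [
    {"TrialComponent": {"TrialComponentName": "tc-1"}},
    {"Experiment": {"ExperimentName": "exp-1"}},
]
found = list(matching_items(_TrialComponent, results))
print(found)
```

Without the `lstrip`, the lookup key would be `"_TrialComponent"`, which matches neither entry in the sample results.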
4 changes: 3 additions & 1 deletion src/sagemaker/apiutils/_boto_functions.py
@@ -68,7 +68,9 @@ def from_boto(boto_dict, boto_name_to_member_name, member_name_to_type):
api_type, is_collection = member_name_to_type[member_name]
if is_collection:
if isinstance(boto_value, dict):
-member_value = api_type.from_boto(boto_value)
+member_value = {
+key: api_type.from_boto(value) for key, value in boto_value.items()
+}
else:
member_value = [api_type.from_boto(item) for item in boto_value]
else:
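The hunk above changes how a dict-valued collection is converted: each value is converted and the keys are kept. A self-contained sketch of the three cases follows. `Item` and `decode_member` are hypothetical stand-ins, and the singleton branch is an assumption because the corresponding `else` body is not visible in the diff.

```python
class Item:
    """Stand-in for an ApiObject subclass; only from_boto is modeled."""

    def __init__(self, name):
        self.name = name

    @classmethod
    def from_boto(cls, boto_dict):
        return cls(boto_dict["Name"])


def decode_member(boto_value, api_type, is_collection):
    if is_collection:
        if isinstance(boto_value, dict):
            # Dict collection: keep the keys, convert each value.
            return {key: api_type.from_boto(value) for key, value in boto_value.items()}
        # List collection: convert each element.
        return [api_type.from_boto(item) for item in boto_value]
    # Assumed singleton branch (elided in the hunk): convert the dict itself.
    return api_type.from_boto(boto_value)


by_key = decode_member({"a": {"Name": "x"}, "b": {"Name": "y"}}, Item, True)
as_list = decode_member([{"Name": "x"}], Item, True)
single = decode_member({"Name": "z"}, Item, False)
```

In this sketch, passing a singleton dict such as `{"Name": "z"}` with `is_collection=True` would apply `from_boto` to each field value instead of to the dict as a whole, which is consistent with the flag being set to `False` for singleton members in `dataset_definition/inputs.py`.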
1 change: 0 additions & 1 deletion src/sagemaker/clarify.py
@@ -282,7 +282,6 @@
"text/csv",
"application/jsonlines",
"image/jpeg",
-"image/jpg",
"image/png",
"application/x-npy",
),
6 changes: 4 additions & 2 deletions src/sagemaker/dataset_definition/inputs.py
@@ -124,8 +124,10 @@ class DatasetDefinition(ApiObject):
"""DatasetDefinition input."""

_custom_boto_types = {
-"redshift_dataset_definition": (RedshiftDatasetDefinition, True),
-"athena_dataset_definition": (AthenaDatasetDefinition, True),
+# RedshiftDatasetDefinition and AthenaDatasetDefinition are not collections.
+# They are singleton objects, so the is_collection flag is set to False.
+"redshift_dataset_definition": (RedshiftDatasetDefinition, False),
+"athena_dataset_definition": (AthenaDatasetDefinition, False),
}

def __init__(
16 changes: 10 additions & 6 deletions src/sagemaker/estimator.py
@@ -79,6 +79,7 @@
get_config_value,
name_from_base,
to_string,
+check_and_get_run_experiment_config,
)
from sagemaker.workflow import is_pipeline_variable
from sagemaker.workflow.entities import PipelineVariable
@@ -1103,8 +1104,8 @@ def fit(
job_name (str): Training job name. If not specified, the estimator generates
a default job name based on the training image name and current timestamp.
experiment_config (dict[str, str]): Experiment management configuration.
-Optionally, the dict can contain three keys:
-'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.
+Optionally, the dict can contain four keys:
+'ExperimentName', 'TrialName', 'TrialComponentDisplayName' and 'RunName'.
The behavior of setting these keys is as follows:
* If `ExperimentName` is supplied but `TrialName` is not a Trial will be
automatically created and the job's Trial Component associated with the Trial.
@@ -1122,6 +1123,7 @@ def fit(
"""
self._prepare_for_training(job_name=job_name)

+experiment_config = check_and_get_run_experiment_config(experiment_config)
self.latest_training_job = _TrainingJob.start_new(self, inputs, experiment_config)
self.jobs.append(self.latest_training_job)
if wait:
@@ -2023,8 +2025,8 @@ def start_new(cls, estimator, inputs, experiment_config):
inputs (str): Parameters used when called
:meth:`~sagemaker.estimator.EstimatorBase.fit`.
experiment_config (dict[str, str]): Experiment management configuration.
-Optionally, the dict can contain three keys:
-'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.
+Optionally, the dict can contain four keys:
+'ExperimentName', 'TrialName', 'TrialComponentDisplayName' and 'RunName'.
The behavior of setting these keys is as follows:
* If `ExperimentName` is supplied but `TrialName` is not a Trial will be
automatically created and the job's Trial Component associated with the Trial.
@@ -2033,6 +2035,7 @@ def start_new(cls, estimator, inputs, experiment_config):
* If both `ExperimentName` and `TrialName` are not supplied the trial component
will be unassociated.
* `TrialComponentDisplayName` is used for display in Studio.
+* `RunName` is used to record an experiment run.
Returns:
sagemaker.estimator._TrainingJob: Constructed object that captures
all information about the started training job.
@@ -2053,8 +2056,8 @@ def _get_train_args(cls, estimator, inputs, experiment_config):
inputs (str): Parameters used when called
:meth:`~sagemaker.estimator.EstimatorBase.fit`.
experiment_config (dict[str, str]): Experiment management configuration.
-Optionally, the dict can contain three keys:
-'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.
+Optionally, the dict can contain four keys:
+'ExperimentName', 'TrialName', 'TrialComponentDisplayName' and 'RunName'.
The behavior of setting these keys is as follows:
* If `ExperimentName` is supplied but `TrialName` is not a Trial will be
automatically created and the job's Trial Component associated with the Trial.
@@ -2063,6 +2066,7 @@ def _get_train_args(cls, estimator, inputs, experiment_config):
* If both `ExperimentName` and `TrialName` are not supplied the trial component
will be unassociated.
* `TrialComponentDisplayName` is used for display in Studio.
+* `RunName` is used to record an experiment run.

Returns:
Dict: dict for `sagemaker.session.Session.train` method
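The docstrings in these hunks describe how the presence of `ExperimentName` and `TrialName` affects trial component association. The sketch below encodes only the rules that are visible in the diff; `describe_association` is a hypothetical helper, and cases whose rule is elided here return `None` instead of guessing.

```python
def describe_association(experiment_config):
    """Map the rules visible in the docstring to a short description.

    Illustrative only; cases whose rule is not shown in the diff return None.
    """
    config = experiment_config or {}
    has_experiment = "ExperimentName" in config
    has_trial = "TrialName" in config
    if has_experiment and not has_trial:
        return "trial created automatically; trial component associated with it"
    if not has_experiment and not has_trial:
        return "trial component left unassociated"
    return None  # the rule for a supplied TrialName is elided in the diff


print(describe_association({"ExperimentName": "my-experiment"}))
print(describe_association({"TrialComponentDisplayName": "training"}))
```

`TrialComponentDisplayName` and `RunName` do not change the outcome in this sketch, since the docstring describes them as a display name for Studio and a name for recording an experiment run, respectively.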
20 changes: 20 additions & 0 deletions src/sagemaker/experiments/__init__.py
@@ -0,0 +1,20 @@
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
# http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.
"""Sagemaker Experiment Module"""
from __future__ import absolute_import

from sagemaker.experiments.run import Run # noqa: F401
from sagemaker.experiments.run import load_run # noqa: F401
from sagemaker.experiments.run import list_runs # noqa: F401
from sagemaker.experiments.run import SortOrderType # noqa: F401
from sagemaker.experiments.run import SortByType # noqa: F401