Commit f912edf

Merge remote-tracking branch 'origin' into feat/local-download-dir
2 parents d314b67 + 554952e commit f912edf

122 files changed: +13035 -1168 lines changed

.gitignore

Lines changed: 3 additions & 2 deletions
@@ -30,5 +30,6 @@ env/
 .vscode/
 **/tmp
 .python-version
-**/_repack_model.py
-**/_repack_script_launcher.sh
+**/_repack_script_launcher.sh
+tests/data/**/_repack_model.py
+tests/data/experiment/sagemaker-dev-1.0.tar.gz

CHANGELOG.md

Lines changed: 62 additions & 0 deletions
@@ -1,5 +1,67 @@
 # Changelog
 
+## v2.126.0 (2022-12-22)
+
+### Features
+
+* AutoGluon 0.6.1 image_uris
+
+### Bug Fixes and Other Changes
+
+* Fix broken link in doc
+* Do not specify S3 path for disabled profiler
+
+### Documentation Changes
+
+* fix the incorrect property reference
+
+## v2.125.0 (2022-12-19)
+
+### Features
+
+* add RandomSeed to support reproducible HPO
+
+### Bug Fixes and Other Changes
+
+* Correct SageMaker Clarify API docstrings by changing JSONPath to JMESPath
+
+## v2.124.0 (2022-12-16)
+
+### Features
+
+* Doc update for TableFormatEnum
+* Add p4de to smddp supported instance types
+* Add disable_profiler field in config and propagate changes
+* Added doc update for dataset builder
+
+### Bug Fixes and Other Changes
+
+* Use Async Inference Config when available for endpoint update
+
+### Documentation Changes
+
+* smdistributed libraries release notes
+
+## v2.123.0 (2022-12-15)
+
+### Features
+
+* Add support for TF2.9.2 training images
+* Add SageMaker Experiment
+
+## v2.122.0 (2022-12-14)
+
+### Features
+
+* Feature Store dataset builder, delete_record, get_record, list_feature_group
+* Add OSU region to frameworks for DLC
+
+### Bug Fixes and Other Changes
+
+* the Hyperband support fix for the HPO
+* unpin packaging version
+* Remove content type image/jpg from analysis configuration schema
+
 ## v2.121.2 (2022-12-12)
 
 ### Bug Fixes and Other Changes

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-2.121.3.dev0
+2.126.1.dev0

coverage.xml

Whitespace-only changes.

doc/amazon_sagemaker_model_building_pipeline.rst

Lines changed: 4 additions & 4 deletions
@@ -453,7 +453,7 @@ Example:
 str_outputParam, int_outputParam, bool_outputParam, float_outputParam
 ],
 )
-output_ref = step_lambda.OutputParameters["output1"]
+output_ref = step_lambda.properties.Outputs["output1"]
 
 Where the lambda function with :code:`arn arn:aws:lambda:us-west-2:123456789012:function:sagemaker_test_lambda`
 should output like this:
@@ -479,7 +479,7 @@ Note that the output parameters can not be nested. Otherwise, the value will be
 }
 }
 
-This will be resolved as :code:`{"output1": "{\"nested_output1\":\"my-output\"}"}` by which if you refer :code:`step_lambda.OutputParameters["output1"]["nested_output1"]` later, a non-retryable client error will be thrown.
+This will be resolved as :code:`{"output1": "{\"nested_output1\":\"my-output\"}"}` by which if you refer :code:`step_lambda.properties.Outputs["output1"]["nested_output1"]` later, a non-retryable client error will be thrown.
 
 CallbackStep
 `````````````
@@ -503,7 +503,7 @@ Example:
 inputs={"arg1": "foo", "arg2": 5, "arg3": param},
 outputs=[outputParam],
 )
-output_ref = step_callback.OutputParameters["output1]
+output_ref = step_callback.properties.Outputs["output1]
 
 The output parameters cannot be nested. If the values are nested, they will be treated as a single string value. For example, a nested output value of
 
@@ -515,7 +515,7 @@ The output parameters cannot be nested. If the values are nested, they will be t
 }
 }
 
-is resolved as :code:`{"output1": "{\"nested_output1\":\"my-output\"}"}`. If you try to refer to :code:`step_callback.OutputParameters["output1"]["nested_output1"]` this will throw a non-retryable client error.
+is resolved as :code:`{"output1": "{\"nested_output1\":\"my-output\"}"}`. If you try to refer to :code:`step_callback.properties.Outputs["output1"]["nested_output1"]` this will throw a non-retryable client error.
 
 
 QualityCheckStep
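
The nesting caveat described in the doc text of this diff can be illustrated with a minimal sketch. It assumes, as the doc text states, that a nested output value is resolved as a single JSON string. `resolve_outputs` is a hypothetical helper written only for this illustration and is not part of the SageMaker SDK; the failure shown is a local `TypeError`, not the service's non-retryable client error.

```python
import json


def resolve_outputs(raw_outputs):
    """Hypothetical stand-in: serialize nested values to a single JSON string."""
    resolved = {}
    for name, value in raw_outputs.items():
        if isinstance(value, (dict, list)):
            # Compact separators reproduce the string form shown in the doc text.
            resolved[name] = json.dumps(value, separators=(",", ":"))
        else:
            resolved[name] = value
    return resolved


resolved = resolve_outputs({"output1": {"nested_output1": "my-output"}})

# "output1" is now a plain string, so a nested lookup cannot succeed.
try:
    resolved["output1"]["nested_output1"]
    nested_lookup_failed = False
except TypeError:
    nested_lookup_failed = True

# Parsing the string explicitly recovers the nested value.
recovered = json.loads(resolved["output1"])["nested_output1"]
```

One way to avoid the problem, under the same assumption, is to keep output values flat (one scalar per output name) or to parse the string explicitly in a downstream step.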

doc/api/prep_data/feature_store.rst

Lines changed: 12 additions & 0 deletions
@@ -72,3 +72,15 @@ Inputs
 .. autoclass:: sagemaker.feature_store.inputs.FeatureValue
 :members:
 :show-inheritance:
+
+.. autoclass:: sagemaker.feature_store.inputs.TableFormatEnum
+:members:
+:show-inheritance:
+
+
+Dataset Builder
+***************
+
+.. autoclass:: sagemaker.feature_store.dataset_builder.DatasetBuilder
+:members:
+:show-inheritance:

doc/api/training/sdp_versions/latest.rst

Lines changed: 2 additions & 2 deletions
@@ -26,8 +26,8 @@ depending on the version of the library you use.
 <https://docs.aws.amazon.com/sagemaker/latest/dg/data-parallel-use-api.html#data-parallel-use-python-skd-api>`_
 for more information.
 
-Version 1.4.0, 1.4.1, 1.5.0 (Latest)
-====================================
+Version 1.4.0, 1.4.1, 1.5.0, 1.6.0 (Latest)
+===========================================
 
 .. toctree::
 :maxdepth: 1

doc/api/training/smd_data_parallel_release_notes/smd_data_parallel_change_log.rst

Lines changed: 43 additions & 7 deletions
@@ -7,9 +7,51 @@ Release Notes
 New features, bug fixes, and improvements are regularly made to the SageMaker
 distributed data parallel library.
 
-SageMaker Distributed Data Parallel 1.5.0 Release Notes
+SageMaker Distributed Data Parallel 1.6.0 Release Notes
 =======================================================
 
+*Date: Dec. 15. 2022*
+
+**New Features**
+
+* New optimized SMDDP AllGather collective to complement the sharded data parallelism technique
+in the SageMaker model parallelism library. For more information, see `Sharded data parallelism with SMDDP Collectives
+<https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-smddp-collectives>`_
+in the *Amazon SageMaker Developer Guide*.
+* Added support for Amazon EC2 ``ml.p4de.24xlarge`` instances. You can run data parallel training jobs
+on ``ml.p4de.24xlarge`` instances with the SageMaker data parallelism library’s AllReduce collective.
+
+**Improvements**
+
+* General performance improvements of the SMDDP AllReduce collective communication operation.
+
+**Migration to AWS Deep Learning Containers**
+
+This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):
+
+- SageMaker training container for PyTorch v1.12.1
+
+.. code::
+
+763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.12.1-gpu-py38-cu113-ubuntu20.04-sagemaker
+
+
+Binary file of this version of the library for `custom container
+<https://docs.aws.amazon.com/sagemaker/latest/dg/data-parallel-use-api.html#data-parallel-bring-your-own-container>`_ users:
+
+.. code::
+
+https://smdataparallel.s3.amazonaws.com/binary/pytorch/1.12.1/cu113/2022-12-05/smdistributed_dataparallel-1.6.0-cp38-cp38-linux_x86_64.whl
+
+
+----
+
+Release History
+===============
+
+SageMaker Distributed Data Parallel 1.5.0 Release Notes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 *Date: Jul. 26. 2022*
 
 **Currency Updates**
@@ -38,12 +80,6 @@ Binary file of this version of the library for `custom container
 
 https://smdataparallel.s3.amazonaws.com/binary/pytorch/1.12.0/cu113/2022-07-01/smdistributed_dataparallel-1.5.0-cp38-cp38-linux_x86_64.whl
 
-
-----
-
-Release History
-===============
-
 SageMaker Distributed Data Parallel 1.4.1 Release Notes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 

doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst

Lines changed: 53 additions & 7 deletions
@@ -6,9 +6,60 @@ New features, bug fixes, and improvements are regularly made to the SageMaker
 distributed model parallel library.
 
 
-SageMaker Distributed Model Parallel 1.11.0 Release Notes
+SageMaker Distributed Model Parallel 1.13.0 Release Notes
 =========================================================
 
+*Date: Dec. 15. 2022*
+
+**New Features**
+
+* Sharded data parallelism now supports a new backend for collectives called *SMDDP Collectives*.
+For supported scenarios, SMDDP Collectives are on by default for the AllGather operation.
+For more information, see
+`Sharded data parallelism with SMDDP Collectives
+<https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-smddp-collectives>`_
+in the *Amazon SageMaker Developer Guide*.
+* Introduced FlashAttention for DistributedTransformer to improve memory usage and computational
+performance of models such as GPT2, GPTNeo, GPTJ, GPTNeoX, BERT, and RoBERTa.
+
+**Bug Fixes**
+
+* Fixed initialization of ``lm_head`` in DistributedTransformer to use a provided range
+for initialization, when weights are not tied with the embeddings.
+
+**Improvements**
+
+* When a module has no parameters, we have introduced an optimization to execute
+such a module on the same rank as its parent during pipeline parallelism.
+
+**Migration to AWS Deep Learning Containers**
+
+This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):
+
+- SageMaker training container for PyTorch v1.12.1
+
+.. code::
+
+763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.12.1-gpu-py38-cu113-ubuntu20.04-sagemaker
+
+
+Binary file of this version of the library for `custom container
+<https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-sm-sdk.html#model-parallel-bring-your-own-container>`_ users:
+
+- For PyTorch 1.12.0
+
+.. code::
+
+https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.12.1/build-artifacts/2022-12-08-21-34/smdistributed_modelparallel-1.13.0-cp38-cp38-linux_x86_64.whl
+
+----
+
+Release History
+===============
+
+SageMaker Distributed Model Parallel 1.11.0 Release Notes
+---------------------------------------------------------
+
 *Date: August. 17. 2022*
 
 **New Features**
@@ -41,12 +92,7 @@ Binary file of this version of the library for `custom container
 
 .. code::
 
-https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.12.0/build-artifacts/2022-08-12-16-58/smdistributed_modelparallel-1.11.0-cp38-cp38-linux_x86_64.whl
-
-----
-
-Release History
-===============
+https://sagemaker-distribu
 
 SageMaker Distributed Model Parallel 1.10.1 Release Notes
 ---------------------------------------------------------

doc/api/training/smp_versions/latest.rst

Lines changed: 2 additions & 2 deletions
@@ -10,8 +10,8 @@ depending on which version of the library you need to use.
 To use the library, reference the
 **Common API** documentation alongside the framework specific API documentation.
 
-Version 1.11.0 (Latest)
-===========================================
+Version 1.11.0, 1.13.0 (Latest)
+===============================
 
 To use the library, reference the Common API documentation alongside the framework specific API documentation.
 

doc/experiments/index.rst

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+############################
+Amazon SageMaker Experiments
+############################
+
+The SageMaker Python SDK supports to track and organize your machine learning workflow across SageMaker with jobs, such as Processing, Training and Transform, or locally.
+
+.. toctree::
+:maxdepth: 2
+
+sagemaker.experiments
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+Experiments
+============
+
+Run
+-------------
+
+.. autoclass:: sagemaker.experiments.Run
+:members:
+
+.. automethod:: sagemaker.experiments.load_run
+
+.. automethod:: sagemaker.experiments.list_runs
+
+.. autoclass:: sagemaker.experiments.SortByType
+:members:
+:undoc-members:
+
+.. autoclass:: sagemaker.experiments.SortOrderType
+:members:
+:undoc-members:

doc/index.rst

Lines changed: 10 additions & 0 deletions
@@ -60,6 +60,16 @@ Orchestrate your SageMaker training and inference workflows with Airflow and Kub
 workflows/index
 
 
+****************************
+Amazon SageMaker Experiments
+****************************
+You can use Amazon SageMaker Experiments to track machine learning experiments.
+
+.. toctree::
+:maxdepth: 2
+
+experiments/index
+
 *************************
 Amazon SageMaker Debugger
 *************************

doc/overview.rst

Lines changed: 1 addition & 1 deletion
@@ -1601,7 +1601,7 @@ see the following documentation:
 - `Protect Data in Batch Transform Jobs by Using an Amazon Virtual Private Cloud <https://docs.aws.amazon.com/sagemaker/latest/dg/batch-vpc.html>`__
 - `Working with VPCs and Subnets <https://docs.aws.amazon.com/vpc/latest/userguide/working-with-vpcs.html>`__
 
-You can also reference or reuse the example VPC created for integration tests: `tests/integ/vpc_test_utils.py <tests/integ/vpc_test_utils.py>`__
+You can also reference or reuse the example VPC created for integration tests: `tests/integ/vpc_test_utils.py <../tests/integ/vpc_test_utils.py>`__
 
 To train a model using your own VPC, set the optional parameters ``subnets`` and ``security_group_ids`` on an ``Estimator``:
 

requirements/extras/test_requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -20,3 +20,4 @@ requests==2.27.1
 sagemaker-experiments==0.1.35
 Jinja2==3.0.3
 pandas>=1.3.5,<1.5
+scikit-learn==1.0.2

setup.py

Lines changed: 1 addition & 1 deletion
@@ -48,7 +48,7 @@ def read_requirements(filename):
 # Declare minimal set for installation
 required_packages = [
 "attrs>=20.3.0,<23",
-"boto3>=1.26.20,<2.0",
+"boto3>=1.26.28,<2.0",
 "google-pasta",
 "numpy>=1.9.0,<2.0",
 "protobuf>=3.1,<4.0",

src/sagemaker/amazon/amazon_estimator.py

Lines changed: 4 additions & 3 deletions
@@ -27,7 +27,7 @@
 from sagemaker.deprecations import renamed_warning
 from sagemaker.estimator import EstimatorBase, _TrainingJob
 from sagemaker.inputs import FileSystemInput, TrainingInput
-from sagemaker.utils import sagemaker_timestamp
+from sagemaker.utils import sagemaker_timestamp, check_and_get_run_experiment_config
 from sagemaker.workflow.entities import PipelineVariable
 from sagemaker.workflow.pipeline_context import runnable_by_pipeline
 from sagemaker.workflow import is_pipeline_variable
@@ -242,8 +242,8 @@ def fit(
 generates a default job name, based on the training image name
 and current timestamp.
 experiment_config (dict[str, str]): Experiment management configuration.
-Optionally, the dict can contain three keys:
-'ExperimentName', 'TrialName', and 'TrialComponentDisplayName'.
+Optionally, the dict can contain four keys:
+'ExperimentName', 'TrialName', 'TrialComponentDisplayName' and 'RunName'.
 The behavior of setting these keys is as follows:
 * If `ExperimentName` is supplied but `TrialName` is not a Trial will be
 automatically created and the job's Trial Component associated with the Trial.
@@ -255,6 +255,7 @@ def fit(
 """
 self._prepare_for_training(records, job_name=job_name, mini_batch_size=mini_batch_size)
 
+experiment_config = check_and_get_run_experiment_config(experiment_config)
 self.latest_training_job = _TrainingJob.start_new(
 self, records, experiment_config=experiment_config
 )
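
The updated docstring in this diff lists four optional keys for `experiment_config`. A minimal sketch of a key check under that description follows. `validate_experiment_config` and `ALLOWED_EXPERIMENT_CONFIG_KEYS` are hypothetical names introduced for illustration; this is not the SDK's `check_and_get_run_experiment_config`, whose implementation is not shown in this diff, and whether the SDK rejects unknown keys is an assumption of the sketch.

```python
# Key names are taken from the docstring shown in the diff.
ALLOWED_EXPERIMENT_CONFIG_KEYS = frozenset(
    {"ExperimentName", "TrialName", "TrialComponentDisplayName", "RunName"}
)


def validate_experiment_config(experiment_config):
    """Hypothetical check: return the config unchanged, or raise on unknown keys."""
    if experiment_config is None:
        # All keys are optional, and so is the dict itself.
        return None
    unknown = sorted(set(experiment_config) - ALLOWED_EXPERIMENT_CONFIG_KEYS)
    if unknown:
        raise ValueError(f"Unsupported experiment_config keys: {unknown}")
    return experiment_config


config = validate_experiment_config(
    {"ExperimentName": "my-experiment", "RunName": "my-run"}
)
```

A check of this kind would run before the dict is passed on to the training job, so that a misspelled key fails early instead of being silently ignored.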
