Commit 9ae2b85

documentation: smdistributed libraries release notes
1 parent eef679c commit 9ae2b85

4 files changed: +94 -18 lines changed

doc/api/training/sdp_versions/latest.rst

+2-2
@@ -26,8 +26,8 @@ depending on the version of the library you use.
2626
<https://docs.aws.amazon.com/sagemaker/latest/dg/data-parallel-use-api.html#data-parallel-use-python-skd-api>`_
2727
for more information.
2828

29-
Version 1.4.0, 1.4.1, 1.5.0 (Latest)
30-
====================================
29+
Version 1.4.0, 1.4.1, 1.5.0, 1.6.0 (Latest)
30+
===========================================
3131

3232
.. toctree::
3333
:maxdepth: 1

doc/api/training/smd_data_parallel_release_notes/smd_data_parallel_change_log.rst

+36-7
@@ -7,9 +7,44 @@ Release Notes
77
New features, bug fixes, and improvements are regularly made to the SageMaker
88
distributed data parallel library.
99

10-
SageMaker Distributed Data Parallel 1.5.0 Release Notes
10+
SageMaker Distributed Data Parallel 1.6.0 Release Notes
1111
=======================================================
1212

13+
*Date: Dec. 15. 2022*
14+
15+
**New Features**
16+
17+
* Added SMDDP Collectives support for the AllGather operation used by the SageMaker model parallelism library’s sharded data parallelism. For more information, see `Sharded data parallelism with SMDDP Collectives <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-smddp-collectives>`_ in the Amazon SageMaker Developer Guide.
18+
* Added support for Amazon EC2 ml.p4de.24xlarge instances.
19+
20+
21+
**Migration to AWS Deep Learning Containers**
22+
23+
This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):
24+
25+
- SageMaker training container for PyTorch v1.12.1
26+
27+
.. code::
28+
29+
763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.12.1-gpu-py38-cu113-ubuntu20.04-sagemaker
30+
31+
32+
Binary file of this version of the library for `custom container
33+
<https://docs.aws.amazon.com/sagemaker/latest/dg/data-parallel-use-api.html#data-parallel-bring-your-own-container>`_ users:
34+
35+
.. code::
36+
37+
https://smdataparallel.s3.amazonaws.com/binary/pytorch/1.12.1/cu113/2022-12-05/smdistributed_dataparallel-1.6.0-cp38-cp38-linux_x86_64.whl
38+
39+
40+
----
41+
42+
Release History
43+
===============
44+
45+
SageMaker Distributed Data Parallel 1.5.0 Release Notes
46+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
47+
1348
*Date: Jul. 26. 2022*
1449

1550
**Currency Updates**
@@ -38,12 +73,6 @@ Binary file of this version of the library for `custom container
3873
3974
https://smdataparallel.s3.amazonaws.com/binary/pytorch/1.12.0/cu113/2022-07-01/smdistributed_dataparallel-1.5.0-cp38-cp38-linux_x86_64.whl
4075
41-
42-
----
43-
44-
Release History
45-
===============
46-
4776
SageMaker Distributed Data Parallel 1.4.1 Release Notes
4877
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4978

doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst

+54-7
@@ -6,9 +6,61 @@ New features, bug fixes, and improvements are regularly made to the SageMaker
66
distributed model parallel library.
77

88

9-
SageMaker Distributed Model Parallel 1.11.0 Release Notes
9+
SageMaker Distributed Model Parallel 1.13.0 Release Notes
1010
=========================================================
1111

12+
*Date: Dec. 15. 2022*
13+
14+
**New Features**
15+
16+
* Sharded data parallelism now supports a new backend for collectives, SMDDP. In supported scenarios,
17+
SMDDP is used by default for AllGather. For more information, see
18+
`Sharded data parallelism with SMDDP Collectives
19+
<https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-smddp-collectives>`_
20+
in the Amazon SageMaker Developer Guide.
21+
* Introduced FlashAttention for DistributedTransformer to improve memory usage and computational
22+
performance of models such as GPT2, GPTNeo, GPTJ, GPTNeoX, BERT, and RoBERTa.
23+
24+
**Bug Fixes**
25+
26+
* Fixed initialization of lm_head in DistributedTransformer to use the provided initialization range
27+
when its weights are not tied with the embeddings.
28+
29+
**Improvements**
30+
31+
* Introduced an optimization that executes a module with no parameters
32+
on the same rank as its parent during pipeline parallelism.
33+
34+
35+
36+
**Migration to AWS Deep Learning Containers**
37+
38+
This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):
39+
40+
- SageMaker training container for PyTorch v1.12.1
41+
42+
.. code::
43+
44+
763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.12.1-gpu-py38-cu113-ubuntu20.04-sagemaker
45+
46+
47+
Binary file of this version of the library for `custom container
48+
<https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-sm-sdk.html#model-parallel-bring-your-own-container>`_ users:
49+
50+
- For PyTorch 1.12.1
51+
52+
.. code::
53+
54+
https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.12.1/build-artifacts/2022-12-08-21-34/smdistributed_modelparallel-1.13.0-cp38-cp38-linux_x86_64.whl
55+
56+
----
57+
58+
Release History
59+
===============
60+
61+
SageMaker Distributed Model Parallel 1.11.0 Release Notes
62+
---------------------------------------------------------
63+
1264
*Date: August. 17. 2022*
1365

1466
**New Features**
@@ -41,12 +93,7 @@ Binary file of this version of the library for `custom container
4193

4294
.. code::
4395
44-
https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.12.0/build-artifacts/2022-08-12-16-58/smdistributed_modelparallel-1.11.0-cp38-cp38-linux_x86_64.whl
45-
46-
----
47-
48-
Release History
49-
===============
96+
https://sagemaker-distribu
5097
5198
SageMaker Distributed Model Parallel 1.10.1 Release Notes
5299
---------------------------------------------------------

doc/api/training/smp_versions/latest.rst

+2-2
@@ -10,8 +10,8 @@ depending on which version of the library you need to use.
1010
To use the library, reference the
1111
**Common API** documentation alongside the framework specific API documentation.
1212

13-
Version 1.11.0 (Latest)
14-
===========================================
13+
Version 1.11.0, 1.13.0 (Latest)
14+
===============================
1515

1616
To use the library, reference the Common API documentation alongside the framework specific API documentation.
1717
