Commit 37d93a9 (parent d77586b)

documentation: rewording

2 files changed (+12, -9 lines)

doc/api/training/smd_data_parallel_release_notes/smd_data_parallel_change_log.rst

Lines changed: 7 additions & 3 deletions
@@ -14,12 +14,16 @@ SageMaker Distributed Data Parallel 1.6.0 Release Notes
 
 **New Features**
 
-* New SMDDP Collectives support for the SageMaker model parallelism library’s sharded data parallelism operating AllGather. For more information, see `Sharded data parallelism with SMDDP Collectives <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-smddp-collectives>`_ in the Amazon SageMaker Developer Guide.
-* Added support for Amazon EC2 ml.p4de.24xlarge instances.
+* New optimized SMDDP AllGather collective to complement the sharded data parallelism technique
+  in the SageMaker model parallelism library. For more information, see `Sharded data parallelism with SMDDP Collectives
+  <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-smddp-collectives>`_
+  in the *Amazon SageMaker Developer Guide*.
+* Added support for Amazon EC2 ``ml.p4de.24xlarge`` instances. You can run data parallel training jobs
+  on ``ml.p4de.24xlarge`` instances with the SageMaker data parallelism library’s AllReduce collective.
 
 **Improvements**
 
-* Improved general performance of the SMDDP AllReduce collective communication operation.
+* General performance improvements of the SMDDP AllReduce collective communication operation.
 
 **Migration to AWS Deep Learning Containers**
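To illustrate the ``ml.p4de.24xlarge`` note above, here is a minimal, hedged sketch of launching a data parallel training job with the SMDDP AllReduce backend through the SageMaker Python SDK; the entry point, IAM role, and framework/Python versions are illustrative placeholders rather than values taken from this commit::

    from sagemaker.pytorch import PyTorch

    # Sketch only: enable the SMDDP (AllReduce) backend on ml.p4de.24xlarge instances.
    # entry_point, role, and version strings are illustrative placeholders.
    estimator = PyTorch(
        entry_point="train.py",                                # hypothetical training script
        role="arn:aws:iam::111122223333:role/SageMakerRole",   # placeholder IAM role
        instance_type="ml.p4de.24xlarge",
        instance_count=2,
        framework_version="1.12.1",                            # assumed PyTorch DLC version
        py_version="py38",
        distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
    )
    estimator.fit()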

doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst

Lines changed: 5 additions & 6 deletions
@@ -13,26 +13,25 @@ SageMaker Distributed Model Parallel 1.13.0 Release Notes
 
 **New Features**
 
-* Sharded data parallelism now supports a new backend for collectives, SMDDP. For supported scenarios
-  this is used by default for AllGather. For more information, see
+* Sharded data parallelism now supports a new backend for collectives called *SMDDP Collectives*.
+  For supported scenarios, SMDDP Collectives are on by default for the AllGather operation.
+  For more information, see
   `Sharded data parallelism with SMDDP Collectives
   <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-smddp-collectives>`_
-  in the Amazon SageMaker Developer Guide.
+  in the *Amazon SageMaker Developer Guide*.
 * Introduced FlashAttention for DistributedTransformer to improve memory usage and computational
   performance of models such as GPT2, GPTNeo, GPTJ, GPTNeoX, BERT, and RoBERTa.
 
 **Bug Fixes**
 
-* Fixed initialization of lm_head in DistributedTransformer to use a provided range
+* Fixed initialization of ``lm_head`` in DistributedTransformer to use a provided range
   for initialization, when weights are not tied with the embeddings.
 
 **Improvements**
 
 * When a module has no parameters, we have introduced an optimization to execute
   such a module on the same rank as its parent during pipeline parallelism.
 
-
-
 **Migration to AWS Deep Learning Containers**
 
 This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):
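For context on the new SMDDP Collectives backend above, a hedged sketch of selecting it for sharded data parallelism through the SageMaker Python SDK follows; the ``ddp_dist_backend`` and ``sharded_data_parallel_degree`` parameter names reflect a reading of the linked Developer Guide page, and the degrees, process counts, and other values are assumptions, not part of this commit::

    from sagemaker.pytorch import PyTorch

    # Sketch only: sharded data parallelism with the SMDDP Collectives backend.
    smp_parameters = {
        "ddp": True,
        "ddp_dist_backend": "auto",          # assumed switch that lets SMP pick SMDDP Collectives
        "sharded_data_parallel_degree": 8,   # illustrative sharding degree
    }

    estimator = PyTorch(
        entry_point="train_gpt.py",                            # hypothetical training script
        role="arn:aws:iam::111122223333:role/SageMakerRole",   # placeholder IAM role
        instance_type="ml.p4d.24xlarge",
        instance_count=2,
        framework_version="1.12.1",                            # assumed PyTorch DLC version
        py_version="py38",
        distribution={
            "mpi": {"enabled": True, "processes_per_host": 8},
            "smdistributed": {"modelparallel": {"enabled": True, "parameters": smp_parameters}},
        },
    )
    estimator.fit()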
