
Commit 0eade55

documentation: SM model parallel library v1.15.0 release note (#3806)
1 parent 9dbead8 commit 0eade55

File tree: 3 files changed, +94 -8 lines changed


doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst (+78 -6)

@@ -3,12 +3,88 @@ Release Notes
 #############

 New features, bug fixes, and improvements are regularly made to the SageMaker
-distributed model parallel library.
+model parallelism library.


-SageMaker Distributed Model Parallel 1.14.0 Release Notes
+SageMaker Distributed Model Parallel 1.15.0 Release Notes
 =========================================================

+*Date: Apr. 27. 2023*
+
+**Currency Updates**
+
+* Added support for PyTorch v2.0.0.
+  Note that the library does not support ``torch.compile`` in this release.
+
+**New Features**
+
+* Sharded data parallelism can now be used together with tensor parallelism
+  for PyTorch 1.13.1. This allows you to train with smaller global batch
+  sizes while scaling up to large clusters. For more information, see `Sharded
+  data parallelism with tensor parallelism <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-with-tensor-parallelism>`_
+  in the *Amazon SageMaker Developer Guide*.
+* Added support for saving and loading full model checkpoints when using sharded
+  data parallelism. This is enabled by using the standard checkpointing API,
+  ``smp.save_checkpoint`` with ``partial=False``.
+  Previously, full checkpoints had to be created by merging partial checkpoint
+  files after training finished.
+* `DistributedTransformer <https://sagemaker.readthedocs.io/en/stable/api/training/smp_versions/latest/smd_model_parallel_pytorch_tensor_parallel.html#smdistributed.modelparallel.torch.nn.DistributedTransformerLayer>`_
+  now supports ALiBi position embeddings.
+  When using DistributedTransformer, you can set the ``use_alibi`` parameter
+  to ``True`` to use the Triton-based flash attention kernels. This helps
+  evaluate sequences longer than those used for training.
+
+**Bug Fixes**
+
+* When using tensor parallelism, parameters were initialized multiple times
+  unnecessarily. This release fixes the multiple initialization of parameters
+  so that each parameter is initialized exactly once.
+  This not only saves time, but also ensures that the random generator behavior
+  is similar to the non-tensor-parallelism case.
+
+**Known Issues**
+
+* Model initialization might take longer with PyTorch 2.0 than with PyTorch 1.13.
+
+**Migration to AWS Deep Learning Containers**
+
+This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):
+
+- SageMaker training container for PyTorch v2.0.0
+
+  .. code::
+
+     763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:2.0.0-gpu-py310-cu118-ubuntu20.04-sagemaker
+
+- SageMaker training container for PyTorch v1.13.1
+
+  .. code::
+
+     763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:1.13.1-gpu-py39-cu117-ubuntu20.04-sagemaker
+
+Binary file of this version of the library for `custom container
+<https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-sm-sdk.html#model-parallel-bring-your-own-container>`_ users:
+
+- For PyTorch v2.0.0
+
+  .. code::
+
+     https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-2.0.0/build-artifacts/2023-04-14-20-14/smdistributed_modelparallel-1.15.0-cp310-cp310-linux_x86_64.whl
+
+- For PyTorch v1.13.1
+
+  .. code::
+
+     https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.13.1/build-artifacts/2023-04-17-15-49/smdistributed_modelparallel-1.15.0-cp39-cp39-linux_x86_64.whl
+
+----
+
+Release History
+===============
+
+SageMaker Distributed Model Parallel 1.14.0 Release Notes
+---------------------------------------------------------
+
 *Date: Jan. 30. 2023*

 **Currency Updates**
@@ -39,10 +115,6 @@ Binary file of this version of the library for `custom container

    https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.13.1/build-artifacts/2023-01-19-18-35/smdistributed_modelparallel-1.14.0-cp39-cp39-linux_x86_64.whl

-----
-
-Release History
-===============

 SageMaker Distributed Model Parallel 1.13.0 Release Notes
 ---------------------------------------------------------
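
The sharded data parallelism with tensor parallelism feature in the release note above is configured through the SageMaker Python SDK estimator's ``distribution`` argument. The following is a minimal sketch, assuming the parameter names ``sharded_data_parallel_degree`` and ``tensor_parallel_degree`` documented in the linked Developer Guide page; the script name, role, degrees, and instance settings are placeholders to adapt.

.. code:: python

   # Sketch only: launch an SMP v1.15.0 job that combines sharded data
   # parallelism with tensor parallelism (PyTorch 1.13.1). Verify parameter
   # names against the Developer Guide page linked in the release note above.
   from sagemaker.pytorch import PyTorch

   smp_options = {
       "enabled": True,
       "parameters": {
           "ddp": True,
           "sharded_data_parallel_degree": 8,  # shard model states across 8 ranks
           "tensor_parallel_degree": 2,        # usable together with sharding as of v1.15.0
       },
   }

   estimator = PyTorch(
       entry_point="train.py",                 # placeholder training script
       role="<your-iam-role-arn>",             # placeholder IAM role
       instance_type="ml.p4d.24xlarge",
       instance_count=2,                       # 16 GPUs total = 8 (sharding) x 2 (tensor parallel)
       framework_version="1.13.1",             # matches the PyTorch 1.13.1 DLC listed above
       py_version="py39",
       distribution={
           "smdistributed": {"modelparallel": smp_options},
           "mpi": {"enabled": True, "processes_per_host": 8},
       },
   )
   estimator.fit()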
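
The full-checkpoint support described above uses the library's standard checkpointing API. A minimal sketch, assuming the documented ``smp.save_checkpoint`` and ``smp.resume_from_checkpoint`` signatures and a model already wrapped by the library; the path and tag are placeholders.

.. code:: python

   # Sketch only: save and reload a full (non-partial) checkpoint under
   # sharded data parallelism, new in SMP v1.15.0.
   import torch
   import torch.nn as nn
   import smdistributed.modelparallel.torch as smp

   smp.init()  # configuration comes from the training job's distribution settings

   model = smp.DistributedModel(nn.Linear(1024, 1024))
   optimizer = smp.DistributedOptimizer(torch.optim.Adam(model.parameters(), lr=1e-4))

   # ... training loop elided ...

   # partial=False writes a single consolidated checkpoint instead of
   # per-rank partial files, so no post-training merge step is needed.
   smp.save_checkpoint(
       path="/opt/ml/checkpoints",
       tag="epoch_10",
       partial=False,
       model=model,
       optimizer=optimizer,
   )

   # The full checkpoint can later be loaded back through the matching API.
   smp.resume_from_checkpoint("/opt/ml/checkpoints", tag="epoch_10", partial=False)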

doc/api/training/smp_versions/latest.rst (+2 -2)

@@ -10,8 +10,8 @@ depending on which version of the library you need to use.
 To use the library, reference the
 **Common API** documentation alongside the framework specific API documentation.

-Version 1.11.0, 1.13.0, 1.14.0 (Latest)
-=======================================
+Version 1.11.0, 1.13.0, 1.14.0, 1.15.0 (Latest)
+===============================================

 To use the library, reference the Common API documentation alongside the framework specific API documentation.

doc/api/training/smp_versions/latest/smd_model_parallel_pytorch_tensor_parallel.rst (+14 -0)

@@ -302,6 +302,20 @@ Tensor Parallelism Module APIs
 - ``post_layernorm``: If ``True``, inserts layer normalization at
   the output. At least one of ``pre_layernorm`` and
   ``post_layernorm`` must be ``True``.
+- ``use_alibi`` (bool, default False): Activates Attention with
+  Linear Biases (ALiBi) for attention computation.
+  ALiBi facilitates efficient extrapolation on input sequences
+  and thus improves training efficiency.
+  The library enables ALiBi by using the `Triton
+  flash attention kernel
+  <https://github.com/HazyResearch/flash-attention>`_.
+  Refer to https://arxiv.org/abs/2108.12409 for more
+  details on the technique.
+  (Available from
+  the SageMaker model parallelism library v1.15.0.)
+- ``alibi_bias_max`` (int, default 8): Defines the ALiBi base
+  value for mask generation. (Available from
+  the SageMaker model parallelism library v1.15.0.)

 - **Methods:**
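
To illustrate the two parameters added above, here is a minimal sketch of constructing ``DistributedTransformerLayer`` with ALiBi enabled; the sizes are illustrative, and the other arguments follow the parameter list documented in this file.

.. code:: python

   # Sketch only: enable ALiBi (v1.15.0+) on a tensor-parallel transformer layer.
   import smdistributed.modelparallel.torch as smp

   smp.init()  # tensor parallelism degree comes from the job's distribution settings

   layer = smp.nn.DistributedTransformerLayer(
       hidden_size=4096,
       num_attention_heads=32,
       intermediate_size=16384,
       pre_layernorm=True,      # at least one of pre/post layernorm must be True
       post_layernorm=False,
       use_alibi=True,          # new in v1.15.0: Triton flash-attention kernel with ALiBi biases
       alibi_bias_max=8,        # new in v1.15.0: base value for ALiBi mask generation
   )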
