Commit 250990f

rm white spaces
1 parent 95d58e5 commit 250990f

File tree

2 files changed, +30 −30 lines changed


doc/api/training/smd_model_parallel_release_notes/smd_model_parallel_change_log.rst

+21 −21
@@ -18,33 +18,33 @@ SageMaker Distributed Model Parallel 1.15.0 Release Notes
 
 **New Features**
 
-* Using sharded data parallelism with tensor parallelism together is now
-  available for PyTorch 1.13.1. It allows you to train with smaller global batch
-  sizes while scaling up to large clusters. For more information, see `Sharded
-  data parallelism with tensor parallelism <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-with-tensor-parallelism>`_
+* Using sharded data parallelism with tensor parallelism together is now
+  available for PyTorch 1.13.1. It allows you to train with smaller global batch
+  sizes while scaling up to large clusters. For more information, see `Sharded
+  data parallelism with tensor parallelism <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-with-tensor-parallelism>`_
   in the *Amazon SageMaker Developer Guide*.
-* Added support for saving and loading full model checkpoints when using sharded
-  data parallelism. This is enabled by using the standard checkpointing API,
-  ``smp.save_checkpoint`` with ``partial=False``.
-  Before, full checkpoints needed to be created by merging partial checkpoint
-  files after training finishes.
-* `DistributedTransformer <https://sagemaker.readthedocs.io/en/stable/api/training/smp_versions/latest/smd_model_parallel_pytorch_tensor_parallel.html#smdistributed.modelparallel.torch.nn.DistributedTransformerLayer>`_
-  now supports the ALiBi position embeddings.
-  When using DistributedTransformer, you can set the ``use_alibi`` parameter
-  to ``True`` to use the Triton-based flash attention kernels. This helps
+* Added support for saving and loading full model checkpoints when using sharded
+  data parallelism. This is enabled by using the standard checkpointing API,
+  ``smp.save_checkpoint`` with ``partial=False``.
+  Before, full checkpoints needed to be created by merging partial checkpoint
+  files after training finishes.
+* `DistributedTransformer <https://sagemaker.readthedocs.io/en/stable/api/training/smp_versions/latest/smd_model_parallel_pytorch_tensor_parallel.html#smdistributed.modelparallel.torch.nn.DistributedTransformerLayer>`_
+  now supports the ALiBi position embeddings.
+  When using DistributedTransformer, you can set the ``use_alibi`` parameter
+  to ``True`` to use the Triton-based flash attention kernels. This helps
   evaluate sequences longer than those used for training.
 
 **Bug Fixes**
 
-* When using tensor parallelism, parameters were initialized multiple times
+* When using tensor parallelism, parameters were initialized multiple times
   unncessarily. This release fixed the multiple initialization of parameters
-  so that each parameter is initialized exactly once.
-  It not only saves time, but also ensures that the random generator behavior
+  so that each parameter is initialized exactly once.
+  It not only saves time, but also ensures that the random generator behavior
   is similar to the non-tensor parallelism case.
-
+
 **Known issues**
 
-* Model initialization might take longer with PyTorch 2.0 than that with PyTorch 1.13.
+* Model initialization might take longer with PyTorch 2.0 than that with PyTorch 1.13.
 
 **Migration to AWS Deep Learning Containers**
 
@@ -57,9 +57,9 @@ This version passed benchmark testing and is migrated to the following AWS Deep
   763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:2.0.0-gpu-py310-cu118-ubuntu20.04-sagemaker
 
 - SageMaker training container for PyTorch v1.13.1
-
+
   .. code::
-
+
     763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:1.13.1-gpu-py39-cu117-ubuntu20.04-sagemaker
 
 Binary file of this version of the library for `custom container
@@ -68,7 +68,7 @@ Binary file of this version of the library for `custom container
 - For PyTorch v2.0.0
 
   .. code::
-
+
     https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-2.0.0/build-artifacts/2023-04-14-20-14/smdistributed_modelparallel-1.15.0-cp310-cp310-linux_x86_64.whl
 
 - For PyTorch v1.13.1
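The checkpointing feature noted in the release notes above names ``smp.save_checkpoint`` with ``partial=False`` as the way to write a full checkpoint under sharded data parallelism. A minimal sketch follows, assuming the usual ``smdistributed.modelparallel.torch`` import alias and the argument names documented for the library (path, tag, model, optimizer, partial); verify them against your installed version.

# Hedged sketch: with partial=False, smp.save_checkpoint writes a single full
# model checkpoint instead of per-rank partial files, so no post-training
# merge step is needed.
import smdistributed.modelparallel.torch as smp

def save_full_checkpoint(model, optimizer, step, checkpoint_dir="/opt/ml/checkpoints"):
    # Assumes smp.init() was already called and model/optimizer are the
    # smp-wrapped objects used during training.
    smp.save_checkpoint(
        path=checkpoint_dir,
        tag=f"step-{step}",
        model=model,
        optimizer=optimizer,
        partial=False,
    )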

doc/api/training/smp_versions/latest/smd_model_parallel_pytorch_tensor_parallel.rst

+9 −9
@@ -301,20 +301,20 @@ Tensor Parallelism Module APIs
     ``post_layernorm`` must be ``True``.
   - ``post_layernorm``: If ``True``, inserts layer normalization at
     the output. At least one of ``pre_layernorm`` and
-    ``post_layernorm`` must be ``True``.
-  - ``use_alibi`` (bool, default False): Activates Attention with
+    ``post_layernorm`` must be ``True``.
+  - ``use_alibi`` (bool, default False): Activates Attention with
     Linear Biases (ALiBi) for attention computation.
-    ALiBi facilitates efficient extrapolation on input sequences
-    and thus improves training efficiency.
-    The library enables ALiBi by using the `Triton
+    ALiBi facilitates efficient extrapolation on input sequences
+    and thus improves training efficiency.
+    The library enables ALiBi by using the `Triton
     flash attention kernel
     <https://github.com/HazyResearch/flash-attention>`_.
-    Refer to https://arxiv.org/abs/2108.12409 for more
+    Refer to https://arxiv.org/abs/2108.12409 for more
     details on the technique.
-    (Available from
+    (Available from
     the SageMaker model parallelism library v1.15.0.)
-  - ``alibi_bias_max`` (int, default 8): Defines the ALiBi base
-    value for mask generation. (Available from
+  - ``alibi_bias_max`` (int, default 8): Defines the ALiBi base
+    value for mask generation. (Available from
     the SageMaker model parallelism library v1.15.0.)
 
 - **Methods:**
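For the two parameters documented in this hunk, ``use_alibi`` and ``alibi_bias_max``, a minimal construction sketch follows. Only these two plus ``pre_layernorm``/``post_layernorm`` come from the documentation in the diff; the sizing arguments are illustrative assumptions and should be checked against the DistributedTransformerLayer API reference linked above.

# Hedged sketch of enabling ALiBi on the layer documented above. Sizing
# arguments are assumptions, and smp.init() is assumed to have been called
# earlier in the training script.
import smdistributed.modelparallel.torch as smp

layer = smp.nn.DistributedTransformerLayer(
    hidden_size=4096,           # assumed model width for illustration
    num_attention_heads=32,     # assumed head count for illustration
    pre_layernorm=True,         # at least one of pre_/post_layernorm must be True
    post_layernorm=False,
    use_alibi=True,             # Triton-based flash attention with ALiBi biases
    alibi_bias_max=8,           # ALiBi base value for mask generation (default 8)
)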
