Note that the library does not support ``torch.compile`` in this release.

**New Features**

* Using sharded data parallelism together with tensor parallelism is now
  available for PyTorch 1.13.1. It allows you to train with smaller global
  batch sizes while scaling up to large clusters; a configuration sketch
  follows this list. For more information, see `Sharded data parallelism
  with tensor parallelism <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-with-tensor-parallelism>`_
  in the *Amazon SageMaker Developer Guide*.

* Added support for saving and loading full model checkpoints when using
  sharded data parallelism. This is enabled through the standard
  checkpointing API, ``smp.save_checkpoint``, with ``partial=False``;
  a checkpointing sketch follows this list.

  Before, full checkpoints needed to be created by merging partial
  checkpoint files after training.
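
The sketch below shows one way to request the combined sharded data
parallelism and tensor parallelism from the first item through the SageMaker
Python SDK. It is a minimal example under stated assumptions: the
``train.py`` entry point, the IAM role placeholder, the instance settings,
and the parallelism degrees are illustrative values, not part of this
release note.

.. code-block:: python

    from sagemaker.pytorch import PyTorch

    # Illustrative layout: 4 x ml.p4d.24xlarge = 32 GPUs, split as
    # sharded_data_parallel_degree (8) x tensor_parallel_degree (4).
    smp_options = {
        "enabled": True,
        "parameters": {
            "ddp": True,                        # required for sharded data parallelism
            "sharded_data_parallel_degree": 8,  # shard optimizer/gradient/parameter state
            "tensor_parallel_degree": 4,        # split individual layers across GPUs
        },
    }

    estimator = PyTorch(
        entry_point="train.py",      # hypothetical training script
        role="<your-iam-role>",      # placeholder
        instance_type="ml.p4d.24xlarge",
        instance_count=4,
        framework_version="1.13.1",
        py_version="py39",
        distribution={
            "smdistributed": {"modelparallel": smp_options},
            "mpi": {"enabled": True, "processes_per_host": 8},
        },
    )
    estimator.fit()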
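For the full-checkpoint feature from the second item, the following sketch
assumes a training script where the model has already been wrapped with
``smp.DistributedModel``. The path and tag are placeholders, and the
``smp.resume_from_checkpoint`` call for loading is an assumption based on
the library's checkpointing API rather than something stated in this
release note.

.. code-block:: python

    import smdistributed.modelparallel.torch as smp

    def save_and_reload_full_checkpoint(model):
        """model: an smp.DistributedModel created in the training script."""
        # Save a single consolidated (full) checkpoint instead of
        # per-rank partial checkpoint files.
        smp.save_checkpoint(
            path="/opt/ml/checkpoints",  # placeholder checkpoint directory
            tag="fullmodel.pt",          # placeholder checkpoint tag
            partial=False,               # False = write one full model checkpoint
            model=model,
        )

        # Assumption: load the full checkpoint back with the matching flag.
        smp.resume_from_checkpoint(
            path="/opt/ml/checkpoints",
            tag="fullmodel.pt",
            partial=False,
        )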